Epistemic Dashboard

The Epistemic Dashboard gives you a cross-session view of the causal quality of your work. Where individual surfaces (Causal Workbench, Hybrid Synthesis, Legal Causation) focus on producing outputs, the Epistemic Dashboard evaluates them: it tracks whether your reasoning, your evidence base, and your claims meet the standards of rigorous causal science. To open the dashboard, select Epistemic in the sidebar or go to /epistemic directly.

What it monitors

The dashboard aggregates signal from all your Wu-Weism sessions into four areas:

Scientific evidence tracking: numeric and qualitative evidence surfaced across your sessions
Alignment audit reports: how your causal reasoning aligns with best practices at each rung of Pearl’s ladder
Spectral health monitor: a continuously updated view of causal health across your active and completed work
Benchmark results: your performance against scientific integrity benchmarks

Each area draws from the same underlying provenance graph that governs your claims — so the dashboard reflects actual work, not a separate evaluation layer.

Scientific evidence tracking

The evidence panel aggregates all numeric and qualitative evidence extracted from your sessions, that is, from PDFs analyzed via the Causal Workbench, Hybrid Synthesis, or PDF Synthesis. For each piece of evidence, the panel shows:

Source: the document or session the evidence came from
Evidence class: one of three labels (see below)
Claim linkage: which claims in the Claim Ledger this evidence supports
Extraction timestamp: when the evidence entered the system

Evidence class labels

Label	Meaning
`bibliographic/structural only`	The source contains references or structural content but no extractable quantitative evidence. Claims derived from this source carry higher uncertainty.
`mixed`	The source contains some numeric content alongside qualitative discussion. Evidence may support claims at Rung 1 but may not be sufficient for Rung 2 assertions without additional validation.
`metric-bearing`	The source contains well-formed quantitative evidence — effect sizes, confidence intervals, p-values, measured quantities. Sufficient to support Rung 2 claims and, with appropriate SCM specification, Rung 3 inference.

Filtering the evidence panel by class helps you quickly identify where your evidence base is thin and where it is robust.
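
As an illustration of how evidence records and class labels fit together, the sketch below models an evidence item and filters by class in Python. The `EvidenceItem` type, its field names, and both helper functions are assumptions made for this example, not a documented Wu-Weism API; only the three label strings come from the table above.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List


class EvidenceClass(Enum):
    # Label strings are taken from the evidence class table.
    BIBLIOGRAPHIC = "bibliographic/structural only"
    MIXED = "mixed"
    METRIC_BEARING = "metric-bearing"


@dataclass
class EvidenceItem:
    # Hypothetical record mirroring the four fields shown in the evidence panel.
    source: str
    evidence_class: EvidenceClass
    extracted_at: datetime
    linked_claims: List[str] = field(default_factory=list)


def filter_by_class(items, wanted):
    """Return only the items that carry the given evidence class."""
    return [item for item in items if item.evidence_class is wanted]


def class_counts(items):
    """Count items per class label to show where the evidence base is thin."""
    return Counter(item.evidence_class.value for item in items)


if __name__ == "__main__":
    # Illustrative values only; these are not real sessions or documents.
    items = [
        EvidenceItem("trial_report.pdf", EvidenceClass.METRIC_BEARING,
                     datetime(2024, 1, 5), ["claim-1"]),
        EvidenceItem("literature_review.pdf", EvidenceClass.BIBLIOGRAPHIC,
                     datetime(2024, 1, 6)),
    ]
    print(class_counts(items))
    print([i.source for i in filter_by_class(items, EvidenceClass.METRIC_BEARING)])
```

A count per class is often enough to spot a thin evidence base: many `bibliographic/structural only` items and few `metric-bearing` ones suggests that higher-rung claims will be hard to support.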

Alignment audit reports

An alignment audit report is a structured evaluation of a session or synthesis run against causal best practices. Reports are generated automatically for each completed session and are available in the Audit Reports tab of the dashboard. Each report evaluates:

Rung consistency: whether Rung 2 and Rung 3 claims are supported by evidence of the appropriate class
SCM coverage: whether the causal variables asserted in responses are present in the loaded Truth Cartridge
Assumption explicitness: whether the assumptions underlying each causal claim were stated or remained implicit
Counterfactual validity: for Rung 3 claims, whether the counterfactual world was specified with sufficient precision
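
The rung consistency check can be sketched as a lookup from evidence class to the highest rung that class supports. This is a minimal sketch, not the dashboard's actual scoring logic: the function names are hypothetical, and treating `mixed` and `bibliographic/structural only` sources as supporting Rung 1 at most is a conservative assumption drawn from the evidence class descriptions.

```python
def max_supported_rung(evidence_class: str, has_scm_spec: bool = False) -> int:
    """Highest rung a single evidence class can ground (assumed mapping)."""
    if evidence_class == "metric-bearing":
        # Rung 3 additionally requires an appropriate SCM specification.
        return 3 if has_scm_spec else 2
    if evidence_class in ("mixed", "bibliographic/structural only"):
        # Assumption: without further validation these ground Rung 1 only.
        return 1
    raise ValueError(f"unknown evidence class: {evidence_class!r}")


def is_rung_consistent(claim_rung: int, evidence_classes, has_scm_spec: bool = False) -> bool:
    """True when at least one linked source supports the claimed rung."""
    best = max(
        (max_supported_rung(c, has_scm_spec) for c in evidence_classes),
        default=0,  # a claim with no linked evidence supports no rung
    )
    return claim_rung <= best
```

Under these assumptions, a Rung 2 claim backed only by `mixed` sources is flagged as inconsistent, which matches the case the audit score is meant to penalize.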

Reports are scored on a 0–100 alignment scale. A score below 70 typically indicates that claims were made at a higher rung than the evidence supports — a common issue when qualitative sources are used to ground quantitative causal assertions.

Alignment audit reports can be exported as PDF or JSON. The JSON format is structured for programmatic ingestion into review workflows or institutional reporting systems.
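
For programmatic ingestion, a review workflow might flag reports that score below 70. The export schema is not documented on this page, so the `session_id` and `alignment_score` field names below are assumptions; adjust them to match the actual JSON export.

```python
import json

# From the text: a score below 70 typically indicates rung overreach.
LOW_ALIGNMENT_THRESHOLD = 70


def flag_low_alignment(report_json: str) -> dict:
    """Parse one exported report and mark it for review if the score is low."""
    report = json.loads(report_json)
    score = report["alignment_score"]  # hypothetical field name
    return {
        "session_id": report.get("session_id"),  # hypothetical field name
        "score": score,
        "needs_review": score < LOW_ALIGNMENT_THRESHOLD,
    }


if __name__ == "__main__":
    # Illustrative payload, not a real export.
    sample = '{"session_id": "s-1", "alignment_score": 64}'
    print(flag_low_alignment(sample))
```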

Spectral health monitor

The spectral health monitor provides a continuous, cross-session view of causal health as a set of metrics. Unlike the per-session audit report, the spectral monitor aggregates across all your work and updates as new sessions complete. Metrics tracked:

Metric	Description
Rung distribution	The proportion of your claims at Rung 1, 2, and 3. A healthy distribution for most research contexts skews toward Rung 2 with a smaller Rung 3 component. Heavy Rung 1 concentration may indicate under-specified causal questions.
Evidence coverage ratio	The fraction of your Rung 2+ claims that are supported by metric-bearing evidence. Low coverage indicates claims that outrun the evidentiary base.
SCM coherence score	How consistently your session outputs stay within the constraints of the loaded SCMs. High coherence means your questions and the model’s responses are well-matched to the domain cartridge.
Claim stability	Whether recorded claims have been revised or retracted after initial recording. High instability may indicate poorly specified questions or volatile evidence sources.

The monitor displays each metric as a time-series over your session history, so you can observe trends rather than just point-in-time values.
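
Two of the metrics above, rung distribution and evidence coverage ratio, are simple enough to compute by hand. The sketch below shows one way to do so, assuming each claim is a dictionary with a `rung` integer and an `evidence_classes` list; both the claim shape and the function names are hypothetical, and the dashboard's own computation may differ.

```python
from collections import Counter


def rung_distribution(claims):
    """Proportion of claims at Rung 1, 2, and 3."""
    if not claims:
        return {1: 0.0, 2: 0.0, 3: 0.0}
    counts = Counter(claim["rung"] for claim in claims)
    return {rung: counts[rung] / len(claims) for rung in (1, 2, 3)}


def evidence_coverage_ratio(claims):
    """Fraction of Rung 2+ claims backed by at least one metric-bearing source.

    Returns None when there are no Rung 2+ claims, since the ratio is undefined.
    """
    higher = [claim for claim in claims if claim["rung"] >= 2]
    if not higher:
        return None
    covered = sum(1 for claim in higher if "metric-bearing" in claim["evidence_classes"])
    return covered / len(higher)


if __name__ == "__main__":
    # Illustrative claims only.
    claims = [
        {"rung": 1, "evidence_classes": ["mixed"]},
        {"rung": 2, "evidence_classes": ["metric-bearing"]},
        {"rung": 2, "evidence_classes": ["mixed"]},
        {"rung": 3, "evidence_classes": ["metric-bearing"]},
    ]
    print(rung_distribution(claims))
    print(evidence_coverage_ratio(claims))
```

Recomputing these values after each completed session and storing them with a timestamp would yield the kind of time-series the monitor displays.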

Benchmark results

The benchmarks tab shows your performance against a set of scientific integrity reference points. These benchmarks test whether your causal work meets the standards of reproducible, falsifiable causal science. Benchmarks evaluated:

Falsifiability benchmark

Evaluates whether your recorded claims are stated in a way that can be empirically refuted. Claims expressed in vague or unfalsifiable language score low. A passing claim must specify: (1) the causal variable, (2) the outcome variable, (3) the direction of the effect, and (4) the conditions under which the claim holds.
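
The four required components lend themselves to a structural pre-check. The sketch below only verifies that each component is present; it does not judge whether the wording is vague, which the benchmark also considers. The `ClaimStatement` type and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClaimStatement:
    # Hypothetical fields, one per required component of a passing claim.
    cause: Optional[str] = None
    outcome: Optional[str] = None
    direction: Optional[str] = None
    conditions: Optional[str] = None


REQUIRED_COMPONENTS = ("cause", "outcome", "direction", "conditions")


def missing_components(claim: ClaimStatement) -> List[str]:
    """List the required components that are absent or blank."""
    return [
        name for name in REQUIRED_COMPONENTS
        if not (getattr(claim, name) or "").strip()
    ]


def has_all_components(claim: ClaimStatement) -> bool:
    """True when all four components are specified."""
    return not missing_components(claim)
```

For example, a claim that names a cause, an outcome, and a direction but leaves `conditions` empty would report `["conditions"]` as missing.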

Provenance completeness benchmark

Evaluates whether every recorded claim can be traced to a source — a session, a document, and an extraction event. Orphaned claims (present in the Claim Ledger without traceable provenance) fail this benchmark.
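
A check for orphaned claims might look like the following sketch. It assumes each Claim Ledger entry is a dictionary with a `claim_id` plus one key per provenance element; all of these key names are hypothetical and stand in for whatever the ledger actually records.

```python
# Hypothetical keys for the three provenance elements named in the text:
# a session, a document, and an extraction event.
PROVENANCE_KEYS = ("session_id", "document_id", "extraction_event_id")


def find_orphaned_claims(ledger):
    """Return the IDs of entries missing any provenance element."""
    return [
        entry["claim_id"]
        for entry in ledger
        if any(not entry.get(key) for key in PROVENANCE_KEYS)
    ]
```

An empty result means every entry in the supplied ledger can be traced to a session, a document, and an extraction event.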

Uncertainty disclosure benchmark

Evaluates whether claims that carry uncertainty labels (from PDF Synthesis or Hybrid Synthesis) have had those labels preserved in the Claim Ledger entry. Stripping uncertainty labels when recording claims is a governance failure.
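
One way to detect stripped labels is to compare the labels a synthesis step attached with the labels stored on the ledger entry. The sketch below is an assumption about how such a comparison could work; the function names are hypothetical and the label strings in the usage example are placeholders, not labels defined by PDF Synthesis or Hybrid Synthesis.

```python
def stripped_uncertainty_labels(synthesis_labels, ledger_labels):
    """Labels present at synthesis time but absent from the ledger entry."""
    return sorted(set(synthesis_labels) - set(ledger_labels))


def passes_uncertainty_disclosure(synthesis_labels, ledger_labels) -> bool:
    """True when no uncertainty label was dropped during recording."""
    return not stripped_uncertainty_labels(synthesis_labels, ledger_labels)


if __name__ == "__main__":
    # Placeholder label strings for illustration.
    at_synthesis = ["label-a", "label-b"]
    in_ledger = ["label-a"]
    print(stripped_uncertainty_labels(at_synthesis, in_ledger))
    print(passes_uncertainty_disclosure(at_synthesis, in_ledger))
```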

Intervention-observation separation benchmark

Evaluates whether Rung 2 claims are clearly distinguished from Rung 1 claims in your session outputs and recorded claims. Conflating observational findings with interventional conclusions is the most common epistemic error in applied causal analysis.

Benchmark results are updated after each session completes. Historical benchmark scores are retained so you can track improvement over time.

Related pages

Claim Ledger

The governed record of claims that feeds the Epistemic Dashboard.

PDF Synthesis

Understand how evidence class labels are assigned during document analysis.

Causal Ladder

The three-rung framework underpinning alignment audit scoring.

Hybrid Synthesis

Multi-source synthesis whose outputs feed alignment audit reports.