PDF Synthesis is a focused extraction surface: give it one document, and it returns a structured breakdown of the quantitative evidence inside it and the causal claims that evidence can support. Where Hybrid Synthesis is designed for cross-document conflict and novelty, PDF Synthesis is designed for depth: it extracts everything causally relevant from a single source and presents it in a form that feeds directly into the Claim Ledger.

To open PDF Synthesis, select PDF Synthesis in the sidebar or go to /pdf-synthesis directly.

Uploading a document

1. Open the dropzone

The PDF Synthesis surface opens with a dropzone as its primary interface. You will see a large upload area in the center of the screen.

2. Drop or select your PDF

Drag your PDF file onto the dropzone, or click Choose file to browse your filesystem. Only .pdf files are accepted.

3. Wait for processing

PDF Synthesis runs extraction automatically after upload. A progress indicator shows parsing and analysis stages as they complete. Extraction time depends on document length and numeric density.

4. Review the three-section output

When extraction completes, the output panel populates with three structured sections (described below).

PDF Synthesis processes one document per run. If you need to synthesize across multiple documents simultaneously, use Hybrid Synthesis instead.
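
If you prepare uploads from a script, the two input rules above (one document per run, .pdf only) can be checked before you open the dropzone. The following is a minimal sketch, not part of the product: the function name `check_upload` is hypothetical, and treating the extension case-insensitively is an assumption.

```python
from pathlib import Path


def check_upload(paths: list[str]) -> Path:
    """Return the single PDF path, or raise ValueError if the batch is invalid.

    Hypothetical local pre-check; it does not call any PDF Synthesis API.
    """
    if len(paths) != 1:
        raise ValueError("PDF Synthesis processes one document per run")
    path = Path(paths[0])
    # Assumption: ".PDF" and ".pdf" are treated alike.
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"only .pdf files are accepted, got {path.name!r}")
    return path
```

A batch of several files fails this check; that is the case where Hybrid Synthesis is the better fit.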

The three-section output

Every completed PDF Synthesis run returns the same three-section structure. This contract is fixed — you can rely on it for downstream processing or integration into reporting workflows.

Section 1 — All explicit numbers

Section 1 is an exhaustive inventory of every number that appears in the document in a causal or scientific context. This includes:
  • Measured quantities with units
  • Effect sizes and odds ratios
  • Sample sizes and study parameters
  • Percentages, rates, and proportions
  • Confidence intervals and standard errors
  • p-values and test statistics
  • Dates and durations relevant to causal timing
Each entry records the number, its unit (if applicable), the sentence it appeared in, and the page or section location in the source document. Section 1 is intentionally exhaustive and unfiltered. Its purpose is to ensure nothing is missed before the filtering step in Section 2.
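
For downstream processing, a Section 1 entry can be modeled as a small record with the four pieces of information listed above. This is a sketch under assumptions: the class name, field names, and the choice to keep the number as text are all hypothetical, since this page does not define an export schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumericEntry:
    """One Section 1 row. Field names are hypothetical, not a documented schema."""
    value: str            # the number as printed, e.g. "0.72" or "<0.001"
    unit: Optional[str]   # None when no unit applies
    sentence: str         # the sentence the number appeared in
    location: str         # page or section in the source document


def entries_at(entries: list[NumericEntry], location: str) -> list[NumericEntry]:
    """Return the entries recorded at one page or section."""
    return [e for e in entries if e.location == location]


# Made-up rows for illustration only; they come from no real document.
rows = [
    NumericEntry("412", "participants", "We enrolled 412 participants.", "p. 3"),
    NumericEntry("0.72", None, "The odds ratio was 0.72.", "p. 5"),
]
```

Keeping `value` as text avoids losing forms such as inequalities or intervals that do not parse as a single float.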

Section 2 — Claim-eligible numerics

Section 2 is a filtered subset of Section 1: only those numbers that could support a causal claim. A number is claim-eligible if it:
  • Represents an effect, association, or outcome (not merely a reference or label)
  • Has sufficient context to identify what was measured and under what conditions
  • Could, in combination with a causal model, support a directional assertion about a relationship between variables
Each entry in Section 2 carries an evidence class label:
| Label | What it means for this number |
| --- | --- |
| bibliographic/structural only | The number appears in a citation, reference list, or structural element. It cannot be used as primary evidence for a claim. |
| mixed | The number is present in the body but lacks full measurement context (no reported conditions, missing units, unclear operationalization). Usable with caution at Rung 1. |
| metric-bearing | The number is a well-formed measurement with sufficient context to support Rung 2 claims and, with appropriate SCM framing, Rung 3 inference. |
Use Section 2 to quickly assess whether a document’s evidence base can support the causal claims you want to make. A document that returns mostly bibliographic/structural only labels in Section 2 is a thin evidentiary source regardless of how substantial it appears.
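
The "thin evidentiary source" check described above can be sketched as a count over the Section 2 labels. The label strings come from the table; everything else is an assumption made for illustration: the names `EvidenceClass`, `MAX_RUNG`, and `is_thin_source`, the reading of "mostly" as more than half, and the treatment of an empty Section 2 as thin.

```python
from collections import Counter
from enum import Enum


class EvidenceClass(Enum):
    BIBLIOGRAPHIC = "bibliographic/structural only"
    MIXED = "mixed"
    METRIC_BEARING = "metric-bearing"


# Highest rung each label can support, per the label table
# (0 = not usable as primary evidence).
MAX_RUNG = {
    EvidenceClass.BIBLIOGRAPHIC: 0,
    EvidenceClass.MIXED: 1,           # usable with caution
    EvidenceClass.METRIC_BEARING: 3,  # Rung 3 only with appropriate SCM framing
}


def is_thin_source(labels: list[str], threshold: float = 0.5) -> bool:
    """True when more than `threshold` of the labels are bibliographic/structural only.

    The 0.5 default is an assumed reading of "mostly", not a product setting.
    """
    if not labels:
        return True  # assumption: no claim-eligible numerics counts as thin
    counts = Counter(EvidenceClass(label) for label in labels)
    return counts[EvidenceClass.BIBLIOGRAPHIC] / len(labels) > threshold
```

For example, a document with three bibliographic/structural only labels and one mixed label would be flagged as thin under this threshold.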

Section 3 — Three claims with uncertainty labels

Section 3 is the synthesis output: three candidate causal claims derived from the Section 2 numerics, each labeled with an uncertainty level. Each claim includes:
  • Claim statement: a falsifiable causal assertion grounded in the document’s evidence
  • Supporting evidence: the specific Section 2 numerics that underpin the claim
  • Causal rung: Rung 1, 2, or 3, reflecting the epistemic type of the claim
  • Uncertainty label: one of low, moderate, or high, reflecting how confidently the evidence supports the claim
The three claims are not exhaustive — they represent the three best-supported and most causally specific claims the extraction engine can construct from the document. They are starting points for investigation, not final conclusions.
Uncertainty labels in Section 3 must be preserved when recording claims to the Claim Ledger. Stripping uncertainty labels is a governance failure and will cause the Epistemic Dashboard’s uncertainty disclosure benchmark to fail.
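
If you carry Section 3 claims through your own tooling before recording them, one way to avoid dropping the uncertainty label is to make it a required, validated field. The sketch below assumes a hypothetical `Claim` type with hypothetical field names; only the allowed rungs (1, 2, 3) and labels (low, moderate, high) come from this page.

```python
from dataclasses import dataclass

UNCERTAINTY_LABELS = {"low", "moderate", "high"}
RUNGS = {1, 2, 3}


@dataclass(frozen=True)
class Claim:
    """One Section 3 claim. Hypothetical structure, not a documented schema."""
    statement: str
    supporting_evidence: tuple[str, ...]  # the Section 2 numerics behind the claim
    rung: int
    uncertainty: str

    def __post_init__(self) -> None:
        # Reject claims whose rung or uncertainty label is missing or unknown,
        # so a stripped label fails here rather than downstream.
        if self.rung not in RUNGS:
            raise ValueError(f"rung must be 1, 2, or 3, got {self.rung!r}")
        if self.uncertainty not in UNCERTAINTY_LABELS:
            raise ValueError(
                f"uncertainty must be low, moderate, or high, got {self.uncertainty!r}"
            )
```

With this shape, a claim cannot be constructed without a valid label, which makes accidental stripping visible early.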

How results feed into the Claim Ledger

After Section 3 is generated, each claim can be recorded to the Claim Ledger by clicking Record claim next to it. The recorded entry carries:
  • The claim statement
  • The uncertainty label
  • The source document (filename and extraction timestamp)
  • The supporting Section 2 numerics as provenance
Claims recorded from PDF Synthesis are traceable to their source document at the numeric level — not just to the document as a whole. This fine-grained provenance is what distinguishes governed claims from informal notes.
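
To make the recorded fields concrete, here is a minimal sketch of a ledger entry as a plain record. It mirrors the four items listed above, but the type name `LedgerRecord`, the field names, the ISO 8601 timestamp format, and the `make_record` helper are assumptions; the actual Claim Ledger format is not specified on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LedgerRecord:
    """Hypothetical shape of a recorded claim."""
    claim_statement: str
    uncertainty_label: str
    source_filename: str
    extracted_at: str             # extraction timestamp, ISO 8601 (assumed format)
    provenance: tuple[str, ...]   # supporting Section 2 numerics


def make_record(statement: str, uncertainty: str, filename: str,
                extracted_at: datetime, numerics: list[str]) -> LedgerRecord:
    """Build a record, refusing entries that lack uncertainty or numeric provenance."""
    if not uncertainty:
        raise ValueError("uncertainty label must be preserved")
    if not numerics:
        raise ValueError("a recorded claim needs at least one supporting numeric")
    return LedgerRecord(statement, uncertainty, filename,
                        extracted_at.isoformat(), tuple(numerics))
```

Requiring at least one supporting numeric reflects the numeric-level traceability described above; whether the product enforces the same rule is not stated here.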

Difference from Hybrid Synthesis

PDF Synthesis and Hybrid Synthesis share extraction infrastructure but serve different purposes:
|  | PDF Synthesis | Hybrid Synthesis |
| --- | --- | --- |
| Input | One PDF | Up to 6 PDFs and/or 5 company names |
| Goal | Deep extraction from a single source | Cross-source conflict resolution and novelty |
| Output | Three-section structured extraction | Novel hypotheses with confidence scores |
| Claim recording | Manual (click to record) | Automatic for top claim; manual for others |
| Best for | Auditing a single document’s evidentiary content | Finding what multiple sources collectively imply |
If you have a single paper and want to know what causal claims it can support, use PDF Synthesis. If you have a corpus and want to know what those sources collectively imply beyond their individual conclusions, use Hybrid Synthesis.
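
The choice between the two surfaces can be written as a small decision helper. This is an illustrative sketch only: `choose_surface` is a hypothetical function, and it reads the Hybrid Synthesis input limit as "at most 6 PDFs and at most 5 company names", which is an interpretation of the comparison above.

```python
MAX_HYBRID_PDFS = 6        # from the comparison above
MAX_HYBRID_COMPANIES = 5   # assumed to be an upper bound as well


def choose_surface(n_pdfs: int, n_companies: int = 0) -> str:
    """Suggest a surface for the given inputs (hypothetical helper)."""
    if n_pdfs < 0 or n_companies < 0:
        raise ValueError("counts must be non-negative")
    if n_pdfs == 0 and n_companies == 0:
        raise ValueError("at least one input is required")
    # A single paper with no company names is the PDF Synthesis case.
    if n_pdfs == 1 and n_companies == 0:
        return "PDF Synthesis"
    if n_pdfs > MAX_HYBRID_PDFS or n_companies > MAX_HYBRID_COMPANIES:
        raise ValueError("input exceeds the Hybrid Synthesis limits")
    return "Hybrid Synthesis"
```

For a corpus larger than these limits, the helper raises instead of guessing; how to split such a corpus across runs is outside what this page describes.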

Hybrid Synthesis

Multi-source synthesis for reconciling conflicting claims across a corpus.

Claim Ledger

Understand how recorded claims are governed and exported.

Epistemic Dashboard

Track evidence class distribution and uncertainty disclosure across your work.

Causal Workbench

Use extracted claims as starting points for SCM-grounded causal dialogue.