The hybrid synthesize endpoint runs a multi-stage pipeline that ingests PDFs and company profiles, extracts causal entities and relationships, synthesises them into novel research ideas, and applies a novelty gate before returning a structured synthesis result. The pipeline streams progress as Server-Sent Events (SSE), so your client receives incremental updates as each stage completes.
Authentication is optional. If you include a valid Authorization header, results are associated with your account and persisted in your session history. Without a token, the synthesis runs anonymously and is not saved.
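As a minimal sketch of the optional-token behaviour described above, the helper below returns request headers for either case. The name buildAuthHeaders is hypothetical, and the Bearer scheme is taken from the request example on this page.

```javascript
// Hypothetical helper: build request headers for an optional bearer token.
function buildAuthHeaders(token) {
  // No token means an anonymous run, so no Authorization header is sent.
  if (!token) return {};
  return { Authorization: `Bearer ${token}` };
}
```

Passing the result as the headers option of fetch keeps one code path for both signed-in and anonymous runs.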

Run a synthesis

POST /api/hybrid-synthesize
  • Content-Type: multipart/form-data
  • Response: text/event-stream (SSE)
  • Max duration: 300 seconds (5 minutes)

Minimum inputs

You must supply at least 2 total sources across files and companies combined. For example: 1 PDF + 1 company, or 2 PDFs, or 2 companies.
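A client can check this rule before uploading anything. The sketch below assumes files and companies are plain arrays; hasEnoughSources is a hypothetical name, not part of the API.

```javascript
// Minimum number of sources across files and companies combined.
const MIN_TOTAL_SOURCES = 2;

// Hypothetical pre-flight check: true when the combined source count is sufficient.
function hasEnoughSources(files, companies) {
  return files.length + companies.length >= MIN_TOTAL_SOURCES;
}
```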

Form data parameters

files
File[]
PDF files to include in the synthesis. Maximum 6 files. Each file must be a valid PDF. Files are parsed, scientifically analysed, and their causal entities are extracted before synthesis.
companies
string
JSON-encoded array of company names to include as intelligence sources. Maximum 5 companies. The pipeline retrieves recent research and product context for each company and incorporates it alongside your PDFs. Example: '["DeepMind", "Anthropic", "Meta AI"]'
researchFocus
string
A guiding question or theme that orients the synthesis. The pipeline uses this focus to rank ideas and structure the output. Example: "causal mechanisms of neuroplasticity under chronic stress"
enableParallelRefinement
string
default:"true"
Whether to run the refinement stage with parallel threads. Set to "true" or "false".
parallelConcurrency
number
default:"3"
Number of concurrent refinement threads when enableParallelRefinement is "true". Higher values produce richer output but increase latency.
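The parameters above could be encoded as in the following sketch. buildSynthesisForm and its option names are hypothetical. Leaving out researchFocus when it is not provided is an assumption, since this page does not say whether the field is required.

```javascript
const MAX_FILES = 6;      // documented limit for the files field
const MAX_COMPANIES = 5;  // documented limit for the companies field

// Hypothetical helper: encode the documented parameters as multipart fields.
function buildSynthesisForm({ files = [], companies = [], researchFocus, parallel = true, concurrency = 3 } = {}) {
  if (files.length > MAX_FILES) {
    throw new RangeError(`At most ${MAX_FILES} PDF files are allowed`);
  }
  if (companies.length > MAX_COMPANIES) {
    throw new RangeError(`At most ${MAX_COMPANIES} companies are allowed`);
  }

  const form = new FormData();
  // Repeat the "files" field once per PDF.
  for (const file of files) form.append('files', file);
  // "companies" is a single string field holding a JSON array.
  if (companies.length > 0) form.append('companies', JSON.stringify(companies));
  // Assumption: the field can be omitted when no focus is given.
  if (researchFocus) form.append('researchFocus', researchFocus);
  // Form fields are strings, so the boolean and the number are serialised explicitly.
  form.append('enableParallelRefinement', parallel ? 'true' : 'false');
  form.append('parallelConcurrency', String(concurrency));
  return form;
}
```

Checking the limits on the client avoids uploading several PDFs only to have the request rejected.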

SSE events

The response is a stream of SSE events. Each event has an event name and a JSON data payload. Parse them incrementally as they arrive.
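Network chunks do not align with event boundaries, so incremental parsing needs a small buffer. The sketch below assumes the standard SSE wire format (event: and data: lines, with a blank line ending each event). createSseParser is a hypothetical name, and the parser ignores SSE fields such as id and retry.

```javascript
// Minimal incremental SSE parser (a sketch, not a full implementation of the spec).
function createSseParser() {
  let buffer = '';
  let eventName = 'message'; // SSE default when no "event:" line is sent
  let dataLines = [];

  // Feed decoded text in; get back the events completed by this chunk.
  return function push(chunk) {
    const events = [];
    buffer += chunk;
    const lines = buffer.split(/\r?\n/);
    buffer = lines.pop(); // keep the unfinished last line for the next chunk

    for (const line of lines) {
      if (line === '') {
        // A blank line terminates the current event.
        if (dataLines.length > 0) {
          events.push({ event: eventName, data: JSON.parse(dataLines.join('\n')) });
        }
        eventName = 'message';
        dataLines = [];
      } else if (line.startsWith('event:')) {
        eventName = line.slice(6).trim();
      } else if (line.startsWith('data:')) {
        // Strip the single optional space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    return events;
  };
}
```

Feed it the output of a TextDecoder called with { stream: true } so that multi-byte characters split across chunks are decoded correctly.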

Timeline

Stages execute in this order: ingestion → pdf_parsing → entity_harvest → synthesis → novelty_gate → recovery_plan → completed. Each stage emits timeline_stage_started, then timeline_stage_completed (or timeline_stage_skipped if the stage is not applicable given your inputs).
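One way to turn these stage events into progress state is sketched below. createTimelineTracker is a hypothetical name, and the "pending" label for stages that have not reported yet is an illustration rather than part of the API.

```javascript
// Stage names in the documented execution order.
const STAGES = [
  'ingestion', 'pdf_parsing', 'entity_harvest', 'synthesis',
  'novelty_gate', 'recovery_plan', 'completed',
];

// Hypothetical tracker: records the latest state reported for each stage.
function createTimelineTracker() {
  // "pending" is a client-side label for stages with no event yet.
  const states = Object.fromEntries(STAGES.map((stage) => [stage, 'pending']));

  return {
    // Returns true when the event was a timeline event for a known stage.
    apply(eventName, payload) {
      const match = /^timeline_stage_(started|completed|skipped)$/.exec(eventName);
      if (!match || !Object.hasOwn(states, payload.stage)) return false;
      states[payload.stage] = match[1];
      return true;
    },
    // Copy of the current state, safe to hand to UI code.
    snapshot() {
      return { ...states };
    },
  };
}
```

A progress bar can then be driven by counting the stages in the snapshot that are completed or skipped.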

Event reference

Emitted once when the pipeline begins processing your inputs.
{
  "files": 2
}
  • files — number of PDF files received.
Emitted once per PDF file after it has been successfully parsed.
{
  "filename": "mcewen_2012_stress_hippocampus.pdf"
}
Emitted after all PDF scientific analyses finish.
{
  "pdfCount": 2,
  "completed": 2,
  "failed": 0
}
timeline_stage_started
Emitted when a pipeline stage begins.
{
  "stage": "entity_harvest",
  "state": "started",
  "timestamp": "2026-04-05T12:01:05Z",
  "meta": {}
}
timeline_stage_completed
Emitted when a pipeline stage finishes successfully.
{
  "stage": "entity_harvest",
  "state": "completed",
  "timestamp": "2026-04-05T12:01:12Z",
  "meta": {
    "entitiesExtracted": 34
  }
}
timeline_stage_skipped
Emitted when a pipeline stage is skipped (e.g., recovery_plan when no failures occurred).
{
  "stage": "recovery_plan",
  "state": "skipped",
  "timestamp": "2026-04-05T12:02:00Z",
  "meta": {}
}
complete
Emitted as the final event when the pipeline finishes. Contains the full synthesis result.
{
  "synthesis": {
    "selectedIdea": "Chronic cortisol elevation causally suppresses BDNF expression, mediating hippocampal volume reduction via TrkB receptor downregulation.",
    "novelIdeas": [
      "Intermittent glucocorticoid exposure may preserve neuroplasticity through hormetic stress-response pathways.",
      "BDNF-TrkB signalling as a causal mediator between cortisol and spatial memory deficits."
    ],
    "structuredApproach": {
      "hypothesis": "...",
      "proposedDesign": "...",
      "keyVariables": ["cortisol_level", "BDNF_expression", "hippocampal_volume"],
      "identificationStrategy": "back-door adjustment on stress_exposure and age"
    },
    "noveltyGate": {
      "passed": true,
      "score": 0.78,
      "rationale": "Hypothesis is sufficiently distinct from existing literature indexed in the synthesis."
    },
    "timelineReceipt": {
      "stages": ["ingestion", "pdf_parsing", "entity_harvest", "synthesis", "novelty_gate", "completed"],
      "totalDurationMs": 47320
    },
    "runId": "run-4e2f9c10-bb3a-4d8e-a917-c1d0f2b3e456"
  },
  "scientificAnalysis": [
    {
      "filename": "mcewen_2012_stress_hippocampus.pdf",
      "summary": "Reviews glucocorticoid-mediated structural plasticity in hippocampus.",
      "keyFindings": ["Chronic stress reduces CA3 dendritic arborisation", "BDNF is suppressed by elevated cortisol"]
    }
  ],
  "featureContext": {
    "companies": ["DeepMind"],
    "companyInsights": {
      "DeepMind": "Recent work on causal representation learning relevant to neuroscience benchmarks."
    }
  }
}
Emitted if a fatal error occurs. The stream closes after this event.
{
  "message": "PDF parsing failed: file exceeds maximum size."
}
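The final-event payload is large, so a client will often reduce it to a few fields for display. The sketch below assumes the payload has the shape of the sample shown for the final event. summarizeResult and the names of its output fields are hypothetical.

```javascript
// Hypothetical helper: pick the headline fields out of the final-event payload.
function summarizeResult(payload) {
  const { synthesis } = payload;
  const gate = synthesis.noveltyGate;
  return {
    runId: synthesis.runId,
    selectedIdea: synthesis.selectedIdea,
    alternativeIdeaCount: synthesis.novelIdeas.length,
    noveltyPassed: gate.passed,
    noveltyScore: gate.score,
    // totalDurationMs is reported in milliseconds.
    durationSeconds: synthesis.timelineReceipt.totalDurationMs / 1000,
  };
}
```

If a field may be absent in practice, guard the lookups (for example with optional chaining) instead of relying on the sample shape.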

Examples

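// pdfFile1 and pdfFile2 are File or Blob objects, for example from a file input.
const formData = new FormData();
formData.append('files', pdfFile1);
formData.append('files', pdfFile2);
formData.append('companies', JSON.stringify(['DeepMind']));
formData.append('researchFocus', 'causal mechanisms of neuroplasticity');
formData.append('enableParallelRefinement', 'true');
formData.append('parallelConcurrency', '3');

const response = await fetch('https://wuweism.com/api/hybrid-synthesize', {
  method: 'POST',
  // Optional: omit this header to run anonymously. Do not set Content-Type;
  // fetch adds the multipart boundary itself.
  headers: { 'Authorization': 'Bearer <token>' },
  body: formData,
});
if (!response.ok || !response.body) {
  throw new Error(`Request failed with status ${response.status}`);
}

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';
let eventName = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // A chunk can end mid-line, so keep the unfinished tail in the buffer.
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (line.startsWith('event: ')) {
      // Remember the event name; it applies to the data line that follows.
      eventName = line.slice(7).trim();
    } else if (line.startsWith('data: ')) {
      const payload = JSON.parse(line.slice(6));
      console.log(eventName, payload);
    }
  }
}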
The pipeline has a hard timeout of 300 seconds. If your inputs are large (many PDFs or high parallelConcurrency), ensure your HTTP client does not time out before the complete event arrives. Set your client’s read timeout to at least 310 seconds.
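With fetch, one way to enforce an overall client-side deadline is an AbortController driven by a timer. The sketch below uses the 310-second figure from the note above; createDeadline is a hypothetical name.

```javascript
// Client-side deadline, slightly above the pipeline's 300-second hard timeout.
const CLIENT_TIMEOUT_MS = 310_000;

// Hypothetical helper: returns an abort signal plus a function that cancels the timer.
function createDeadline(ms = CLIENT_TIMEOUT_MS) {
  const controller = new AbortController();
  const timer = setTimeout(() => {
    controller.abort(new Error(`No final event after ${ms} ms`));
  }, ms);
  return {
    signal: controller.signal,
    // Call this once the final event has arrived so the timer does not linger.
    cancel: () => clearTimeout(timer),
  };
}

// Usage sketch:
//   const { signal, cancel } = createDeadline();
//   const response = await fetch(url, { method: 'POST', body: formData, signal });
//   ... read the stream, then call cancel() ...
```

The signal aborts both the request and any pending read on the response stream, so a stalled connection surfaces as an error instead of hanging.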