Hybrid synthesize

The hybrid synthesize endpoint runs a multi-stage pipeline that ingests PDFs and company profiles, extracts causal entities and relationships, synthesises them into novel research ideas, and applies a novelty gate before returning a structured synthesis result. The pipeline streams progress as Server-Sent Events (SSE), so your client receives incremental updates as each stage completes.

Authentication is optional. If you include a valid Authorization header, results are associated with your account and persisted in your session history. Without a token, the synthesis runs anonymously and is not saved.
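As a minimal sketch of the optional authentication described above, the helper below builds request headers with or without a token. The function name `build_headers` is hypothetical; the `Bearer` scheme is taken from the examples later on this page.

```python
from typing import Dict, Optional


def build_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for the synthesis call.

    With no token the request is sent anonymously, so (per the text above)
    the result is not saved to session history.
    """
    if not token:
        return {}
    # Bearer scheme as used in the examples on this page.
    return {"Authorization": f"Bearer {token}"}
```

Passing the returned dictionary as the `headers` argument of your HTTP client keeps the anonymous and authenticated paths in one place.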

Run a synthesis

POST /api/hybrid-synthesize

Content-Type: multipart/form-data
Response: text/event-stream (SSE)
Max duration: 300 seconds (5 minutes)

Minimum inputs

You must supply at least 2 total sources across files and companies combined. For example: 1 PDF + 1 company, or 2 PDFs, or 2 companies.
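The rule above can be checked on the client before uploading anything. This is a sketch only: `has_enough_sources` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
MIN_TOTAL_SOURCES = 2  # from the rule above: files + companies combined


def has_enough_sources(file_count: int, company_count: int) -> bool:
    """Return True when files and companies together meet the minimum."""
    if file_count < 0 or company_count < 0:
        raise ValueError("counts must be non-negative")
    return file_count + company_count >= MIN_TOTAL_SOURCES
```

For instance, `has_enough_sources(1, 1)` and `has_enough_sources(0, 2)` both pass, while a single PDF on its own does not.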

Form data parameters

files

File[]

PDF files to include in the synthesis. Maximum 6 files. Each file must be a valid PDF. Each file is parsed and scientifically analysed, and its causal entities are extracted before synthesis.

companies

string

JSON-encoded array of company names to include as intelligence sources. Maximum 5 companies. The pipeline retrieves recent research and product context for each company and incorporates it alongside your PDFs. Example: '["DeepMind", "Anthropic", "Meta AI"]'

researchFocus

string

A guiding question or theme that orients the synthesis. The pipeline uses this focus to rank ideas and structure the output. Example: "causal mechanisms of neuroplasticity under chronic stress"

enableParallelRefinement

string

default:"true"

Whether to run the refinement stage with parallel threads. Set to "true" or "false".

parallelConcurrency

number

default:"3"

Number of concurrent refinement threads when enableParallelRefinement is "true". Higher values produce richer output but increase latency.
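The non-file parameters above are all sent as text form fields, so the array and the boolean need encoding first. The sketch below shows one way to do that; `build_form_fields` is a hypothetical helper, and treating `researchFocus` as optional is an assumption, since this page does not say whether it is required.

```python
import json
from typing import Dict, List, Optional

MAX_COMPANIES = 5  # limit stated for the companies parameter


def build_form_fields(
    companies: List[str],
    research_focus: Optional[str] = None,
    enable_parallel_refinement: bool = True,
    parallel_concurrency: int = 3,
) -> Dict[str, str]:
    """Encode the non-file parameters as multipart text fields."""
    if len(companies) > MAX_COMPANIES:
        raise ValueError(f"at most {MAX_COMPANIES} companies are allowed")
    fields = {
        # companies must be a JSON-encoded array, not a comma-separated list.
        "companies": json.dumps(companies),
        # The flag is documented as the strings "true" or "false".
        "enableParallelRefinement": "true" if enable_parallel_refinement else "false",
        "parallelConcurrency": str(parallel_concurrency),
    }
    if research_focus:  # assumption: the field may be omitted
        fields["researchFocus"] = research_focus
    return fields
```

The resulting dictionary can be passed as the `data` argument of `requests.post`, alongside the PDFs in `files`.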

SSE events

The response is a stream of SSE events. Each event has an event name and a JSON data payload. Parse them incrementally as they arrive.
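To pair each event name with its payload, a client has to read the `event:` and `data:` fields together rather than looking at `data:` lines alone. The sketch below assumes the stream follows standard SSE framing (one `event:` line, one or more `data:` lines, then a blank line); `parse_sse` is a hypothetical helper, not part of any SDK.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_name, payload) pairs from decoded SSE lines."""
    event_name = "message"  # SSE default when no event: field is sent
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
```

With `requests`, the input could be `response.iter_lines(decode_unicode=True)`. An event that is not followed by a blank line before the stream ends is dropped, which matches how SSE treats incomplete events.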

Timeline

Stages execute in this order: ingestion → pdf_parsing → entity_harvest → synthesis → novelty_gate → recovery_plan → completed.

Each stage emits timeline_stage_started, then timeline_stage_completed (or timeline_stage_skipped if the stage is not applicable given your inputs).
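One way to turn the timeline events into a progress view is to record the latest state per stage and list the stages still outstanding. This is a sketch under the assumption that events arrive as `(event_name, payload)` pairs; `track_stages` and `pending_stages` are hypothetical names.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Stage order as documented above.
STAGE_ORDER = [
    "ingestion", "pdf_parsing", "entity_harvest", "synthesis",
    "novelty_gate", "recovery_plan", "completed",
]


def track_stages(events: Iterable[Tuple[str, Dict[str, Any]]]) -> Dict[str, str]:
    """Return the most recent state reported for each stage."""
    states: Dict[str, str] = {}
    for name, payload in events:
        if name.startswith("timeline_stage_"):
            states[payload["stage"]] = payload["state"]
    return states


def pending_stages(states: Dict[str, str]) -> List[str]:
    """List stages, in pipeline order, that are neither completed nor skipped."""
    return [s for s in STAGE_ORDER if states.get(s) not in ("completed", "skipped")]
```

The first entry of `pending_stages(...)` is the stage currently running or about to run, which is usually enough to drive a progress indicator.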

Event reference

ingestion_start

Emitted once when the pipeline begins processing your inputs.

{
  "files": 2
}

files — number of PDF files received.

pdf_processed

Emitted once per PDF file after it has been successfully parsed.

{
  "filename": "mcewen_2012_stress_hippocampus.pdf"
}

scientific_analysis_complete

Emitted after all PDF scientific analyses finish.

{
  "pdfCount": 2,
  "completed": 2,
  "failed": 0
}

timeline_stage_started

Emitted when a pipeline stage begins.

{
  "stage": "entity_harvest",
  "state": "started",
  "timestamp": "2026-04-05T12:01:05Z",
  "meta": {}
}

timeline_stage_completed

Emitted when a pipeline stage finishes successfully.

{
  "stage": "entity_harvest",
  "state": "completed",
  "timestamp": "2026-04-05T12:01:12Z",
  "meta": {
    "entitiesExtracted": 34
  }
}
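Because both the started and completed events carry a timestamp, per-stage duration can be derived on the client. The sketch below assumes the timestamps are ISO 8601 strings with a trailing `Z`, as in the samples; the function names are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-04-05T12:01:05Z."""
    # Replace the Z suffix so this also works on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def stage_duration_seconds(started: Dict[str, Any], completed: Dict[str, Any]) -> float:
    """Seconds elapsed between a stage's started and completed events."""
    if started["stage"] != completed["stage"]:
        raise ValueError("events belong to different stages")
    delta = parse_timestamp(completed["timestamp"]) - parse_timestamp(started["timestamp"])
    return delta.total_seconds()
```

Applied to the two entity_harvest samples above (12:01:05Z and 12:01:12Z), this yields 7.0 seconds.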

timeline_stage_skipped

Emitted when a pipeline stage is skipped (e.g., recovery_plan when no failures occurred).

{
  "stage": "recovery_plan",
  "state": "skipped",
  "timestamp": "2026-04-05T12:02:00Z",
  "meta": {}
}

complete

Emitted as the final event when the pipeline finishes. Contains the full synthesis result.

{
  "synthesis": {
    "selectedIdea": "Chronic cortisol elevation causally suppresses BDNF expression, mediating hippocampal volume reduction via TrkB receptor downregulation.",
    "novelIdeas": [
      "Intermittent glucocorticoid exposure may preserve neuroplasticity through hormetic stress-response pathways.",
      "BDNF-TrkB signalling as a causal mediator between cortisol and spatial memory deficits."
    ],
    "structuredApproach": {
      "hypothesis": "...",
      "proposedDesign": "...",
      "keyVariables": ["cortisol_level", "BDNF_expression", "hippocampal_volume"],
      "identificationStrategy": "back-door adjustment on stress_exposure and age"
    },
    "noveltyGate": {
      "passed": true,
      "score": 0.78,
      "rationale": "Hypothesis is sufficiently distinct from existing literature indexed in the synthesis."
    },
    "timelineReceipt": {
      "stages": ["ingestion", "pdf_parsing", "entity_harvest", "synthesis", "novelty_gate", "completed"],
      "totalDurationMs": 47320
    },
    "runId": "run-4e2f9c10-bb3a-4d8e-a917-c1d0f2b3e456"
  },
  "scientificAnalysis": [
    {
      "filename": "mcewen_2012_stress_hippocampus.pdf",
      "summary": "Reviews glucocorticoid-mediated structural plasticity in hippocampus.",
      "keyFindings": ["Chronic stress reduces CA3 dendritic arborisation", "BDNF is suppressed by elevated cortisol"]
    }
  ],
  "featureContext": {
    "companies": ["DeepMind"],
    "companyInsights": {
      "DeepMind": "Recent work on causal representation learning relevant to neuroscience benchmarks."
    }
  }
}
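The complete payload is deeply nested, so a small accessor can make the headline fields easier to use. This sketch reads only keys shown in the sample above and treats every one of them as possibly absent, since this page does not state which fields are guaranteed; `summarise_result` is a hypothetical helper.

```python
from typing import Any, Dict


def summarise_result(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the headline fields out of a complete event payload."""
    synthesis = payload.get("synthesis", {})
    gate = synthesis.get("noveltyGate", {})
    receipt = synthesis.get("timelineReceipt", {})
    duration_ms = receipt.get("totalDurationMs")
    return {
        "run_id": synthesis.get("runId"),
        "selected_idea": synthesis.get("selectedIdea"),
        "novelty_passed": gate.get("passed"),
        "novelty_score": gate.get("score"),
        # Convert milliseconds to seconds when the receipt is present.
        "duration_seconds": None if duration_ms is None else duration_ms / 1000,
        "analysed_files": [a.get("filename") for a in payload.get("scientificAnalysis", [])],
    }
```

Missing keys come back as `None` (or an empty list) rather than raising, which keeps the caller's handling of partial results explicit.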

error

Emitted if a fatal error occurs. The stream closes after this event.

{
  "message": "PDF parsing failed: file exceeds maximum size."
}
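Since the stream ends with either a complete event or an error event, a client can reduce it to "return the result or raise". The sketch below assumes events arrive as `(event_name, payload)` pairs; `SynthesisError` and `wait_for_result` are hypothetical names.

```python
from typing import Any, Dict, Iterable, Tuple


class SynthesisError(RuntimeError):
    """Raised when the pipeline reports a fatal error or ends early."""


def wait_for_result(events: Iterable[Tuple[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Consume events until the terminal one and return the complete payload."""
    for name, payload in events:
        if name == "error":
            # The stream closes after an error event, so stop here.
            raise SynthesisError(payload.get("message", "unknown error"))
        if name == "complete":
            return payload
    # Neither terminal event arrived, e.g. the connection dropped.
    raise SynthesisError("stream ended before a complete event arrived")
```

Treating a stream that ends without a terminal event as a failure also covers client-side timeouts and dropped connections.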

Examples

const formData = new FormData();
formData.append('files', pdfFile1);
formData.append('files', pdfFile2);
formData.append('companies', JSON.stringify(['DeepMind']));
formData.append('researchFocus', 'causal mechanisms of neuroplasticity');
formData.append('enableParallelRefinement', 'true');
formData.append('parallelConcurrency', '3');

const response = await fetch('https://wuweism.com/api/hybrid-synthesize', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer <token>' },
  body: formData,
});

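const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // Decode in streaming mode so multi-byte characters split across chunks survive.
  buffer += decoder.decode(value, { stream: true });

  // A chunk can end mid-line; keep the trailing partial line for the next read.
  const lines = buffer.split('\n');
  buffer = lines.pop();

  for (const line of lines) {
    if (line.startsWith('data: ')) {
      const event = JSON.parse(line.slice(6));
      console.log(event);
    }
  }
}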

import requests
import json

with open('paper1.pdf', 'rb') as f1, open('paper2.pdf', 'rb') as f2:
    response = requests.post(
        'https://wuweism.com/api/hybrid-synthesize',
        headers={'Authorization': 'Bearer <token>'},
        files=[('files', f1), ('files', f2)],
        data={
            'companies': json.dumps(['DeepMind']),
            'researchFocus': 'causal mechanisms of neuroplasticity',
        },
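        stream=True,
        # (connect, read) timeouts in seconds. The 10 s connect value is an
        # arbitrary choice; the read value follows the 310 s guidance on this page.
        timeout=(10, 310),
    )

# Fail early on an HTTP error status instead of parsing an error body as SSE.
response.raise_for_status()

for line in response.iter_lines():
    if line.startswith(b'data: '):
        event = json.loads(line[6:])
        print(event)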

curl --request POST \
  --url https://wuweism.com/api/hybrid-synthesize \
  --header 'Authorization: Bearer <token>' \
  --form 'companies=["DeepMind","Anthropic"]' \
  --form 'researchFocus=causal benchmarks for large language models' \
  --no-buffer

The pipeline has a hard timeout of 300 seconds. If your inputs are large (many PDFs or high parallelConcurrency), ensure your HTTP client does not time out before the complete event arrives. Set your client’s read timeout to at least 310 seconds.
