API Reference

Base URL: https://api.research2llm.com

Authentication

Job endpoints (/jobs) authenticate with the X-API-Key header, using a key from your dashboard. Account endpoints (/me, /me/keys, /me/usage) authenticate with a Bearer JWT from Kinde; they are intended for the dashboard, not for API clients.
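A minimal sketch of choosing the credential header per endpoint, assuming a Python client. The helper name auth_headers and its arguments are illustrative and not part of the API; only the X-API-Key header, the Bearer scheme, and the path prefixes come from the description above.

```python
def auth_headers(path, api_key=None, jwt=None):
    """Return the auth header for a request path (hypothetical helper)."""
    # Job endpoints authenticate with the dashboard-issued API key.
    if path == "/jobs" or path.startswith("/jobs/"):
        if not api_key:
            raise ValueError("job endpoints need an API key")
        return {"X-API-Key": api_key}
    # Account endpoints authenticate with a Bearer JWT.
    if path == "/me" or path.startswith("/me/"):
        if not jwt:
            raise ValueError("account endpoints need a Bearer JWT")
        return {"Authorization": f"Bearer {jwt}"}
    raise ValueError(f"unknown endpoint: {path}")
```

For example, auth_headers("/jobs", api_key="r2l_...") yields only the X-API-Key header, so a JWT is never sent to a job endpoint by accident.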

POST /jobs

Enqueue a research job. Returns 202 immediately; the work runs asynchronously.

Body

type (string, required) — must be "research". An unknown type is rejected with 400.
payload.topic (string, required) — the research question or topic to investigate.
payload.budget (string, default "medium") — "low" | "medium" | "high". Scales the number of sub-questions, sources scraped and deepening rounds (low ≈ 3 sub-questions, high ≈ 9). Higher budget = more coverage and cost.
payload.preset (string, default "news") — "news" (recent web + Google News, ~1-week freshness, drops PDFs & daily round-ups, strict triage), "general" (broad web, no freshness limit), or "scientific" (keeps papers/PDFs, no freshness limit, longer source bodies).
payload.language (string, default "pl") — ISO code driving BOTH the search locale and the language findings are written in. Separate from the preset (e.g. preset:"news" + language:"en" = English news).
payload.posture (string, default "narrow") — AI opt-out posture. "narrow" only honours retrieval/search opt-outs; "broad" also honours training-only opt-outs (skips more sites). Either way robots.txt is respected.
payload.max_cost_usd (number, optional) — hard spend ceiling. The engine stops early (circuit-breaker) once estimated cost reaches it, regardless of budget.
payload.sub_questions (array of strings, optional) — supply your own decomposition instead of letting the planner generate one.
payload.seed_sources (array, optional) — list of {"url":"..."} or {"text":"..."} objects that anchor the research (read & cited as kind:"seed" sources, then corroborated against fresh web results).
payload.search (object, optional) — override the preset's search: results_per_query (int), freshness (Google tbs, e.g. "qdr:w"; "" = no limit), channel ("web" | "news" | "hybrid"), site_include / site_exclude (host allow/deny lists; social hosts are auto-excluded).
payload.return_source_content (string, default "false") — "markdown" or "html" offloads each scraped source's raw content to storage and returns a content_file_id on the source. (research2llm does not parse social embeds — that's the consumer's job.)
callback_url (string, optional, top-level) — webhook POSTed on completion (success or failure). Validated against SSRF; an unsafe URL is rejected with 400.
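The field list above can be turned into a small client-side builder. This is a sketch under stated assumptions: build_job_body and ALLOWED are hypothetical names, and the checks only mirror the enumerated values documented here; the server remains the authority on what is accepted.

```python
# Enumerated values copied from the field descriptions above.
ALLOWED = {
    "budget": {"low", "medium", "high"},
    "preset": {"news", "general", "scientific"},
    "posture": {"narrow", "broad"},
}


def build_job_body(topic, callback_url=None, **payload_fields):
    """Build a POST /jobs body, rejecting enum values the reference does not list."""
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("payload.topic is required")
    for name, allowed in ALLOWED.items():
        value = payload_fields.get(name)
        if value is not None and value not in allowed:
            raise ValueError(f"payload.{name} must be one of {sorted(allowed)}")
    payload = {"topic": topic}
    # Fields left out here fall back to the documented server-side defaults.
    payload.update({k: v for k, v in payload_fields.items() if v is not None})
    body = {"type": "research", "payload": payload}
    if callback_url is not None:
        body["callback_url"] = callback_url  # top-level, not inside payload
    return body
```

Calling build_job_body("LLM context window limits", preset="news", language="en") produces a body for English news research; other documented fields such as max_cost_usd or seed_sources pass through unchanged as keyword arguments.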

Request example

curl -X POST https://api.research2llm.com/jobs \
  -H "X-API-Key: r2l_..." \
  -H "Content-Type: application/json" \
  -d '{
    "type": "research",
    "payload": {
      "topic": "LLM context window limits",
      "budget": "low",
      "seed_sources": [{"text": "optional editor note"}, {"url": "https://optional-link"}]
    }
  }'

Response (202)

{"job_id":"abc-123","status":"queued"}
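The same request can be sketched in Python with only the standard library. The function names build_submit_request and submit_job are illustrative, and the 30-second timeout is an arbitrary assumption; the URL, headers, and 202 response shape follow the example above.

```python
import json
import urllib.request

BASE_URL = "https://api.research2llm.com"


def build_submit_request(body, api_key):
    """Prepare (but do not send) the POST /jobs request."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def submit_job(body, api_key, timeout=30):
    """Send the request and return the job_id from the 202 response."""
    request = build_submit_request(body, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        if resp.status != 202:
            raise RuntimeError(f"unexpected status {resp.status}")
        return json.loads(resp.read())["job_id"]
```

Splitting request construction from sending keeps the first half easy to inspect or unit-test without network access.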

GET /jobs/{id}

Fetch a job owned by the calling API key. A job ID that belongs to another tenant returns 404, so its existence is not leaked.

When status is done, result_ref contains verified findings with citations, source metadata, and run statistics.

{
  "job_id": "abc-123",
  "type": "research",
  "status": "done",
  "result_ref": {
    "plan": {"sub_questions": ["..."]},
    "findings": [
      {"id":"f0","text":"...","kind":"fact","citations":["src_1","src_4"],
       "status":"established","confidence":1.0,
       "context":"As stated by ... on 2026-06-18"}
    ],
    "sources": [
      {"id":"src_1","url":"...","title":"...","kind":"web","channel":"web",
       "trust_score":0.6,"opt_out_tier":2,"fetched":true,
       "content_file_id":null,"content_kind":null}
    ],
    "stats": {
      "rounds":1,"sources_scraped":3,"tokens":12345,
      "cost_usd":0.07,"serper_calls":4,"stop_reason":"complete"
    },
    "artifacts": {"raw_learnings_file_id":"..."}
  },
  "error_message": null,
  "callback_url": null,
  "callback_status": null,
  "created_at": "...",
  "started_at": "...",
  "finished_at": "..."
}

status moves through queued → running → done | failed.
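Because the job runs asynchronously, a client that does not use callback_url has to poll until the status is terminal. The loop below is a sketch, not a prescribed client: wait_for_job, the 5-second interval, and the 10-minute timeout are assumptions, and fetch_job stands for any callable that performs GET /jobs/{id} and returns the parsed JSON.

```python
import time

TERMINAL = {"done", "failed"}  # the two end states named above


def wait_for_job(fetch_job, job_id, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until status is done or failed, or time runs out."""
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The sleep and clock parameters are injectable so the loop can be exercised in tests without real waiting. A caller should still inspect status on the returned job, since failed is also a terminal state.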

Result fields

plan.sub_questions — the decomposition the engine researched (auto-generated, or your sub_questions).
finding.status — established (supported by ≥2 independent sources), reported (supported by a single source), contested (evidence disputes it), NEI (not enough information), unverified.
finding.kind — fact (objective, verifiable), quote (verbatim attributed quotation), or claim (opinion / inference).
finding.context — who/when/where the finding holds, so it can't be misread out of context. citations are the source.ids asserting it; confidence is 0–1.
source.channel — where it came from: web (Google search), news (Google News), or seed (a source you supplied). opt_out_tier / fetched show whether the body was crawled or only the public snippet was used. content_file_id is set when return_source_content offloaded the raw content.
stats — rounds, sources_scraped, tokens, serper_calls, cost_usd, and stop_reason (complete | budget | max_rounds | no_progress).
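One way to consume these fields is to keep only findings with a chosen status and resolve their citations to source URLs. The helper below is a sketch: cited_findings and its defaults are hypothetical, and it assumes result_ref has the shape of the example response.

```python
def cited_findings(result_ref, statuses=("established",), min_confidence=0.0):
    """Return findings with a wanted status, with citations resolved to URLs."""
    # Index sources by id so each citation lookup is a dictionary access.
    sources = {s["id"]: s for s in result_ref.get("sources", [])}
    selected = []
    for finding in result_ref.get("findings", []):
        if finding.get("status") not in statuses:
            continue
        if finding.get("confidence", 0.0) < min_confidence:
            continue
        urls = [sources[c]["url"] for c in finding.get("citations", []) if c in sources]
        selected.append({
            "text": finding["text"],
            "kind": finding.get("kind"),
            "context": finding.get("context"),
            "urls": urls,
        })
    return selected
```

Keeping context next to the text follows the intent stated above, that a finding should not be read without knowing who said it and when. Passing statuses=("established", "reported") widens the filter to single-source findings.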

GET /me

(Bearer JWT) Returns the current user and the list of active (non-revoked) API keys.

POST /me/keys

(Bearer JWT) Create a new API key. The raw key is returned only once, in this response, so store it at creation time.

# Request
{"label":"Production"}

# Response (201) — raw key shown ONCE
{"id":"...","label":"Production","key_prefix":"r2l_a1b2c3d4","key":"r2l_...","created_at":"..."}

DELETE /me/keys/{id}

(Bearer JWT) Soft-delete a key. Subsequent requests with that key return 401.

GET /me/usage

(Bearer JWT) Usage events for all of the caller's keys, aggregated by kind — powers the dashboard's Usage panel. Per-run token and cost detail lives in each result's stats.

[{"kind":"job","count":12,"quantity_sum":12}]
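Since the response is a list with one entry per kind, a client may find it handier to index it by kind. This is a small sketch; usage_by_kind is a hypothetical helper and assumes every entry carries kind, count, and quantity_sum as in the example.

```python
def usage_by_kind(events):
    """Index the GET /me/usage list by kind, summing any repeated kinds."""
    totals = {}
    for event in events:
        entry = totals.setdefault(event["kind"], {"count": 0, "quantity_sum": 0})
        entry["count"] += event["count"]
        entry["quantity_sum"] += event["quantity_sum"]
    return totals
```

With the example response, usage_by_kind(events)["job"]["count"] gives the number of job events.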

Errors

400 — unknown job type, unsafe callback URL, or no API key to own the job.
401 — missing/invalid/revoked key, or bad JWT.
403 — account pending approval (key creation blocked).
404 — unknown job, or one owned by another key.
422 — malformed request body.
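A client can map these codes to readable errors in one place. The sketch below is illustrative: ApiError, ERROR_HINTS, and raise_for_status are hypothetical names, and the hint texts simply restate the list above.

```python
# Hints restate the documented meanings of each status code.
ERROR_HINTS = {
    400: "unknown job type, unsafe callback URL, or no API key to own the job",
    401: "missing, invalid, or revoked key, or bad JWT",
    403: "account pending approval (key creation blocked)",
    404: "unknown job, or one owned by another key",
    422: "malformed request body",
}


class ApiError(Exception):
    """Raised for a non-success HTTP status (hypothetical client-side type)."""

    def __init__(self, status, hint):
        super().__init__(f"HTTP {status}: {hint}")
        self.status = status
        self.hint = hint


def raise_for_status(status):
    """Do nothing for 2xx; raise ApiError with a documented hint otherwise."""
    if 200 <= status < 300:
        return
    raise ApiError(status, ERROR_HINTS.get(status, "undocumented error"))
```

Catching ApiError and branching on its status attribute lets calling code treat 401 (fix credentials) differently from 422 (fix the request body).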