Integration patterns.
A practical catalogue of the four shapes most teams settle into when they wire Essarion into their own product — when each one fits, what it costs, and the load-bearing details you'll want to get right the first time.
§ 01 Pattern A — server-side proxy
The default. Your backend holds the esk_ key. Your client never sees it. Every call from your frontend goes to your API, which authenticates the user with your own auth, then forwards the request to api.essarion.com with the bearer token.
Conceptually:
client → your API → Essarion API → upstream engine
Here's a minimal Express handler. It accepts a POST /research from your authenticated client, attaches your Essarion key, forwards the body, and returns the JSON response to your client.
import express from "express";

const app = express();
app.use(express.json());

// `requireUser` is your own auth middleware; `db` is your own data layer.
app.post("/research", requireUser, async (req, res) => {
  const upstream = await fetch("https://api.essarion.com/api/v1/query", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.ESSARION_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: req.body.query }),
  });
  if (!upstream.ok) {
    return res.status(502).json({ error: "upstream query failed" });
  }
  const data = await upstream.json();

  // Log the run association for audit + billing.
  await db.runs.insert({
    user_id: req.user.id,
    request_id: data.request_id,
    cost_usd: data.usage?.cost_usd ?? 0,
    created_at: new Date(),
  });

  // Forward only what the client needs; usage.cost_usd stays server-side.
  res.json({
    request_id: data.request_id,
    answer: data.answer,
    citations: data.citations,
  });
});
Pros. The key never leaves your infrastructure. You get a single chokepoint for audit logging, rate-shaping, prompt sanitization, and per-user attribution. You can reject queries that violate your own policy before they ever hit Essarion.
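That pre-forward rejection can be a plain pure function in front of the fetch. A sketch only: the length cap and blocklist here are hypothetical stand-ins for whatever your actual policy is.

```javascript
// Hypothetical policy gate, run before forwarding to Essarion.
// Returns { ok, reason } so the handler can 400 with a useful message.
function validateQuery(query, { maxLength = 4000, blocked = [] } = {}) {
  if (typeof query !== "string" || query.trim().length === 0) {
    return { ok: false, reason: "query must be a non-empty string" };
  }
  if (query.length > maxLength) {
    return { ok: false, reason: `query exceeds ${maxLength} characters` };
  }
  const hit = blocked.find((term) => query.toLowerCase().includes(term));
  if (hit) return { ok: false, reason: `query matches blocked term "${hit}"` };
  return { ok: true };
}
```

In the handler above, call it first and return res.status(400) on a failure, so rejected queries never cost you an Essarion token.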
Cons. One extra hop on every request. You're responsible for matching Essarion's timeouts (be generous — 120 seconds at minimum). Streaming through this hop requires a little more work; see Pattern C.
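One way to enforce that deadline on Node 18+, where fetch and AbortSignal.timeout are built in. forwardQuery is a hypothetical helper, not part of any Essarion SDK, and 120 seconds is a floor, not a ceiling: deep research runs can legitimately take longer.

```javascript
// Bound the upstream call so a hung request cannot pin your handler forever.
const UPSTREAM_TIMEOUT_MS = 120_000; // be generous; see the note above

async function forwardQuery(query) {
  const upstream = await fetch("https://api.essarion.com/api/v1/query", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.ESSARION_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query }),
    // Aborts the request (and rejects the promise) after the deadline.
    signal: AbortSignal.timeout(UPSTREAM_TIMEOUT_MS),
  });
  if (!upstream.ok) throw new Error(`essarion upstream ${upstream.status}`);
  return upstream.json();
}
```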
Strip usage.cost_usd from the response before sending it to your client unless you intentionally surface it. End-users don't need to see Essarion's per-token costs; they need to see your billing.

§ 02 Pattern B — async background jobs
For deeper queries — five-minute multi-source research runs, batch reports, anything where you don't want a request hanging — submit the query, return a job ID immediately, and resolve completion out-of-band.
The trick is that Essarion's request_id is already a perfectly good job key. You don't need to invent your own.
// 1. Submit and return immediately.
app.post("/jobs", requireUser, async (req, res) => {
  const r = await fetch("https://api.essarion.com/api/v1/query", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.ESSARION_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: req.body.query, async: true }),
  });
  const { request_id } = await r.json();
  await db.jobs.insert({ user_id: req.user.id, request_id, status: "running" });
  res.status(202).json({ job_id: request_id });
});

// 2. Client polls for the result.
app.get("/jobs/:id", requireUser, async (req, res) => {
  // In production, check that this job belongs to req.user before proxying.
  const r = await fetch(`https://api.essarion.com/api/v1/runs/${req.params.id}`, {
    headers: { "Authorization": `Bearer ${process.env.ESSARION_KEY}` },
  });
  res.json(await r.json());
});
Pros. Long queries don't block your UI. You can fan out hundreds of jobs in parallel without holding hundreds of connections open. Failures are recoverable — the run keeps going on Essarion's side even if your worker dies.
Cons. Polling introduces latency. If you poll every 2 seconds, the user waits up to 2 seconds after completion to see the result. Webhook-based completion (currently in private beta) avoids this.
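A sketch of the worker-side wait, with capped exponential backoff so young runs poll fast and long runs slow down. The status strings and the fetchRun wrapper are assumptions for illustration, not a documented contract.

```javascript
// Capped exponential backoff: 1s, 2s, 4s, ... up to capMs.
function pollDelayMs(attempt, { baseMs = 1000, capMs = 15000 } = {}) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function waitForRun(fetchRun, requestId, opts = {}) {
  for (let attempt = 0; ; attempt += 1) {
    // `fetchRun` is a hypothetical wrapper around GET /api/v1/runs/{id};
    // the "running"/"completed"/"failed" statuses are assumptions.
    const run = await fetchRun(requestId);
    if (run.status === "completed" || run.status === "failed") return run;
    await new Promise((resolve) => setTimeout(resolve, pollDelayMs(attempt, opts)));
  }
}
```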
If you already operate a background job queue, enqueue a poll-essarion-run job with the request_id and let your existing queue manage the lifecycle.

§ 03 Pattern C — streaming relay
When you want live, token-by-token output in a user's browser and you can't ship the API key to the client, you need to relay. Your client opens an SSE connection to your server. Your server opens an upstream connection to Essarion's stream endpoint and forwards each event as it arrives.
app.post("/stream", requireUser, async (req, res) => {
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  // Abort the upstream fetch if the client goes away mid-stream.
  const abort = new AbortController();
  req.on("close", () => abort.abort());

  const upstream = await fetch("https://api.essarion.com/api/v1/query/stream", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.ESSARION_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: req.body.query }),
    signal: abort.signal,
  });

  // Forward upstream SSE bytes as they arrive — no buffering.
  const reader = upstream.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    res.write(decoder.decode(value));
  }
  res.end();
});
Pros. Live UI. Users see progress, sources, partial reasoning. The key never reaches the browser.
Cons. Long-lived connections — your infrastructure has to be sized for them. Many PaaS hosts (Vercel, Heroku, Cloud Run) have hard request timeouts at 30, 60, or 300 seconds; check yours. Some load balancers buffer SSE by default; see Streaming · Common pitfalls.
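On the consuming side, remember that network chunks can split an SSE frame anywhere, so buffer until the blank-line terminator before parsing. A minimal splitter, hand-rolled here for illustration (the browser's built-in EventSource handles this framing for you, but only over GET):

```javascript
// Accumulate `chunk` onto `buffer`, emit complete SSE frames, and return
// the unterminated remainder to carry into the next call.
function splitSseEvents(buffer, chunk) {
  const text = buffer + chunk;
  const frames = text.split("\n\n");
  const rest = frames.pop(); // possibly-incomplete trailing frame
  const events = frames
    .filter((f) => f.trim().length > 0)
    .map((f) =>
      f
        .split("\n")
        .filter((line) => line.startsWith("data: "))
        .map((line) => line.slice(6))
        .join("\n")
    );
  return { events, rest };
}
```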
Don't rely on upstream.body.pipe(res) without flushing. Some Node frameworks buffer the response by default. Flush headers first and write chunks explicitly.

§ 04 Pattern D — embedded research UI
For partners who want their end-users to live inside ResearchAnything.ai itself — with their own branding and SSO — there's an embedded mode. You bring your own auth, we bridge it to Essarion accounts via OAuth, and your users see a co-branded research surface.
This is invite-required and out of scope for self-serve docs. The high-level shape:
- You register an OAuth client with Essarion.
- Your users sign in to your product.
- You exchange your user's identity for an Essarion-issued session token via the OAuth bridge.
- Your users open research.essarion.com/embed?token=... inside an iframe or in a popout.
§ 05 Storing run timelines
Every query produces a request_id. That ID is the canonical handle for the run. Persist it in your database at submit time, not at completion — if the client disconnects mid-run, you can still recover the result.
You have two reasonable storage strategies:
- Lazy / on-demand. Store only the request_id, the user, and a timestamp. Fetch the timeline from /api/v1/runs/{request_id} when someone asks for it. Cheapest. Best for low-traffic apps and admin tools.
- Eager / cached. On completion, fetch the full timeline once and cache the JSON in your own database. Now timelines are still available even if the user's plan changes or Essarion's retention window passes.
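The two strategies compose into one read path. A read-through sketch, where store and fetchTimeline are hypothetical adapters over your own database and GET /api/v1/runs/{request_id}:

```javascript
// Serve the cached timeline if one was materialized at completion (eager
// path); otherwise fetch on demand and cache it (lazy path), so later
// reads survive plan changes and retention windows.
async function getTimeline(store, fetchTimeline, requestId) {
  const cached = await store.get(requestId);
  if (cached) return cached;
  const timeline = await fetchTimeline(requestId);
  await store.put(requestId, timeline);
  return timeline;
}
```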
§ 06 Cost shaping
Essarion bills you per token at engine prices. You decide how to bill your users. Most teams want some combination of: monthly token caps per user, hard refusal beyond a budget, and a usage UI that's denominated in their own product's terms (queries, reports, credits).
The basic loop:
- Track tokens-spent-per-user in your own DB by reading usage.tokens_in and usage.tokens_out from each response.
- On every incoming request, sum the user's spend for the current period.
- If they're over their cap, reject before forwarding to Essarion.
- Surface remaining budget to the user via your own dashboard.
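The cap check itself is just arithmetic over rows you've already logged. A sketch, assuming you persist usage.tokens_in and usage.tokens_out per request as described above:

```javascript
// Sum a user's logged token usage for the current billing period.
function tokensUsed(rows) {
  return rows.reduce((sum, r) => sum + r.tokens_in + r.tokens_out, 0);
}

// Gate an incoming request against a per-user token cap.
function checkBudget(rows, capTokens) {
  const used = tokensUsed(rows);
  return {
    used,
    remaining: Math.max(0, capTokens - used),
    allowed: used < capTokens,
  };
}
```

Run checkBudget before forwarding; on `allowed: false`, reject with your own over-budget error, and surface `remaining` in your dashboard.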
Don't expose the /api/usage endpoint directly to your end-users. It returns your account's totals across all of your customers — the wrong number for any one of them. Aggregate per-user yourself from the per-request usage field.

For very granular cost shaping — model routing, max-token caps per query, plan-tier limits — see the policy object in API · Queries.
§ 07 Where to go next
- How streaming actually works on the wire → Streaming with SSE
- Working with citations in your output → Working with citations
- Full endpoint reference → API · Queries
- Authentication details → API · Authentication