Errors.
A small, predictable error envelope. Here's how to read it, how to handle each class, and when to retry.
§ 01 Error envelope
Every error response uses the same envelope as a successful one — same shape, just different fields populated. Read status first; if it is "error", look at the errors array for the specifics. The HTTP status code tells you the class.
{
  "request_id": "req_01HXYZABC123",
  "status": "error",
  "answer": "",
  "sources": [],
  "usage": { "tokens_in": 0, "tokens_out": 0, "latency_ms": 12 },
  "errors": [
    { "code": "INVALID_BODY", "message": "Field 'query' is required." }
  ],
  "upstream": {}
}
The errors array can carry more than one entry — for example, multiple validation failures on the request body. The first entry is the most important; clients can surface it directly to users.
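As a minimal sketch of the reading order described above, the helper below checks status first and turns an error envelope into a thrown exception. The names ApiError and unwrapEnvelope are hypothetical and not part of the API, and the "UNKNOWN" fallback code is an illustrative placeholder for an envelope that arrives with an empty errors array.

```javascript
// Hypothetical error type: carries the HTTP status plus the envelope details.
class ApiError extends Error {
  constructor(httpStatus, envelope) {
    const errors = Array.isArray(envelope.errors) ? envelope.errors : [];
    // The first entry is the most important one; fall back to a placeholder
    // (an assumption, not an API code) if the array is empty.
    const first = errors[0] ?? { code: "UNKNOWN", message: "No error details returned." };
    super(first.message);
    this.name = "ApiError";
    this.status = httpStatus;      // HTTP status: the class of the error
    this.code = first.code;        // stable string to switch on
    this.errors = errors;          // all entries, e.g. multiple validation failures
    this.requestId = envelope.request_id;
  }
}

// Returns the envelope unchanged unless its status field is "error".
function unwrapEnvelope(httpStatus, envelope) {
  if (envelope.status === "error") {
    throw new ApiError(httpStatus, envelope);
  }
  return envelope;
}
```

Because the thrown error exposes a numeric status property, it can be inspected by a retry helper that branches on err.status.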
§ 02 HTTP status code reference
Every endpoint uses the same set of HTTP codes. Treat them as the primary signal for control flow; treat the errors array as the secondary, human-facing signal.
| Code | Meaning | When it happens | Suggested action |
|---|---|---|---|
| 200 | OK | Request succeeded. | Use the response. |
| 400 | Bad request | Invalid body, malformed JSON, missing required field, oversize input. | Fix the request and resubmit. Do not retry blindly. |
| 401 | Unauthorized | Missing, malformed, revoked, or inactive credential. | Verify the key, refresh the session, or rotate the credential. Do not retry. |
| 402 | Payment required | Plan ceiling reached — daily token budget, monthly call limit. | Wait for the next period or upgrade the plan. Do not retry. |
| 404 | Not found | Unknown request_id, or the run does not belong to the caller. | Verify the id. Do not retry. |
| 409 | Conflict | Resource state conflicts with the request — e.g. revoking an already-revoked key. | Reconcile state and resubmit. |
| 429 | Too many requests | Rate limit tripped — requests per minute, concurrent streams, tokens per day. | Back off with exponential backoff and jitter. Safe to retry. |
| 500 | Internal error | Unexpected gateway-side failure. | Retry once after 1-2 seconds. If it persists, contact support with the request_id. |
| 502 | Bad gateway | Engine error or unhealthy upstream node. | Retry with backoff. Usually transient. |
| 504 | Gateway timeout | Engine took longer than the synchronous 180s window. | Retry, or switch to streaming for queries that may be slow. |
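One way to use the HTTP code as the primary control-flow signal is a single switch that maps each status in the table to a coarse next step. This is only a sketch: the function name nextStep and the returned labels are illustrative, and the default branch for codes the table does not list is an assumption.

```javascript
// Maps an HTTP status from the table above to a coarse next step.
// The returned labels are illustrative and not defined by the API.
function nextStep(status) {
  switch (status) {
    case 200: return "use_response";
    case 400: return "fix_request";          // do not retry blindly
    case 401: return "fix_credential";       // verify, refresh, or rotate
    case 402: return "wait_or_upgrade";      // plan ceiling reached
    case 404: return "verify_id";
    case 409: return "reconcile_state";
    case 429:
    case 502:
    case 504: return "retry_with_backoff";   // transient classes
    case 500: return "retry_once";           // then escalate with the request_id
    default:  return "unexpected";           // assumption: not listed in the table
  }
}
```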
§ 03 Common error codes
The code values inside the errors array are stable strings. Switch on them in code; surface message to humans.
| Code | HTTP | Description |
|---|---|---|
| INVALID_BODY | 400 | Request body is missing a required field, has the wrong type, or is malformed JSON. |
| UNAUTHORIZED | 401 | No credential, malformed credential, or unknown key prefix. |
| KEY_REVOKED | 401 | The presented key has been revoked. Generate a new one. |
| OVER_QUOTA | 402 | Plan ceiling reached. Wait for the next period or upgrade. |
| RATE_LIMITED | 429 | Too many requests in a short window. Back off and retry. |
| UPSTREAM_TIMEOUT | 504 | The engine did not return within the synchronous timeout. Use streaming for slow queries. |
| UPSTREAM_ERROR | 502 | The engine returned an error. Usually transient — retry with backoff. |
| INTERNAL | 500 | Unexpected gateway-side failure. Include the request_id if reporting. |
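The split between code (for programs) and message (for people) might look like the sketch below. The function name planFor and the retry labels "backoff", "once", and "never" are hypothetical. Treating unrecognised codes as non-retryable is an assumption made here so that codes not listed above fail safely.

```javascript
// Codes from the table above that are described as transient.
const RETRYABLE_CODES = new Set(["RATE_LIMITED", "UPSTREAM_TIMEOUT", "UPSTREAM_ERROR"]);

// Decides retry behaviour from the stable code string and keeps the
// human-readable message separate for display.
function planFor(entry) {
  let retry;
  if (RETRYABLE_CODES.has(entry.code)) {
    retry = "backoff";
  } else if (entry.code === "INTERNAL") {
    retry = "once";
  } else {
    // INVALID_BODY, UNAUTHORIZED, KEY_REVOKED, OVER_QUOTA, and (by assumption)
    // any code this client does not recognise.
    retry = "never";
  }
  return { retry, display: entry.message };
}
```

Branching only on code keeps the logic stable if the wording of message changes.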
§ 04 Retry guidance
The data-plane endpoints are safe to replay: replaying a query produces a new run with a new request_id and does not mutate previous state. That makes retries safe whenever the underlying error is transient. Two rules cover most cases.
When to retry
- Always on 429, 502, 504. These are transient by definition.
- Once on 500. After one retry, escalate.
- OK for cold-start retries on serverless lift-off: the first request to a freshly-deployed function may carry a one-off latency spike.
When not to retry
- Never on 400. The request itself is wrong; retrying will fail identically.
- Never on 401. The credential is invalid; rotate or refresh first.
- Never on 402. The plan limit is reached; retrying just consumes more attempts.
Exponential backoff
Use exponential backoff with full jitter: start at 500ms, double on each attempt, cap at 16s, and give up after 5 attempts. Jitter spreads out retries from many clients so they do not all arrive at once, and these defaults cover most rate-limit bursts and transient upstream failures.
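// fn must reject with an error that carries a numeric `status` (the HTTP code).
async function withBackoff(fn, { attempts = 5, base = 500, cap = 16000 } = {}) {
  let retried500 = false;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      const status = err.status ?? 0;
      // 429, 502 and 504 are always retryable; 500 is retried at most once.
      const transient = status === 429 || status === 502 || status === 504;
      const retryable = transient || (status === 500 && !retried500);
      // Rethrow non-retryable errors, and do not sleep after the final attempt.
      if (!retryable || i === attempts - 1) throw err;
      if (status === 500) retried500 = true;
      // Full jitter: wait a random time in [0, min(cap, base * 2^i)).
      const wait = Math.min(cap, base * 2 ** i) * Math.random();
      await new Promise(r => setTimeout(r, wait));
    }
  }
}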
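To make the schedule concrete, the sketch below computes the upper bound of each wait between attempts, before jitter is applied. The helper name backoffCeilings is hypothetical; its parameters mirror the defaults stated above.

```javascript
// Upper bound (in ms) of each wait between attempts, before jitter.
// With n attempts there are n - 1 gaps between them.
function backoffCeilings({ attempts = 5, base = 500, cap = 16000 } = {}) {
  const ceilings = [];
  for (let i = 0; i < attempts - 1; i++) {
    ceilings.push(Math.min(cap, base * 2 ** i));
  }
  return ceilings;
}

console.log(backoffCeilings());                // [500, 1000, 2000, 4000]
console.log(backoffCeilings({ attempts: 8 })); // [500, 1000, 2000, 4000, 8000, 16000, 16000]
```

With the default 5 attempts the ceilings never reach the 16s cap, so the cap only matters if you raise the attempt count. Since full jitter draws each wait uniformly below its ceiling, the total time spent waiting between the default attempts stays under 7.5s.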
If you keep hitting 504, you are using the wrong endpoint. Move slow queries to /api/v1/query/stream: the stream timeout is 300s and you see progress incrementally.