Citations, by default.
Every source carries a ready-made citation in every supported format. Citations are not generated by the model; they are built by a deterministic pipeline that extracts metadata, formats it, validates it, and persists it alongside the run.
§ 01 Citation pipeline
Citations move through five stages, each one auditable.
Metadata extraction
During scraping, the engine pulls bibliographic metadata from the page — title, author or byline, publisher or site name, publication date, access date (the time of scrape), and canonical URL. Where multiple sources of metadata disagree (HTML <meta> vs OpenGraph vs schema.org vs visible byline), the most reliable signal wins.
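One way to picture the conflict resolution is a fixed priority list that is walked until a non-empty value turns up. The sketch below is illustrative only: the ranking in `PRIORITY`, the source labels, and the `resolve_field` helper are assumptions, since the actual ordering the engine uses is not stated here.

```python
from typing import Optional

# Hypothetical ranking, most trusted first. The real ordering is not
# documented here; this one exists only to make the sketch concrete.
PRIORITY = ["schema_org", "opengraph", "html_meta", "visible_byline"]


def resolve_field(candidates: dict[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty value, walking sources in priority order."""
    for source in PRIORITY:
        value = candidates.get(source)
        if value and value.strip():
            return value.strip()
    return None


author = resolve_field({
    "html_meta": "J. Doe",
    "opengraph": None,
    "schema_org": "Jane Doe",
    "visible_byline": "By Jane Doe",
})
print(author)  # Jane Doe
```

Because the order is fixed, the same set of candidate values always resolves to the same result, which keeps this stage as repeatable as the builder that follows it.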
Builder
The citation builder takes the metadata record and renders it in each supported format. The builder is rules-based and deterministic — the same input always produces the same output. There is no model invocation in this step.
Validation
Each rendered citation is validated against format rules — required fields present, capitalization correct, punctuation correct, URLs well-formed. Validation failures are logged but do not abort the run; the citation is emitted with a warning flag instead.
Persistence
The validated citations are stored against the run, keyed by source and format. They are also attached to the source record itself, so any downstream re-use of the source comes with its citations attached.
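The double bookkeeping described above (keyed by run, and attached to the source) could be modeled roughly as follows. This is an in-memory sketch under assumed names; `CitationStore` and its methods are hypothetical and say nothing about the real storage layer.

```python
from collections import defaultdict


class CitationStore:
    """Hypothetical in-memory store; illustrates the keying scheme only."""

    def __init__(self) -> None:
        # run_id -> {(source_id, format): rendered citation}
        self._by_run: dict = defaultdict(dict)
        # source_id -> {format: rendered citation}
        self._by_source: dict = defaultdict(dict)

    def save(self, run_id: str, source_id: str, fmt: str, text: str) -> None:
        # Store against the run, keyed by source and format ...
        self._by_run[run_id][(source_id, fmt)] = text
        # ... and attach to the source so later re-use carries it along.
        self._by_source[source_id][fmt] = text

    def for_run(self, run_id: str, fmt: str) -> list[str]:
        """All citations for a run in one format, ordered by source id."""
        items = sorted(self._by_run[run_id].items())
        return [text for (_, f), text in items if f == fmt]

    def for_source(self, source_id: str) -> dict[str, str]:
        return dict(self._by_source[source_id])


store = CitationStore()
store.save("run-1", "src-b", "mla", "B citation")
store.save("run-1", "src-a", "mla", "A citation")
store.save("run-1", "src-a", "apa", "A citation (APA)")
print(store.for_run("run-1", "mla"))
```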
Export
Citations are retrievable through the API and the UI in three shapes: a single citation in a specific format, all citations for a run in one format, and a bulk export with all formats together.
§ 02 Supported formats
The builder supports seven citation formats. Each is implemented to current style-guide conventions; examples for MLA and BibTeX are shown below.
MLA
Modern Language Association, ninth edition. Used in literature and the humanities.
Example: Doe, Jane. "Title of the Article." Site Name, 14 Mar. 2024, example.com/article. Accessed 1 May 2026.
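To illustrate what a rules-based renderer for this format might look like, here is a minimal sketch that reproduces the example above from its parts. The function names and the argument layout are assumptions for illustration, and the sketch covers only the single-author web-page case.

```python
from datetime import date

# MLA abbreviates month names longer than four letters.
MLA_MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
              "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]


def mla_date(d: date) -> str:
    """Render a date in MLA day-month-year order, e.g. '14 Mar. 2024'."""
    return f"{d.day} {MLA_MONTHS[d.month - 1]} {d.year}"


def render_mla(author: str, title: str, site: str,
               published: date, url: str, accessed: date) -> str:
    # The example above shows the URL without its scheme, so drop it here.
    short_url = url.split("://", 1)[-1]
    return (f'{author}. "{title}." {site}, {mla_date(published)}, '
            f'{short_url}. Accessed {mla_date(accessed)}.')


citation = render_mla(
    "Doe, Jane", "Title of the Article", "Site Name",
    date(2024, 3, 14), "https://example.com/article", date(2026, 5, 1),
)
print(citation)
```

The function has no hidden state and no randomness, so identical inputs always yield an identical string.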
APA
American Psychological Association, seventh edition. Used in the social sciences.
Chicago Notes
Chicago Manual of Style, notes-and-bibliography variant. Used in history and the humanities.
Chicago Author-Date
Chicago Manual of Style, author-date variant. Used in the natural and social sciences.
IEEE
Institute of Electrical and Electronics Engineers. Numeric, used in engineering and computer science.
Harvard
Harvard-style author-date referencing. Used widely outside the United States.
BibTeX
The reference format used by LaTeX and most academic toolchains. The exporter produces a clean BibTeX entry per source.
@misc{doe2024title,
  author = {Doe, Jane},
  title = {Title of the Article},
  howpublished = {\url{https://example.com/article}},
  year = {2024},
  month = {mar},
  note = {Accessed: 2026-05-01}
}
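An entry of this shape can be produced with plain string formatting. The sketch below is an assumption-laden illustration: the `bibtex_entry` function, its parameters, and the key scheme (last name, year, first word of the title) are inferred from the sample entry rather than taken from the exporter itself.

```python
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def bibtex_entry(last: str, first: str, title: str, url: str,
                 year: int, month: int, accessed_iso: str) -> str:
    """Build a @misc entry; key scheme is inferred from the sample above."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    first_word = words[0].lower() if words else "untitled"
    key = f"{last.lower()}{year}{first_word}"
    fields = [
        ("author", f"{last}, {first}"),
        ("title", title),
        ("howpublished", f"\\url{{{url}}}"),
        ("year", str(year)),
        ("month", MONTHS[month - 1]),
        ("note", f"Accessed: {accessed_iso}"),
    ]
    # Each field is wrapped in braces and indented by two spaces.
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@misc{{{key},\n{body}\n}}"


entry = bibtex_entry("Doe", "Jane", "Title of the Article",
                     "https://example.com/article", 2024, 3, "2026-05-01")
print(entry)
```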
§ 03 Citation metadata fields
Every source carries a structured metadata record that feeds the builder. The full set:
- title — the article or page title.
- author — author or byline; multiple authors supported.
- publisher — publisher or site name.
- year — publication year (and where available, full date).
- access_date — the date the engine scraped the source.
- url — the canonical URL.
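As a rough picture of the record, the fields above map naturally onto a small typed structure. The class name, the types, and the choice to model the optional full date as a separate `published` attribute are assumptions made for this sketch, not the engine's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceMetadata:
    """Hypothetical shape of the metadata record that feeds the builder."""
    title: str
    url: str                          # canonical URL
    access_date: date                 # date the engine scraped the source
    author: tuple[str, ...] = ()      # zero or more authors
    publisher: Optional[str] = None   # publisher or site name
    year: Optional[int] = None        # publication year
    published: Optional[date] = None  # full date, where available


record = SourceMetadata(
    title="Title of the Article",
    url="https://example.com/article",
    access_date=date(2026, 5, 1),
    author=("Doe, Jane",),
    publisher="Site Name",
    year=2024,
    published=date(2024, 3, 14),
)
print(record.author[0], record.year)
```

Making the record immutable (`frozen=True`) is one way to guarantee that every format is rendered from exactly the same input.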
§ 04 Fallbacks
Real-world web pages frequently lack one or more of these fields. The builder applies conventional fallbacks rather than refusing to emit a citation.
- Missing year → n.d. ("no date") in the rendered citation.
- Missing publisher → the bare domain name is used (e.g. example.com).
- Missing author → publisher takes the author position, per each format's rules.
- Missing title → the page's <title> tag, or as a last resort, the URL slug humanized.
- Access date → always present; defaults to the run's scrape time.
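The fallback rules can be sketched as a single pass over a metadata dictionary. Everything named here (`apply_fallbacks`, the dictionary keys, the way a slug is humanized) is an assumption for illustration. The sketch also simplifies the missing-author rule by copying the publisher into the author field, whereas the real placement depends on each format.

```python
from typing import Optional
from urllib.parse import urlparse


def apply_fallbacks(meta: dict, page_title_tag: Optional[str] = None,
                    scrape_date: Optional[str] = None) -> dict:
    """Return a copy of meta with the conventional fallbacks filled in."""
    out = dict(meta)
    parsed = urlparse(out["url"])
    if not out.get("year"):
        out["year"] = "n.d."                 # "no date"
    if not out.get("publisher"):
        out["publisher"] = parsed.hostname   # bare domain name
    if not out.get("author"):
        out["author"] = out["publisher"]     # simplified: format rules vary
    if not out.get("title"):
        # Prefer the <title> tag; otherwise humanize the last path segment.
        slug = parsed.path.rstrip("/").rsplit("/", 1)[-1]
        humanized = slug.replace("-", " ").replace("_", " ").capitalize()
        out["title"] = page_title_tag or humanized
    if not out.get("access_date"):
        out["access_date"] = scrape_date     # defaults to the scrape time
    return out


filled = apply_fallbacks({"url": "https://example.com/my-great-post"},
                         scrape_date="2026-05-01")
print(filled)
```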
§ 05 Export endpoints
Citations are exposed through three endpoint shapes, plus an embedded HTML export of the finished report.
Per-source, per-format
Fetch a single source's citation in a specific format. Useful when you're building a manuscript and need one entry at a time.
Per-run, per-format
Fetch every citation for a run in a specific format, as a single block. The most common shape for a finished bibliography.
Bulk export
Download a ZIP that contains the citations in every supported format, plus a manifest mapping sources to formats. Useful when you don't yet know which citation style the final document will require.
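For a sense of what such an archive could contain, the sketch below assembles one in memory with Python's standard `zipfile` module. The file layout (one `<format>.txt` per format plus `manifest.json`) and the `build_bulk_export` helper are assumptions; the actual archive structure is not specified here.

```python
import io
import json
import zipfile


def build_bulk_export(citations: dict[str, dict[str, str]]) -> bytes:
    """citations maps source_id -> {format_name: rendered citation}."""
    by_format: dict[str, list[str]] = {}
    manifest: dict[str, list[str]] = {}
    for source_id in sorted(citations):
        # The manifest records which formats exist for each source.
        manifest[source_id] = sorted(citations[source_id])
        for fmt, text in citations[source_id].items():
            by_format.setdefault(fmt, []).append(text)

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for fmt in sorted(by_format):
            zf.writestr(f"{fmt}.txt", "\n".join(by_format[fmt]) + "\n")
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()


archive = build_bulk_export({
    "src-a": {"mla": "A in MLA", "apa": "A in APA"},
    "src-b": {"mla": "B in MLA"},
})
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())
```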
Embedded HTML
The completed report can be exported as an HTML document with the citations rendered inline at the foot of the document. The format is selectable.
For exact endpoint paths and parameters, see the API reference.
§ 06 Validation rules
Each format applies a small set of validation rules before a citation is considered emit-ready.
- Required fields present — at minimum: title, year (or n.d.), URL, access date.
- Punctuation — the format's prescribed punctuation is enforced.
- Capitalization — title-case or sentence-case applied per format.
- Author rendering — the first author is rendered as "Last, First"; subsequent authors follow each format's rules.
- URL well-formed — must parse as a URL; tracking parameters stripped.
A citation that fails validation is still emitted, but with a flag on the record so downstream consumers can choose to surface or hide it.
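The flag-instead-of-abort behavior can be sketched with two of the rules, required fields and URL handling. The names below (`validate`, `strip_tracking`, the `warnings` and `flagged` keys) and the list of tracking parameters are assumptions for illustration; the real rule set and record shape may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Assumed examples of tracking parameters; the real list is not documented here.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid", "gclid"}
REQUIRED = ("title", "year", "url", "access_date")


def strip_tracking(url: str) -> str:
    """Drop known tracking parameters and keep everything else in order."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.lower().startswith(TRACKING_PREFIXES)
            and k.lower() not in TRACKING_KEYS]
    return urlunparse(parts._replace(query=urlencode(kept)))


def validate(record: dict) -> dict:
    """Never raises: failures are collected and the record is flagged."""
    warnings = []
    for name in REQUIRED:
        if not record.get(name):
            warnings.append(f"missing required field: {name}")
    parts = urlparse(record.get("url") or "")
    if parts.scheme not in ("http", "https") or not parts.netloc:
        warnings.append("url is not well-formed")
    else:
        record = {**record, "url": strip_tracking(record["url"])}
    return {**record, "warnings": warnings, "flagged": bool(warnings)}


ok = validate({"title": "Title of the Article", "year": "n.d.",
               "url": "https://example.com/article?utm_source=x&id=7",
               "access_date": "2026-05-01"})
bad = validate({"title": "", "year": "2024", "url": "not a url",
                "access_date": "2026-05-01"})
print(ok["url"], ok["flagged"], bad["warnings"])
```

Returning the record together with its warnings, rather than raising, leaves the decision to surface or hide a flagged citation with the consumer.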