Drive & artifacts.

Where the agent reads from, where it writes, and how files move through a run. Four areas — drive, workspace, artifacts, knowledge — and a clean lifecycle between them.

§ 01 Drive

The drive is the user's file area for a project. It's where uploads land and where the agent reads from. Each project has its own drive; drives do not cross between projects.

Supported formats

Common business and analysis formats are supported out of the box.

Documents — .pdf, .docx, .txt, .md, .json
Spreadsheets — .xlsx, .csv
Images — .png, .jpg, .jpeg, .gif, .webp

Validation

Every upload is validated by magic-number signature, not just file extension. A .pdf that isn't actually a PDF is rejected at upload time. Extensions are allowlisted — formats outside the supported list are refused outright rather than silently uploaded as opaque blobs.

The original filename is retained as metadata. Internally the file may be stored under a normalized identifier, but the agent and the UI see the name you uploaded.

Caution: Files that fail signature validation are rejected with a typed error, not quarantined. If an upload is bouncing, check that the file is actually the format its extension claims.
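As an illustration of signature-based validation, here is a minimal sketch. The signatures themselves are the standard leading bytes for each format; everything else (the `validate_upload` function, the `UploadRejected` error, and the rule that plain-text formats pass if they decode as UTF-8) is an assumption made for this example, not the product's actual implementation.

```python
from pathlib import PurePath

# Leading bytes for the binary formats in the supported list.
# .docx and .xlsx are ZIP containers, so they share the ZIP signature.
SIGNATURES = {
    ".pdf": [b"%PDF-"],
    ".docx": [b"PK\x03\x04"],
    ".xlsx": [b"PK\x03\x04"],
    ".png": [b"\x89PNG\r\n\x1a\n"],
    ".jpg": [b"\xff\xd8\xff"],
    ".jpeg": [b"\xff\xd8\xff"],
    ".gif": [b"GIF87a", b"GIF89a"],
}

# Plain-text formats have no magic number. Assumption for this sketch:
# they are accepted when the bytes decode as UTF-8.
TEXT_EXTENSIONS = {".txt", ".md", ".json", ".csv"}


class UploadRejected(ValueError):
    """Hypothetical typed error raised when an upload is refused."""


def validate_upload(filename: str, data: bytes) -> None:
    """Raise UploadRejected unless the bytes match the claimed extension."""
    ext = PurePath(filename).suffix.lower()
    if ext == ".webp":
        # WebP is a RIFF container: "RIFF", four size bytes, then "WEBP".
        ok = data[:4] == b"RIFF" and data[8:12] == b"WEBP"
    elif ext in SIGNATURES:
        ok = any(data.startswith(sig) for sig in SIGNATURES[ext])
    elif ext in TEXT_EXTENSIONS:
        try:
            data.decode("utf-8")
            ok = True
        except UnicodeDecodeError:
            ok = False
    else:
        raise UploadRejected(f"extension {ext!r} is not on the allowlist")
    if not ok:
        raise UploadRejected(f"content is not a valid {ext} file")
```

In this sketch, a file named report.pdf whose bytes begin with the ZIP signature is rejected, which mirrors the "a .pdf that isn't actually a PDF" case described above.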

Full-text extraction

On upload, every supported file goes through a text-extraction pipeline so the agent can read it as text rather than as a blob:

PDFs — text layer extraction, with OCR fallback for scanned pages.
Office — docx/xlsx parsed into structured text and tables.
Images — OCR for any text content.
JSON / CSV / TXT / Markdown — read as text directly.

Extracted text is the substrate the agent reasons over. The original file remains available for download and for tools that need the raw bytes.
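The routing described above can be pictured as a lookup from file extension to extraction strategy. This is only a sketch: the strategy labels and the `extraction_strategy` function are hypothetical names chosen for illustration, and the real pipeline may route on detected content rather than on the extension.

```python
from pathlib import PurePath

# Extension -> extraction strategy, following the list above.
# The strategy names are labels invented for this sketch.
EXTRACTION_STRATEGY = {
    ".pdf": "text_layer_with_ocr_fallback",
    ".docx": "office_parse",
    ".xlsx": "office_parse",
    ".png": "ocr",
    ".jpg": "ocr",
    ".jpeg": "ocr",
    ".gif": "ocr",
    ".webp": "ocr",
    ".json": "read_as_text",
    ".csv": "read_as_text",
    ".txt": "read_as_text",
    ".md": "read_as_text",
}


def extraction_strategy(filename: str) -> str:
    """Return the strategy label for a supported file, else raise ValueError."""
    ext = PurePath(filename).suffix.lower()
    if ext not in EXTRACTION_STRATEGY:
        raise ValueError(f"no extraction strategy for {ext!r}")
    return EXTRACTION_STRATEGY[ext]
```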

§ 02 Workspace

The workspace is the agent's scratch space during a run. It holds intermediate files, working copies, partial outputs, and anything else the agent produces along the way that isn't the final deliverable.

The workspace is the agent's, not the user's. You don't drop files into it directly. The agent writes here through tool calls and reads from here as the run progresses. Between runs the workspace is cleared unless something has been promoted to an artifact.

§ 03 Artifacts

An artifact is a deliverable — the part of a run that survives. Where the workspace is scratch, artifacts are output.

The agent saves an artifact through save_output(). Each artifact has:

A filename — usually descriptive (contract_summary.md, invoices.csv).
A content type — derived from the file content.
A run association — every artifact belongs to exactly one run.
A created_at timestamp.

Artifacts are listed on the run record in the workspace UI and accessible programmatically. They are downloadable as files, viewable inline (where the format allows), and persistent — they survive session expiry, login cycles, and time.
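For readers consuming artifacts programmatically, the four attributes listed above map naturally onto a small record type. The sketch below is an assumption about shape only: the field names `content_type` and `run_id`, the example values, and the `Artifact` class itself are hypothetical, and the signature of save_output() is not documented here, so it is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Artifact:
    """Hypothetical client-side view of one artifact's metadata."""
    filename: str         # descriptive name, e.g. contract_summary.md
    content_type: str     # derived from the file content by the platform
    run_id: str           # the single run this artifact belongs to
    created_at: datetime  # creation timestamp


# Example values are made up for illustration.
artifact = Artifact(
    filename="contract_summary.md",
    content_type="text/markdown",
    run_id="run_123",
    created_at=datetime.now(timezone.utc),
)
```

The record is frozen because, in this sketch, an artifact's metadata is treated as fixed once the run has saved it.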

§ 04 Knowledge base

The knowledge base is the agent's per-session memory. It captures, in a structured way, what the agent has read or seen during the session: contents of files it has loaded, output from terminal commands it has run, browser pages it has visited, and chat messages it has exchanged.

The knowledge base is what lets the agent answer follow-up questions without re-reading the same files. "What was the renewal clause again?" doesn't trigger a fresh read of the contract — it queries the knowledge base.

Knowledge base lifecycle:

Per-session. The knowledge base is scoped to the workspace session.
Survives across runs in the session. Run two queries in the same session and the second one inherits the first's reads.
Cleared with the session. When the session ends, the knowledge base is cleared. Files in the drive and artifacts in the run records remain — only the indexed-memory layer is reset.
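The lifecycle above can be modeled with a toy in-memory store. This is a sketch of the scoping rules only, not of how the knowledge base indexes or queries content; the `SessionKnowledgeBase` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SessionKnowledgeBase:
    """Toy stand-in: one instance per workspace session."""
    entries: Dict[str, str] = field(default_factory=dict)

    def remember(self, source: str, text: str) -> None:
        # Record something the agent read or saw during a run.
        self.entries[source] = text

    def recall(self, source: str) -> Optional[str]:
        # Later runs in the same session can read earlier entries.
        return self.entries.get(source)

    def end_session(self) -> None:
        # Only the memory layer is reset; drive files and artifacts
        # live elsewhere and are not touched here.
        self.entries.clear()


kb = SessionKnowledgeBase()
kb.remember("contract.pdf", "Renewal clause: renews annually.")  # run one
clause = kb.recall("contract.pdf")                               # run two
kb.end_session()
after_end = kb.recall("contract.pdf")
```

Here `clause` holds the text stored during the first run, and `after_end` is `None` because the session has ended.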

§ 05 Listing, downloading, deleting outputs

Outputs (artifacts) are exposed as a small REST surface. Three endpoints cover the lifecycle.

List outputs

GET /api/outputs

Returns the artifacts visible to the caller, scoped to their projects. Supports filtering by project, run, and time range.
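A listing call with filters might be assembled as follows. The path /api/outputs comes from this page; the host and the query parameter names (`project`, `run`, `since`, `until`) are assumptions for illustration, so check the API overview for the real names before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://example.invalid"  # placeholder host, not a real endpoint


def list_outputs_url(
    project: Optional[str] = None,
    run: Optional[str] = None,
    since: Optional[str] = None,
    until: Optional[str] = None,
) -> str:
    """Build a GET /api/outputs URL, including only the filters that are set."""
    # Parameter names below are hypothetical.
    params = {"project": project, "run": run, "since": since, "until": until}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/outputs" + (f"?{query}" if query else "")
```

Calling `list_outputs_url()` with no arguments yields the bare listing URL, and each filter that is supplied is appended as a query parameter.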

Fetch a single output

GET /api/outputs/:id

Returns the metadata and content of a specific artifact. Use this to download the file or stream it inline.

Delete an output

DELETE /api/outputs/:id

Soft-deletes an artifact. The artifact is removed from listings; the run record still references it but the content is no longer retrievable.

Note: Deletes are scoped to the owning user. You cannot delete an artifact you do not own, and admin deletes are recorded in the audit trail.
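The fetch and delete endpoints differ only in HTTP method, so a single helper can build either request. This sketch uses Python's standard `urllib.request.Request`; the host, the `output_request` helper, and the example ID are hypothetical, and authentication is omitted because this page does not describe it.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://example.invalid"  # placeholder host, not a real endpoint


def output_request(output_id: str, method: str = "GET") -> Request:
    """Build a request for GET or DELETE /api/outputs/:id without sending it."""
    if method not in ("GET", "DELETE"):
        raise ValueError(f"unsupported method: {method}")
    # Percent-encode the ID so it cannot alter the path.
    url = f"{BASE_URL}/api/outputs/{quote(output_id, safe='')}"
    return Request(url, method=method)


fetch = output_request("out_42")             # hypothetical artifact ID
delete = output_request("out_42", "DELETE")
```

Passing either object to `urllib.request.urlopen` would send it. Because the delete is a soft delete, a later fetch of the same ID should be expected to fail to return content.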

§ 06 File preview & viewing

Both drive files and run artifacts can be previewed inline in the workspace UI without downloading.

Markdown / text / JSON — rendered with syntax highlighting.
CSVs — rendered as a sortable table.
PDFs — rendered through the in-app viewer with selection support.
Images — displayed at native resolution with OCR text underlay.

For unsupported formats the workspace falls back to a download link with the original filename and content type.

§ 07 Where to go next

How runs use these files end to end → Runtime & subagents.
How jobs produce typed artifacts → Skills & jobs.
The full API surface → API overview.