Processing Parameters

Quick Map

Need	Read	API field	Try it
Link footnote markers to their explanation text	Footnote References	`extensions.footnote_references`	Attention Is All You Need
Parse workbooks without phantom rows or hidden-sheet surprises	Spreadsheet Processing	`spreadsheet.*`	Complex Table Document
Validate exact word positions for review, redaction, or QA	Word-Level Bounding Boxes	`extensions.alt_outputs.wlbb`	Bank Statement
Prepare text for RAG, retrieval, or agent memory	Chunking	`extensions.chunking.*`	Attention Is All You Need

Do not enable every parameter by default. Add the settings your downstream workflow actually uses, especially for high-volume or regulated pipelines where result size, latency, and auditability matter.

Parameter Guides

Footnote References

Link footnote markers to the text they qualify.

Spreadsheet Processing

Control hidden rows, hidden sheets, raw values, and phantom ranges.

Word-Level Bounding Boxes

Return exact word coordinates for review, QA, and overlays.

Chunking

Prepare extraction output for RAG, search, agents, and review queues.

Good Defaults

For self-serve exploration, start in the Platform and inspect the output tabs before saving a reusable extraction configuration.

For production, prefer:

async: true for large files, multi-step workflows, and agent tools.

storage.enabled: true when later Schema, Tables, or Split steps need to reuse the extraction.

Footnotes and page chunks when citations or audit trails matter.

Word-level boxes only for workflows that actually render or validate word coordinates.

Spreadsheet trimming for exported workbooks with inflated used ranges.

Extract API

Full request and response fields.

Build A Platform Pipeline

Configure, run, and reuse processing settings in the Platform.

​Quick Map

​Parameter Guides

Footnote References

Spreadsheet Processing

Word-Level Bounding Boxes

Chunking

​Good Defaults

Extract API

Build A Platform Pipeline

Quick Map

Parameter Guides

Good Defaults