extraction_id. The question is what you do next.
Quick Picker
| Workflow | Use when | Avoid when |
|---|---|---|
| Extract | You need markdown, layout-aware text, figures, chunks, or general document content. | You need normalized fields or table reconstruction as the final output. |
| Extract -> Schema | One schema describes the whole document: invoices, applications, policies, statements. | Different sections need different fields or instructions. |
| Extract -> Tables | You need table HTML with row/column structure, merged tables, or chart-to-table conversion. | You only need a few values from a table; Schema may be simpler. |
| Extract -> Split | You need page groups by topic for routing, review, or downstream processing. | The whole document should be processed with one schema. |
| Extract -> Split -> Schema | Long mixed documents need per-topic schemas: annual reports, diligence packs, claim files. | A single schema can handle the document reliably. |
| Batch | You need the same workflow across many documents. | You are still designing the pipeline on a single sample. |
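Read top to bottom, the picker behaves like a small decision procedure. The sketch below is one hedged reading of the table, not part of any API: the `Needs` fields and the `pick_workflow` function are hypothetical names invented for illustration, and the order of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical summary of what you want out of a document."""
    named_fields: bool = False       # final output is normalized JSON fields
    table_structure: bool = False    # table HTML, merged tables, chart-to-table
    page_groups: bool = False        # pages grouped by topic
    per_topic_schemas: bool = False  # different sections need different fields
    many_documents: bool = False     # same workflow across many documents


def pick_workflow(needs: Needs) -> str:
    """Return the picker row that matches `needs` (illustrative ordering)."""
    if needs.named_fields and needs.per_topic_schemas:
        steps = "Extract -> Split -> Schema"
    elif needs.named_fields:
        steps = "Extract -> Schema"
    elif needs.table_structure:
        steps = "Extract -> Tables"
    elif needs.page_groups:
        steps = "Extract -> Split"
    else:
        steps = "Extract"
    # Batch runs the chosen workflow over many documents.
    return f"Batch: {steps}" if needs.many_documents else steps


print(pick_workflow(Needs(named_fields=True)))  # Extract -> Schema
```

The fall-through to plain Extract mirrors the first row of the table: if none of the downstream needs apply, the extraction output is the final result.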
Common Decisions
Schema or Tables?
Use Schema when your final output is a JSON object with named fields. Use Tables when you need table HTML that preserves row and column structure.
Split or Page Range?
Use a page range when you already know where the content lives, such as 1-5 for a cover memo.
Use Split when the location changes across documents or when topics are semantic rather than fixed by page number. Split assigns pages to named topics and returns a split_id you can reuse with Schema or Tables.
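To make the distinction concrete, here is a minimal sketch of the two ways of locating content. Everything in it is an assumption for illustration: `parse_page_range`, `pages_for_topic`, the topic names, the page numbers, and the shape of the split result (topic name mapped to page numbers) are hypothetical and do not reflect the actual response format.

```python
from __future__ import annotations


def parse_page_range(spec: str) -> list[int]:
    """Expand a fixed range such as "1-5" (or a single page like "3")."""
    start, _, end = spec.partition("-")
    first = int(start)
    last = int(end) if end else first
    if first < 1 or last < first:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(first, last + 1))


def pages_for_topic(split_result: dict[str, list[int]], topic: str) -> list[int]:
    """Look up the pages a (hypothetical) Split result assigned to a topic."""
    return sorted(split_result.get(topic, []))


# Fixed location: the cover memo is always on pages 1-5.
cover_memo_pages = parse_page_range("1-5")

# Semantic location: the same topic lands on different pages per document.
# These mappings are made-up sample data.
split_doc_a = {"financials": [12, 13, 14], "risk factors": [30, 31]}
split_doc_b = {"financials": [8, 9], "risk factors": [21, 22, 23]}

print(cover_memo_pages)                            # [1, 2, 3, 4, 5]
print(pages_for_topic(split_doc_a, "financials"))  # [12, 13, 14]
print(pages_for_topic(split_doc_b, "financials"))  # [8, 9]
```

A hard-coded range would have picked the wrong pages for at least one of the two sample documents, which is the situation where Split is the better choice.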
Inline Config or Saved Preset?
Use inline config while you are experimenting. Use saved presets once a workflow is stable:
- Extract presets store settings like page range, figures, chunking, and spreadsheet options.
- Split presets store topic names and descriptions.
- Schema presets store JSON Schema and prompts.
- Table presets store merge and chart-to-table settings.
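One way to picture the inline-versus-preset choice is as two sources for the same step configuration. The sketch below is a hedged illustration only: `resolve_config`, the preset name `invoice-v1`, and the config keys are hypothetical, and the rule that a step takes exactly one source is an assumption made here rather than documented behavior.

```python
from __future__ import annotations


def resolve_config(
    inline: dict | None,
    preset_name: str | None,
    saved_presets: dict[str, dict],
) -> dict:
    """Return the config for one step from exactly one source (assumed rule)."""
    if (inline is None) == (preset_name is None):
        raise ValueError("pass exactly one of: inline config, preset name")
    if inline is not None:
        return inline
    return saved_presets[preset_name]


# Hypothetical saved Schema preset: a JSON Schema plus a prompt.
saved = {
    "invoice-v1": {
        "schema": {"type": "object", "properties": {"total": {"type": "number"}}},
        "prompt": "Extract invoice totals.",
    }
}

# While experimenting: edit the config inline on every run.
experiment = resolve_config({"schema": {"type": "object"}}, None, saved)

# Once stable: refer to the saved preset by name.
stable = resolve_config(None, "invoice-v1", saved)

print(stable["prompt"])  # Extract invoice totals.
```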
Recommended Path
- Start in the Platform Quickstart with one representative document.
- Use this picker to choose your next step.
- Save presets only after output quality looks right.
- Use Show Code to move into the API.
- Use Chaining Steps to understand how IDs connect each step.
You can always rerun Schema, Split, or Tables from a saved extraction. You usually do not need to upload and extract the document again while iterating on downstream steps.
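The iteration pattern described above can be sketched with an in-memory stand-in. `FakePipeline` and its `extract` and `schema` methods are hypothetical and exist only to show the shape of the loop; they are not the real client, and the returned values are placeholders.

```python
from __future__ import annotations

import itertools


class FakePipeline:
    """In-memory stand-in for a document pipeline; not a real client."""

    def __init__(self) -> None:
        self.extract_calls = 0
        self._ids = itertools.count(1)

    def extract(self, path: str) -> str:
        """Pretend to upload and extract a document; return an extraction ID."""
        self.extract_calls += 1
        return f"extraction-{next(self._ids)}"

    def schema(self, extraction_id: str, fields: list[str]) -> dict:
        """Pretend to run Schema against a saved extraction."""
        return {
            "extraction_id": extraction_id,
            "fields": {name: None for name in fields},
        }


pipeline = FakePipeline()
extraction_id = pipeline.extract("sample.pdf")  # upload and extract once

# Iterate on the downstream step without extracting again.
first_try = pipeline.schema(extraction_id, ["invoice_number"])
second_try = pipeline.schema(extraction_id, ["invoice_number", "total"])

print(pipeline.extract_calls)  # 1
```

Both Schema attempts point at the same saved extraction, so only the downstream step is repeated while the field list changes.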