Overview
The batch endpoints let you run any step of the Pulse pipeline across many documents at once. Each batch call is fully asynchronous: it returns a batch_job_id immediately and orchestrates parallel workers behind the scenes.
Poll GET /job/batch_job_id for real-time progress including per-item completion status and individual child job IDs.
Batch endpoints mirror the individual pipeline steps. Each child call goes through the exact same code path as calling the individual endpoint directly — batch is orchestration, not a separate implementation.
Pipeline
Batch endpoints can be chained together, just like their single-document counterparts. Each step takes the output of a previous step as input, either via a batch_extract_id or batch_split_id that references the parent batch job, or via an explicit list of individual IDs.
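As a rough illustration of chaining, the sketch below builds a downstream request body from the response of an upstream batch call. The helper name chain_from_batch and the sample job ID are hypothetical; only the field names batch_job_id, batch_extract_id, and batch_split_id come from this page. Whether the parent job must have finished before the downstream call is made is not stated here, so the sketch does not assume either way.

```python
def chain_from_batch(upstream_response: dict, id_field: str, **fields) -> dict:
    """Build a downstream request body that references a parent batch job.

    id_field is the reference field the downstream endpoint expects:
    "batch_extract_id" or "batch_split_id".
    """
    if id_field not in ("batch_extract_id", "batch_split_id"):
        raise ValueError(f"unsupported reference field: {id_field}")
    # The parent's batch_job_id becomes the reference ID of the next step.
    body = {id_field: upstream_response["batch_job_id"]}
    body.update(fields)
    return body


# Illustrative values only: a made-up job ID and output location.
extract_response = {"batch_job_id": "job-123", "status": "processing", "total_files": 3}
tables_body = chain_from_batch(
    extract_response,
    "batch_extract_id",
    output={"local_path": "/data/results/"},
)
print(tables_body)
```

The same helper could feed a batch split result into Batch Schema by passing "batch_split_id" instead.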
Workers
Workers process items in parallel. You can control concurrency with the workers parameter on every batch endpoint.
| Parameter | Type | Default | Max | Description |
|---|---|---|---|---|
| workers | integer | 4 | 10 | Number of parallel workers |
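A client may want to normalize the workers value before sending it. The following is a minimal sketch of one way to do that; resolve_workers is a hypothetical helper, the lower bound of 1 is an assumption, and clamping to the maximum is a client-side choice (how the server treats values above 10 is not described here).

```python
DEFAULT_WORKERS = 4   # documented default
MAX_WORKERS = 10      # documented maximum


def resolve_workers(requested=None):
    """Return a worker count within the documented default and maximum."""
    if requested is None:
        return DEFAULT_WORKERS
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise TypeError("workers must be an integer")
    if requested < 1:
        # Assumed lower bound; the page only states the default and the max.
        raise ValueError("workers must be at least 1")
    # Clamp rather than fail, so oversized requests still run at the maximum.
    return min(requested, MAX_WORKERS)


print(resolve_workers(), resolve_workers(6), resolve_workers(25))
```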
Batch Extract
Enumerate files from an input source and extract content from each one.
See Extract for details on extract_options (pages, figure processing, extensions, etc.).
Request: POST /batch/extract
| Field | Type | Required | Description |
|---|---|---|---|
| input | object | Yes | Source of files to process (see Input Sources) |
| output | object | Yes | Where to save extraction results (see Output Destinations) |
| extract_options | object | No | Options forwarded to each /extract call |
| workers | integer | No | Parallel workers (default: 4, max: 10) |
Response (202)
| Field | Type | Description |
|---|---|---|
| batch_job_id | string | Job ID for polling |
| status | string | "processing" |
| total_files | integer | Number of files that will be processed |
Example
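A request for this endpoint might be assembled as sketched below. The host in BASE_URL is a placeholder, authentication is omitted because this page does not describe it, and placing s3_prefix directly inside the input and output objects is an assumption based on the Input Sources and Output Destinations tables. The request is built but not sent, so the sketch runs offline.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real one


def build_batch_extract_request(input_source, output, extract_options=None, workers=4):
    """Build (but do not send) a POST /batch/extract request."""
    body = {"input": input_source, "output": output, "workers": workers}
    if extract_options is not None:  # optional field, omitted unless provided
        body["extract_options"] = extract_options
    return urllib.request.Request(
        BASE_URL + "/batch/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_batch_extract_request(
    input_source={"s3_prefix": "s3://my-bucket/documents/"},
    output={"s3_prefix": "s3://my-bucket/results/"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request with urllib.request.urlopen(req) should yield the 202 response described above, from which batch_job_id can be read for polling.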
Batch Schema
Apply structured data extraction to multiple items. Supports two modes, inferred from the input:
- Single mode: provide extraction_ids or batch_extract_id with schema_config
- Split mode: provide split_ids or batch_split_id with split_schema_config
See Schema for details on schema_config, split_schema_config, and the difference between single and split modes.
Request: POST /batch/schema
| Field | Type | Required | Description |
|---|---|---|---|
| output | object | Yes | Where to save schema results |
| batch_extract_id | string | XOR | ID of a prior batch extract run (single mode) |
| extraction_ids | array | XOR | Explicit list of extraction IDs (single mode) |
| batch_split_id | string | XOR | ID of a prior batch split run (split mode) |
| split_ids | array | XOR | Explicit list of split IDs (split mode) |
| schema_config | object | Conditional | Schema configuration for single mode |
| split_schema_config | object | Conditional | Per-topic schema configurations for split mode |
| workers | integer | No | Parallel workers (default: 4, max: 10) |
Response (202)
| Field | Type | Description |
|---|---|---|
| batch_job_id | string | Job ID for polling |
| status | string | "processing" |
| total_extractions | integer | Number of extractions to process (single mode) |
| total_splits | integer | Number of splits to process (split mode) |
Example — Single Mode
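A single-mode request body could be put together along these lines. The helper name, the job ID, and the empty schema_config are placeholders; the real structure of schema_config is documented under Schema and is not reproduced here.

```python
def single_mode_schema_body(output, schema_config, batch_extract_id=None,
                            extraction_ids=None, workers=4):
    """Assemble a single-mode POST /batch/schema body from the documented fields."""
    # The two ID fields are marked XOR, so exactly one must be supplied.
    if (batch_extract_id is None) == (extraction_ids is None):
        raise ValueError("provide exactly one of batch_extract_id or extraction_ids")
    body = {"output": output, "schema_config": schema_config, "workers": workers}
    if batch_extract_id is not None:
        body["batch_extract_id"] = batch_extract_id
    else:
        body["extraction_ids"] = list(extraction_ids)
    return body


schema_config = {}  # placeholder; see Schema for the actual structure
body = single_mode_schema_body(
    output={"s3_prefix": "s3://my-bucket/results/"},
    schema_config=schema_config,
    batch_extract_id="job-123",  # made-up ID of a prior batch extract run
)
print(body)
```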
Example — Split Mode
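Because the mode is inferred from the input, a client can run the same inference before submitting. The sketch below is one possible pre-flight check, assuming that exactly one of the four XOR-marked ID fields may appear in a request; infer_schema_mode is a hypothetical helper, and the job ID and empty split_schema_config are placeholders.

```python
def infer_schema_mode(body: dict) -> str:
    """Return "single" or "split" based on which ID field the body contains."""
    single = [k for k in ("batch_extract_id", "extraction_ids") if k in body]
    split = [k for k in ("batch_split_id", "split_ids") if k in body]
    # Assumption: the XOR marking spans all four ID fields.
    if len(single) + len(split) != 1:
        raise ValueError("provide exactly one of the four ID fields")
    mode = "single" if single else "split"
    required = "schema_config" if mode == "single" else "split_schema_config"
    if required not in body:
        raise ValueError(f"{mode} mode requires {required}")
    return mode


split_body = {
    "output": {"local_path": "/data/results/"},
    "batch_split_id": "job-456",   # made-up ID of a prior batch split run
    "split_schema_config": {},     # placeholder; see Schema for the structure
}
print(infer_schema_mode(split_body))  # prints "split"
```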
Batch Tables
Extract tables from multiple existing extractions.
See Tables for details on tables_config (merge, table format, etc.).
Request: POST /batch/tables
| Field | Type | Required | Description |
|---|---|---|---|
| output | object | Yes | Where to save table results |
| batch_extract_id | string | XOR | ID of a prior batch extract run |
| extraction_ids | array | XOR | Explicit list of extraction IDs |
| tables_config | object | No | Table extraction configuration |
| workers | integer | No | Parallel workers (default: 4, max: 10) |
Response (202)
| Field | Type | Description |
|---|---|---|
| batch_job_id | string | Job ID for polling |
| status | string | "processing" |
| total_extractions | integer | Number of extractions to process |
Example
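One way to build the request body, here using the explicit-list form, is sketched below. The helper name and the extraction IDs are made up for illustration; tables_config is left out because it is optional and its structure is documented under Tables.

```python
import json


def batch_tables_body(output, extraction_ids=None, batch_extract_id=None,
                      tables_config=None, workers=4):
    """Assemble a POST /batch/tables body from the documented fields."""
    # batch_extract_id and extraction_ids are marked XOR: exactly one is allowed.
    if (batch_extract_id is None) == (extraction_ids is None):
        raise ValueError("provide exactly one of batch_extract_id or extraction_ids")
    body = {"output": output, "workers": workers}
    if batch_extract_id is not None:
        body["batch_extract_id"] = batch_extract_id
    else:
        body["extraction_ids"] = list(extraction_ids)
    if tables_config is not None:  # optional, so omit it unless given
        body["tables_config"] = tables_config
    return body


# Made-up extraction IDs, shown only to illustrate the explicit-list form.
body = batch_tables_body(
    output={"s3_prefix": "s3://my-bucket/results/"},
    extraction_ids=["ext-001", "ext-002"],
)
print(json.dumps(body, indent=2))
```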
Batch Split
Split multiple extractions into topics.
See Split for details on split_config (topic definitions with names and descriptions).
Request: POST /batch/split
| Field | Type | Required | Description |
|---|---|---|---|
| output | object | Yes | Where to save split results |
| split_config | object | Yes | Split configuration with topic definitions |
| batch_extract_id | string | XOR | ID of a prior batch extract run |
| extraction_ids | array | XOR | Explicit list of extraction IDs |
| workers | integer | No | Parallel workers (default: 4, max: 10) |
Response (202)
| Field | Type | Description |
|---|---|---|
| batch_job_id | string | Job ID for polling |
| status | string | "processing" |
| total_extractions | integer | Number of extractions to process |
Example
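The sketch below shows how a batch split body might look. The layout of split_config is a guess: the "topics" key and its name/description items are assumptions derived from the phrase "topic definitions with names and descriptions", so check Split for the actual structure. The helper name, topic names, and job ID are likewise illustrative.

```python
def batch_split_body(output, split_config, batch_extract_id=None,
                     extraction_ids=None, workers=4):
    """Assemble a POST /batch/split body from the documented fields."""
    if not split_config:
        raise ValueError("split_config is required")
    # batch_extract_id and extraction_ids are marked XOR: exactly one is allowed.
    if (batch_extract_id is None) == (extraction_ids is None):
        raise ValueError("provide exactly one of batch_extract_id or extraction_ids")
    body = {"output": output, "split_config": split_config, "workers": workers}
    if batch_extract_id is not None:
        body["batch_extract_id"] = batch_extract_id
    else:
        body["extraction_ids"] = list(extraction_ids)
    return body


# Hypothetical split_config: the "topics" key and item layout are assumed.
example_split_config = {
    "topics": [
        {"name": "risk_factors", "description": "Risks disclosed by the company"},
        {"name": "financials", "description": "Financial statements and notes"},
    ]
}
body = batch_split_body(
    output={"local_path": "/data/results/"},
    split_config=example_split_config,
    batch_extract_id="job-123",  # made-up ID of a prior batch extract run
    workers=8,
)
print(body)
```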
Input and Output
Input Sources
Batch Extract accepts one of the following input sources:
| Source | Field | Example |
|---|---|---|
| S3 prefix | s3_prefix | s3://my-bucket/documents/ |
| Local directory | local_path | /data/documents/ |
| URL list | file_urls | ["https://example.com/doc.pdf"] |
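Since only one source is accepted per request, a client-side check can catch mistakes before submission. This is a minimal sketch; validate_input_source is a hypothetical helper, and treating the three fields as keys of the input object is an assumption based on the table above.

```python
INPUT_SOURCE_FIELDS = ("s3_prefix", "local_path", "file_urls")


def validate_input_source(input_source: dict) -> str:
    """Return the name of the single source field present, or raise ValueError."""
    # Empty strings and empty lists are treated as "not provided".
    present = [field for field in INPUT_SOURCE_FIELDS if input_source.get(field)]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {INPUT_SOURCE_FIELDS}, got {present or 'none'}"
        )
    return present[0]


print(validate_input_source({"file_urls": ["https://example.com/doc.pdf"]}))
```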
Output Destinations
Every batch endpoint writes results to an output destination. You can specify one or both:
| Destination | Field | Example |
|---|---|---|
| S3 prefix | s3_prefix | s3://my-bucket/results/ |
| Local directory | local_path | /data/results/ |
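An output object can be checked in a similar way. In this sketch validate_output is a hypothetical helper; rejecting unknown keys is a client-side choice rather than documented server behavior.

```python
OUTPUT_FIELDS = ("s3_prefix", "local_path")


def validate_output(output: dict) -> list:
    """Return the destination fields present; at least one is required."""
    unknown = sorted(set(output) - set(OUTPUT_FIELDS))
    if unknown:
        raise ValueError(f"unknown output fields: {unknown}")
    present = [field for field in OUTPUT_FIELDS if output.get(field)]
    if not present:
        raise ValueError("specify s3_prefix, local_path, or both")
    return present


both = validate_output({
    "s3_prefix": "s3://my-bucket/results/",
    "local_path": "/data/results/",
})
print(both)
```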
Monitoring Progress
Poll GET /job/batch_job_id to monitor a batch job. The response includes a result object with structured progress. Each child job_id can be polled individually for detailed results.
Polling Example
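A polling loop might look like the sketch below. It assumes the poll response carries a top-level status field that reads "processing" until the job finishes; the terminal value "completed" used in the stand-in responses is an assumption, since only "processing" appears on this page. The HTTP call is injected as fetch_job so the sketch runs offline; in practice it would perform GET /job/batch_job_id and return the parsed JSON.

```python
import time


def wait_for_batch(fetch_job, batch_job_id, interval=2.0, timeout=600.0,
                   sleep=time.sleep):
    """Poll until the job leaves the "processing" state or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(batch_job_id)
        if job.get("status") != "processing":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {batch_job_id} still processing after {timeout}s")
        sleep(interval)


# Stand-in for the HTTP call. "completed" is an assumed terminal status.
_fake_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed"},
])


def fake_fetch(batch_job_id):
    return next(_fake_responses)


final = wait_for_batch(fake_fetch, "job-123", sleep=lambda seconds: None)
print(final["status"])
```

Injecting sleep keeps the loop testable; production code would leave the default time.sleep in place and choose an interval that suits the batch size.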
Cancellation
Cancel a batch job with DELETE /job/batch_job_id. This cascades to all child jobs that are still pending or processing.
Full Pipeline Example
Process a folder of SEC filings: extract all files, apply a schema, extract tables, split by topic, and apply per-topic schemas.
Related Endpoints
- Extract: individual file extraction; config options apply to Batch Extract
- Schema: single/split schema extraction; config options apply to Batch Schema
- Tables: table extraction; config options apply to Batch Tables
- Split: topic splitting; config options apply to Batch Split
- Poll Job: poll batch job progress
- Cancel Job: cancel a batch job and all child jobs
