Document Lifecycle

Pulse has a small set of IDs that make the platform and API feel predictable once you know what each one represents.

Core Objects

Object	Created by	Used for
`job_id`	Any async request	Polling job status and retrieving async results.
`extraction_id`	`/extract` with storage enabled	Reusing parsed content for Schema, Split, Tables, reruns, and saved results.
`split_id`	`/split`	Applying per-topic schemas or table extraction to split page groups.
`schema_id`	`/schema`	Tracking a structured output version and downloading filled Excel templates when used.
`tables_id`	`/tables`	Tracking table extraction output, especially async table jobs.
Config IDs	Saved presets	Reusing Extract, Split, Schema, and Tables settings without inlining JSON.
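
One way to keep these IDs together in application code is a small record per document. The sketch below is illustrative only: `DocumentRun` and `known_ids` are hypothetical names, not part of the Pulse SDK, and the ID values are placeholders rather than real ID formats.

```python
from dataclasses import dataclass, asdict
from typing import Dict, Optional


@dataclass
class DocumentRun:
    """IDs collected while one document moves through Pulse (hypothetical record)."""

    extraction_id: Optional[str] = None  # from /extract with storage enabled
    split_id: Optional[str] = None       # from /split
    schema_id: Optional[str] = None      # from /schema
    tables_id: Optional[str] = None      # from /tables
    job_id: Optional[str] = None         # most recent async job, if any

    def known_ids(self) -> Dict[str, str]:
        """Return only the IDs that have been set so far."""
        return {name: value for name, value in asdict(self).items() if value is not None}


# Placeholder values; real IDs come from the API responses.
run = DocumentRun(extraction_id="ext_123")
run.split_id = "spl_456"
print(run.known_ids())
```

Persisting a record like this alongside your own document key makes it easier to hand the right ID to each later step.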

Storage Defaults

Storage is enabled by default for normal workflows because downstream steps need saved extraction artifacts. If storage is disabled, Pulse can still return the immediate Extract response, but later steps may not be able to reuse that extraction.

Do not disable storage if you plan to call `/schema`, `/split`, `/tables`, run partial reruns, or inspect the result in the Platform.
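
The rule above can be enforced client-side before a request is sent. This is a minimal sketch under stated assumptions: the step labels, `storage_required`, and `check_storage` are hypothetical helpers, and the name of the actual storage option in the Extract request is not given here, so the check takes a plain boolean.

```python
from typing import Iterable

# Follow-up actions that, per the guidance above, rely on a stored extraction.
# The labels themselves are illustrative, not API values.
STEPS_NEEDING_STORAGE = {"schema", "split", "tables", "rerun", "platform"}


def storage_required(planned_steps: Iterable[str]) -> bool:
    """True if any planned follow-up step needs the saved extraction."""
    return any(step in STEPS_NEEDING_STORAGE for step in planned_steps)


def check_storage(storage_enabled: bool, planned_steps: Iterable[str]) -> None:
    """Raise early instead of discovering later that the extraction cannot be reused."""
    steps = list(planned_steps)
    if not storage_enabled and storage_required(steps):
        raise ValueError(f"storage is disabled but these steps need it: {steps}")


check_storage(storage_enabled=True, planned_steps=["schema", "tables"])  # passes
check_storage(storage_enabled=False, planned_steps=[])                   # passes
```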

Async Lifecycle

For longer documents or heavier steps, set `async: true`.

1. The request returns quickly with a `job_id`.
2. Your app polls `GET /job/{jobId}` or waits for a webhook.
3. When `status` is `completed`, the job result contains the same output the sync call would have returned.
4. Large results may include a download URL instead of embedding the entire payload.
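
The polling step can be sketched as a loop with backoff and an overall timeout. This is not the SDK's implementation: `wait_for_job` and `fetch_job` are hypothetical names, `fetch_job` stands in for whatever performs `GET /job/{jobId}` and returns the parsed JSON, and the `failed` and `canceled` statuses are assumptions (only `completed` is confirmed above).

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    max_interval: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reports status 'completed', backing off between attempts."""
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status in ("failed", "canceled"):  # assumed terminal states
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job {job_id} not completed within {timeout} seconds")
        sleep(delay)
        delay = min(delay * 2, max_interval)  # exponential backoff, capped


# Stand-in fetcher so the sketch runs without network access.
_responses = iter([
    {"status": "pending"},
    {"status": "processing"},
    {"status": "completed", "result": "stub"},
])


def fake_fetch(job_id: str) -> Dict[str, Any]:
    return next(_responses)


done = wait_for_job(fake_fetch, "job_123", sleep=lambda seconds: None)
print(done["status"])
```

Passing the fetcher and the sleep function as arguments keeps the loop testable and leaves authentication and HTTP details to your own client code.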

Platform Lifecycle

In the Platform, the same lifecycle appears as a visual pipeline:

1. Upload or select a document.
2. Configure Extract and optional downstream steps.
3. Run the pipeline.
4. Inspect outputs in Markdown, Tables, Split, and Schema views.
5. Save step presets or a full pipeline preset.
6. Use Show Code to reproduce the pipeline from the SDK.

Production Lifecycle

A mature production workflow usually looks like this:

Related

Chaining Steps

Learn the exact ID handoffs between Extract, Split, Schema, Tables, and Jobs.

Async Processing

Decide when to run jobs asynchronously and how to poll safely.
