Overview
After extracting a document, you don’t need to start from scratch to adjust your results. The Pulse Playground supports three types of reruns: a full re-extraction that runs the document again with new processing settings, and schema-only and split reruns that reuse the existing extraction, so you can iterate on schemas and splits without re-uploading or re-extracting the document.
Types of Reruns
1. Full Re-Extraction
Re-processes the entire document from scratch with new extraction settings. This re-uploads the original file and runs a fresh extraction.
When to use:
- You want to change extraction settings (page range, figure extraction, chunking)
- The original extraction had issues you want to fix with different parameters
How it works in the Playground:
Navigate to the extraction detail view for a completed extraction.
The re-extraction panel opens with the current settings pre-populated.
Modify page range, figure options, chunking, or any other extraction parameter.
Click Re-Extract. A new extraction job is created with a new extraction_id. The original extraction is preserved.
Note: a full re-extraction never modifies the original extraction. Once the new job is created, the Playground navigates you to the new extraction result automatically.
API equivalent:
```python
# Re-extract with different settings
result = client.extract(
    file=open("document.pdf", "rb"),
    pages="1-10",         # different page range
    extract_figure=True,  # enable figures this time
    async_=True
)
```
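The Playground pre-populates the re-extraction panel with the current settings so that you only change what needs to change. The same pattern can be sketched for the API call, assuming you keep the previous settings in a plain dict. `rerun_settings` is a hypothetical helper, not part of the Pulse SDK, and the values are illustrative; only the parameter names `pages`, `extract_figure`, and `async_` come from this page.

```python
def rerun_settings(previous: dict, **overrides) -> dict:
    """Return a copy of the previous extraction settings with overrides applied."""
    merged = dict(previous)  # copy so the original settings stay untouched
    merged.update(overrides)
    return merged


# Settings used for the first extraction (illustrative values).
previous = {"pages": "1-5", "extract_figure": False, "async_": True}

# Change only the page range and figure extraction for the rerun.
new_settings = rerun_settings(previous, pages="1-10", extract_figure=True)
print(new_settings)
```

The merged dict could then be unpacked into the call as keyword arguments, for example `client.extract(file=f, **new_settings)`.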
2. Schema-Only Rerun
Applies a new or modified schema to an existing extraction without re-processing the document. This is the most common type of rerun, and it is fast and cost-effective because extraction is not repeated.
When to use:
- You want to try a different schema on the same document
- You need to fix or refine extracted fields
- You want to extract additional fields from an already-processed document
How it works in the Playground:
Navigate to the extraction detail view and go to the Schema tab.
Modify the JSON Schema in the schema editor, or update the schema prompt.
The schema is re-applied to the existing extraction data, and the Schema tab updates in place with the new results.
Schema reruns are fast because they operate on the already-extracted content stored in Pulse, so no document re-processing is needed. You can iterate on schemas as many times as you want.
API equivalent:
```python
# Apply a new schema to an existing extraction
schema_result = client.schema.extract_schema(
    extraction_id="existing-extraction-id",
    schema_config={
        "schema": {
            "type": "object",
            "properties": {
                "total_amount": {"type": "number"},
                "due_date": {"type": "string", "format": "date"}
            }
        },
        "schema_prompt": "Extract payment details"
    }
)
print(schema_result.schema_output)
```
Each schema rerun creates a new schema version, letting you compare results across different schema configurations.
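When a rerun only adds fields to a schema that already works, it can help to derive the next `schema_config` from the previous one instead of editing it by hand. The following is a minimal sketch under that assumption. `with_extra_fields` is a hypothetical helper, not part of the Pulse SDK, and `vendor_name` is an illustrative field name.

```python
import copy


def with_extra_fields(schema_config: dict, extra_properties: dict) -> dict:
    """Return a new schema_config with extra properties added to the schema."""
    updated = copy.deepcopy(schema_config)  # leave the previous config intact
    updated["schema"].setdefault("properties", {}).update(extra_properties)
    return updated


base_config = {
    "schema": {
        "type": "object",
        "properties": {
            "total_amount": {"type": "number"},
            "due_date": {"type": "string", "format": "date"},
        },
    },
    "schema_prompt": "Extract payment details",
}

# Add one more field for the next schema rerun (field name is illustrative).
next_config = with_extra_fields(base_config, {"vendor_name": {"type": "string"}})
print(sorted(next_config["schema"]["properties"]))
```

Because the helper deep-copies its input, `base_config` stays available unchanged if you want to resubmit the earlier schema later.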
3. Split Rerun
Re-runs the document split with different topics on an existing extraction. Useful when you want to reorganize how the document is sectioned.
When to use:
- You want to try different topic definitions
- You want to add or remove topics
- The initial split didn’t assign pages correctly and you want to adjust topic descriptions
How it works in the Playground:
Navigate to the extraction detail view and go to the Split tab.
Add, remove, or edit topic names and descriptions.
The split is re-applied to the existing extraction. Page assignments update based on the new topic definitions.
API equivalent:
```python
# Re-split with different topics
split_result = client.split.document(
    extraction_id="existing-extraction-id",
    split_config={
        "topics": [
            {"name": "Executive Summary", "description": "High-level overview"},
            {"name": "Financial Details", "description": "Revenue, costs, margins"},
            {"name": "Risk Factors", "description": "Identified risks and mitigations"}
        ]
    }
)
print(split_result.split_output)
```
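Because split quality depends on the topic names and descriptions, a small local check before each rerun can catch obvious mistakes. The sketch below builds a `split_config` from a name-to-description mapping. `build_split_config` is a hypothetical helper, and the rules it enforces (non-empty descriptions, no duplicate names) are a local convention assumed here, not documented Pulse requirements.

```python
def build_split_config(topics: dict) -> dict:
    """Build a split_config from {topic name: description}, rejecting bad entries."""
    seen = set()
    entries = []
    for name, description in topics.items():
        clean_name = name.strip()
        if not clean_name or not description.strip():
            raise ValueError(f"Topic {name!r} needs a name and a description")
        if clean_name.lower() in seen:
            raise ValueError(f"Duplicate topic name: {clean_name!r}")
        seen.add(clean_name.lower())
        entries.append({"name": clean_name, "description": description.strip()})
    return {"topics": entries}


split_config = build_split_config({
    "Executive Summary": "High-level overview",
    "Financial Details": "Revenue, costs, margins",
})
print(split_config)
```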
Rerun Workflow Summary
| Rerun Type | Re-processes document? | Creates new extraction? | Speed |
|---|---|---|---|
| Full Re-Extraction | ✅ Yes | ✅ Yes | Slow (full pipeline) |
| Schema-Only | ❌ No | ❌ No | Fast (seconds) |
| Split Rerun | ❌ No | ❌ No | Fast (seconds) |
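The table can also be read as a lookup from what you changed to which rerun you need. A minimal sketch of that lookup follows. The change labels (`extraction_settings`, `schema`, `topics`) and the `plan_reruns` helper are hypothetical names introduced for illustration only.

```python
# Maps the kind of change to the rerun type named in the table above.
RERUN_FOR_CHANGE = {
    "extraction_settings": "Full Re-Extraction",
    "schema": "Schema-Only",
    "topics": "Split Rerun",
}

# Whether each rerun type re-processes the document, per the table.
REPROCESSES_DOCUMENT = {
    "Full Re-Extraction": True,
    "Schema-Only": False,
    "Split Rerun": False,
}


def plan_reruns(changes: set) -> list:
    """Return the rerun types needed for a set of change labels."""
    unknown = set(changes) - set(RERUN_FOR_CHANGE)
    if unknown:
        raise ValueError(f"Unknown change labels: {sorted(unknown)}")
    return [rerun for change, rerun in RERUN_FOR_CHANGE.items() if change in changes]


for rerun in plan_reruns({"schema", "topics"}):
    print(rerun, "re-processes document:", REPROCESSES_DOCUMENT[rerun])
```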
Schema Versioning
Each schema rerun creates a versioned result. In the Playground, you can:
- Switch between versions using the version selector in the Schema tab
- Compare results across different schema configurations
- Roll back to a previous version if a new schema didn’t produce better results
This makes it easy to iteratively refine your schema without losing previous work.
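Outside the Playground, one way to compare two schema versions is to diff their outputs field by field. The sketch below assumes each version's output is available as a flat dict. That shape, the `diff_outputs` helper, and the sample values are assumptions for illustration, not a documented Pulse response format.

```python
def diff_outputs(old: dict, new: dict) -> dict:
    """Return {field: (old value, new value)} for every field that differs.

    A field missing from one side is reported as None on that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# Illustrative outputs from two schema versions of the same extraction.
v1_output = {"total_amount": 1200.0, "due_date": None}
v2_output = {"total_amount": 1200.0, "due_date": "2024-03-01", "vendor_name": "Acme"}

for field, (before, after) in diff_outputs(v1_output, v2_output).items():
    print(f"{field}: {before!r} -> {after!r}")
```

An empty result means the two versions produced the same values, which suggests the schema change had no effect on this document.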