Cookbooks show how to turn Pulse primitives into complete workflows. Each recipe starts with a real document pattern, explains the right pipeline, and points to the Platform and API surfaces you need to productionize it. For self-serve builders, start with the sample documents and copy the workflow. For regulated teams, use the production notes to add review gates, citations, retries, storage controls, and audit logs before you process customer data.

Start With A Real Example

Sample Documents

Use the same hosted examples shown in the Pulse Platform, with built-in Extract, Schema, Tables, and Split outputs.

Platform Quickstart

Run a workflow visually, inspect the output, then export matching code.

Recipe Index

| Recipe | Use when |
| --- | --- |
| Bank Statement To JSON | You need normalized JSON from a single document family. |
| Annual Report Split -> Schema | Long documents need section-specific schemas. |
| Financial Tables | Row/column structure matters more than named fields. |
| Spreadsheet Financial Review | Excel workbooks need hidden-content controls and raw values. |
| Chunking For RAG | You need a retrieval-friendly chunking strategy. |
| LangChain RAG Ingestion | You want to index Pulse chunks in a vector store. |
| Vector Metadata Contract | You need stable metadata for embeddings and audit trails. |
| Agent Diligence Review | An AI agent should use Pulse tools to review documents. |
| Footnote Citation Review | Footnotes affect the meaning of extracted text. |
| Word-Level Review Overlays | A UI needs exact word coordinates on the PDF. |
| S3 Storage Pipeline | Documents and results should stay in cloud storage. |
| Batch Document Intake | You need to process many files with retry-safe tracking. |
| Production Webhooks | Long-running jobs should wake your backend on completion. |

Extraction Recipes

Bank Statement To JSON

Extract account metadata, summary fields, transactions, and checks with Extract -> Schema.

Annual Report Split -> Schema

Split a long report into topics before applying narrow schemas.

Financial Tables

Reconstruct table-heavy PDFs with the Tables step.

Spreadsheet Financial Review

Parse Excel workbooks with raw values, hidden-content controls, and trimming.

Footnote Citation Review

Link footnote markers to the body text they qualify.

Word-Level Review Overlays

Return exact word coordinates for source-grounded review UIs.

Retrieval And Agent Recipes

Chunking For RAG

Choose among semantic, header, page, and recursive chunking strategies for retrieval.

LangChain RAG Ingestion

Convert Pulse chunks into LangChain documents and a vector index.

Vector Metadata Contract

Attach stable metadata to every embedded chunk.

Agent Diligence Review

Let an MCP agent run Extract, Split, Schema, and Tables on documents.
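As a rough illustration of what "stable metadata" for embedded chunks can mean, the sketch below derives a deterministic chunk ID and a content hash, so re-ingesting the same document does not create duplicate vectors and every chunk can be traced back to its source. All field names and helpers here (`ChunkMetadata`, `build_metadata`, `chunk_id`) are hypothetical and are not the Pulse metadata schema; see the Vector Metadata Contract recipe for the actual contract.

```python
import hashlib
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ChunkMetadata:
    """Hypothetical metadata attached to one embedded chunk."""
    document_id: str        # assumed: a stable ID you assign per source file
    chunk_index: int        # position of the chunk within the document
    page: int               # page the chunk starts on
    chunking_strategy: str  # e.g. "semantic", "header", "page", "recursive"
    content_sha256: str     # hash of the chunk text, for change detection


def build_metadata(document_id: str, chunk_index: int, page: int,
                   strategy: str, text: str) -> ChunkMetadata:
    # Hash the text so an audit can verify the chunk has not changed.
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return ChunkMetadata(document_id, chunk_index, page, strategy, digest)


def chunk_id(meta: ChunkMetadata) -> str:
    # Deterministic ID: the same document, strategy, and position always
    # map to the same vector-store key, so upserts overwrite, not duplicate.
    return f"{meta.document_id}:{meta.chunking_strategy}:{meta.chunk_index}"


meta = build_metadata("stmt-2024-01", 0, 1, "semantic", "Opening balance ...")
record = {"id": chunk_id(meta), "metadata": asdict(meta)}
```

Keeping the ID independent of the text hash is a deliberate choice in this sketch: an edited chunk keeps its key while the changed `content_sha256` signals that it needs re-embedding.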

Production And Storage Recipes

S3 Storage Pipeline

Process documents from S3 and write results back to cloud storage.

Batch Document Intake

Process many files with retry-safe status tracking.

Production Webhooks

Move long-running jobs into an async, event-driven backend.
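The "retry-safe status tracking" idea behind batch intake can be sketched without any Pulse-specific calls. The example below keys each file by a content hash, skips files that already succeeded, and caps attempts for files that keep failing. `IntakeTracker`, `idempotency_key`, and the `process` callback are hypothetical names for illustration; in a real pipeline `process` would wrap your Pulse request, and the records would live in a database rather than in memory.

```python
import hashlib
from enum import Enum
from typing import Callable, Dict


class Status(str, Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


def idempotency_key(file_bytes: bytes) -> str:
    # The same bytes always yield the same key, so a re-run of the batch
    # recognizes files it has already handled.
    return hashlib.sha256(file_bytes).hexdigest()


class IntakeTracker:
    """In-memory status table; a production version would persist this."""

    def __init__(self, max_attempts: int = 3) -> None:
        self.max_attempts = max_attempts
        self.records: Dict[str, dict] = {}

    def should_process(self, key: str) -> bool:
        rec = self.records.get(key)
        if rec is None:
            return True
        if rec["status"] is Status.DONE:
            return False
        return rec["attempts"] < self.max_attempts

    def run(self, file_bytes: bytes,
            process: Callable[[bytes], object]) -> Status:
        key = idempotency_key(file_bytes)
        if not self.should_process(key):
            return self.records[key]["status"]
        rec = self.records.setdefault(
            key, {"status": Status.PENDING, "attempts": 0})
        rec["attempts"] += 1
        try:
            process(file_bytes)
        except Exception:
            rec["status"] = Status.FAILED  # visible failure, eligible for retry
        else:
            rec["status"] = Status.DONE
        return rec["status"]
```

The same key can double as a deduplication check in a webhook handler, so that a completion event delivered twice is applied only once.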

Enterprise Patterns

| Pattern | Why it matters | Start here |
| --- | --- | --- |
| Human review with citations | Regulated teams need traceability before data enters a system of record. | Schema extraction |
| Saved configs and change control | Teams need repeatable settings instead of one-off prompts in code. | Step Preset Library |
| Async jobs and webhooks | Large files should not depend on tight polling loops or browser sessions. | Production Webhooks |
| Storage boundaries | Enterprise workflows often require customer-controlled storage and retention. | Security & Compliance |
| Recovery paths | Production integrations need retries, idempotency, and failure visibility. | Error Handling |

Pick The Right Step

| Need | Use |
| --- | --- |
| Markdown, citations, figures, chunks, or the first reusable document representation | Extract |
| Known JSON shape such as account metadata, rent roll fields, or invoice totals | Schema |
| Row and column fidelity for schedules, statements, or financial tables | Tables |
| Different sections need different handling or downstream schemas | Split |
See Schema, Tables, Or Split for the full decision guide.
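If you route documents programmatically, the mapping above can be written as a small lookup. This is only a sketch: the step names come from this page, but the need labels and the `pick_step` helper are hypothetical and not part of any Pulse SDK.

```python
# Hypothetical need labels mapped to the Pulse step names listed above.
STEP_FOR_NEED = {
    "document_representation": "Extract",   # Markdown, citations, figures, chunks
    "known_json_shape": "Schema",           # e.g. account metadata, invoice totals
    "row_column_fidelity": "Tables",        # schedules, statements, financial tables
    "section_specific_handling": "Split",   # different sections, different schemas
}


def pick_step(need: str) -> str:
    """Return the step name for a need label, or raise on unknown labels."""
    try:
        return STEP_FOR_NEED[need]
    except KeyError:
        known = ", ".join(sorted(STEP_FOR_NEED))
        raise ValueError(f"Unknown need {need!r}; expected one of: {known}")


step = pick_step("known_json_shape")
```

Failing loudly on an unknown label, rather than defaulting to one step, keeps misrouted documents from being processed silently with the wrong pipeline.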