# Cookbooks

> End-to-end Pulse recipes for common document automation workflows.

Cookbooks show how to turn Pulse primitives into complete workflows. Each recipe starts with a real document pattern, explains the pipeline that fits it, and points to the Platform and API surfaces you need to take it to production.

For self-serve builders, start with the sample documents and copy the workflow. For regulated teams, use the production notes to add review gates, citations, retries, storage controls, and audit logs before you process customer data.

## Start With A Real Example

<CardGroup cols={2}>
  <Card title="Sample Documents" icon="file-pdf" href="/cookbooks/sample-documents">
    Use the same hosted examples shown in the Pulse Platform, with built-in Extract, Schema, Tables, and Split outputs.
  </Card>

  <Card title="Platform Quickstart" icon="browser" href="/platform-quickstart">
    Run a workflow visually, inspect the output, then export matching code.
  </Card>
</CardGroup>

## Recipe Index

| Recipe                                                                  | Use when                                                     |
| ----------------------------------------------------------------------- | ------------------------------------------------------------ |
| [Bank Statement To JSON](/cookbooks/invoice-to-json)                    | You need normalized JSON from a single document family.      |
| [Annual Report Split -> Schema](/cookbooks/annual-report-split-schema)  | Long documents need section-specific schemas.                |
| [Financial Tables](/cookbooks/financial-tables-excel)                   | Row/column structure matters more than named fields.         |
| [Spreadsheet Financial Review](/cookbooks/spreadsheet-financial-review) | Excel workbooks need hidden-content controls and raw values. |
| [Chunking For RAG](/cookbooks/chunking-strategy-rag)                    | You need a retrieval-friendly chunking strategy.             |
| [LangChain RAG Ingestion](/cookbooks/rag-langchain-vector-store)        | You want to index Pulse chunks in a vector store.            |
| [Vector Metadata Contract](/cookbooks/vector-metadata-contract)         | You need stable metadata for embeddings and audit trails.    |
| [Agent Diligence Review](/cookbooks/mcp-agent-diligence)                | An AI agent should use Pulse tools to review documents.      |
| [Footnote Citation Review](/cookbooks/footnotes-citation-review)        | Footnotes affect the meaning of extracted text.              |
| [Word-Level Review Overlays](/cookbooks/word-bbox-review)               | A UI needs exact word coordinates on the PDF.                |
| [S3 Storage Pipeline](/cookbooks/byos-s3-ingestion)                     | Documents and results should stay in cloud storage.          |
| [Batch Document Intake](/cookbooks/batch-document-intake)               | You need to process many files with retry-safe tracking.     |
| [Production Webhooks](/cookbooks/webhooks-production)                   | Long-running jobs should wake your backend on completion.    |

## Extraction Recipes

<CardGroup cols={2}>
  <Card title="Bank Statement To JSON" icon="building-columns" href="/cookbooks/invoice-to-json">
    Extract account metadata, summary fields, transactions, and checks with Extract -> Schema.
  </Card>

  <Card title="Annual Report Split -> Schema" icon="diagram-project" href="/cookbooks/annual-report-split-schema">
    Split a long report into topics before applying narrow schemas.
  </Card>

  <Card title="Financial Tables" icon="table" href="/cookbooks/financial-tables-excel">
    Reconstruct table-heavy PDFs with the Tables step.
  </Card>

  <Card title="Spreadsheet Financial Review" icon="file-excel" href="/cookbooks/spreadsheet-financial-review">
    Parse Excel workbooks with raw values, hidden-content controls, and trimming.
  </Card>

  <Card title="Footnote Citation Review" icon="quote-right" href="/cookbooks/footnotes-citation-review">
    Link footnote markers to the body text they qualify.
  </Card>

  <Card title="Word-Level Review Overlays" icon="draw-polygon" href="/cookbooks/word-bbox-review">
    Return exact word coordinates for source-grounded review UIs.
  </Card>
</CardGroup>

## Retrieval And Agent Recipes

<CardGroup cols={2}>
  <Card title="Chunking For RAG" icon="scissors" href="/cookbooks/chunking-strategy-rag">
    Choose among semantic, header, page, and recursive chunking for retrieval.
  </Card>

  <Card title="LangChain RAG Ingestion" icon="diagram-project" href="/cookbooks/rag-langchain-vector-store">
    Convert Pulse chunks into LangChain documents and a vector index.
  </Card>

  <Card title="Vector Metadata Contract" icon="database" href="/cookbooks/vector-metadata-contract">
    Attach stable metadata to every embedded chunk.
  </Card>

  <Card title="Agent Diligence Review" icon="robot" href="/cookbooks/mcp-agent-diligence">
    Let an MCP agent run Extract, Split, Schema, and Tables on documents.
  </Card>
</CardGroup>

## Production And Storage Recipes

<CardGroup cols={2}>
  <Card title="S3 Storage Pipeline" icon="aws" href="/cookbooks/byos-s3-ingestion">
    Process documents from S3 and write results back to cloud storage.
  </Card>

  <Card title="Batch Document Intake" icon="layer-group" href="/cookbooks/batch-document-intake">
    Process many files with retry-safe status tracking.
  </Card>

  <Card title="Production Webhooks" icon="webhook" href="/cookbooks/webhooks-production">
    Move long-running jobs into an async, event-driven backend.
  </Card>
</CardGroup>

## Enterprise Patterns

| Pattern                          | Why it matters                                                                | Start here                                                     |
| -------------------------------- | ----------------------------------------------------------------------------- | -------------------------------------------------------------- |
| Human review with citations      | Regulated teams need traceability before data enters a system of record.      | [Schema extraction](/platform-reference/extract-schema)        |
| Saved configs and change control | Teams need repeatable settings instead of one-off prompts in code.            | [Step Preset Library](/platform-reference/step-preset-library) |
| Async jobs and webhooks          | Large files should not depend on tight polling loops or browser sessions.     | [Production Webhooks](/cookbooks/webhooks-production)          |
| Storage boundaries               | Enterprise workflows often require customer-controlled storage and retention. | [Security & Compliance](/security/overview)                    |
| Recovery paths                   | Production integrations need retries, idempotency, and failure visibility.    | [Error Handling](/advanced/error-handling)                     |
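As a rough illustration of the recovery-path row, here is a minimal sketch of retries with exponential backoff, one idempotency key reused across attempts, and a log line per failure. Everything in it is hypothetical: `TransientError`, `run_with_retries`, and the `operation` callable are stand-ins, not Pulse SDK names, and whether the Pulse API accepts an idempotency key is an assumption to check against [Error Handling](/advanced/error-handling).

```python
import time
import uuid
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx, rate limits)."""


def run_with_retries(
    operation: Callable[[str], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(idempotency_key) until it succeeds or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # One key for all attempts, so a server that honors idempotency keys
    # (an assumption here) can deduplicate a retried submission.
    idempotency_key = str(uuid.uuid4())

    for attempt in range(1, max_attempts + 1):
        try:
            return operation(idempotency_key)
        except TransientError as exc:
            # Failure visibility: record every failed attempt.
            print(f"attempt {attempt}/{max_attempts} failed: {exc}")
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))

    raise RuntimeError("unreachable")
```

In practice `operation` would wrap whatever call submits a document, and only errors you know to be transient should be mapped to `TransientError`. Validation failures should surface immediately rather than be retried.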

## Pick The Right Step

| Need                                                                                | Use     |
| ----------------------------------------------------------------------------------- | ------- |
| Markdown, citations, figures, chunks, or the first reusable document representation | Extract |
| Known JSON shape such as account metadata, rent roll fields, or invoice totals      | Schema  |
| Row and column fidelity for schedules, statements, or financial tables              | Tables  |
| Different sections need different handling or downstream schemas                    | Split   |

See [Schema, Tables, Or Split](/concepts/schema-tables-split) for the full decision guide.
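The table above can be read as a small decision function. The sketch below is only an illustration of that reading: `Step` and `pick_steps` are hypothetical names, not part of any Pulse SDK, and the ordering (Extract first, Split before Schema) is an assumption drawn from the recipe titles on this page rather than a documented rule.

```python
from enum import Enum
from typing import List


class Step(Enum):
    EXTRACT = "Extract"
    SCHEMA = "Schema"
    TABLES = "Tables"
    SPLIT = "Split"


def pick_steps(
    *,
    known_json_shape: bool = False,
    needs_table_fidelity: bool = False,
    sections_differ: bool = False,
) -> List[Step]:
    """Map the needs in the table to an ordered list of steps (illustrative only)."""
    # Extract yields the first reusable document representation.
    steps = [Step.EXTRACT]
    # Split when different sections need different handling or schemas.
    if sections_differ:
        steps.append(Step.SPLIT)
    # Tables when row and column fidelity matters.
    if needs_table_fidelity:
        steps.append(Step.TABLES)
    # Schema when the target JSON shape is known up front.
    if known_json_shape:
        steps.append(Step.SCHEMA)
    return steps


# Example: a long report whose sections each need their own schema.
plan = pick_steps(sections_differ=True, known_json_shape=True)
print(" -> ".join(step.value for step in plan))
```

Real workflows may combine steps differently, so treat the linked decision guide as the authority.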
