Use chunking when the extraction output will feed retrieval, embeddings, agent context, search indexing, or review queues. Pick the chunking strategy based on how humans naturally use the document.

Platform sample: Attention Is All You Need

[Figure: Chunking configuration in the Platform]

Strategy  | Best for
semantic  | Narrative documents where topics matter more than page boundaries.
header    | Reports, policies, filings, and manuals with useful headings.
page      | Legal, regulatory, and audit workflows where page provenance matters.
recursive | Strict size windows for embedding models or downstream systems.

{
  "file_url": "https://platform.runpulse.com/api/examples/3be15d23-d622-4f27-9843-ec2929140eec/pdf",
  "extensions": {
    "chunking": {
      "chunk_types": ["semantic", "page"],
      "chunk_size": 1200
    }
  }
}
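
The same request body can be assembled in code. The following is a minimal sketch, assuming the four strategy names in the table above are the only accepted `chunk_types` values. `build_chunking_request` is a hypothetical helper, not part of any SDK, and it only builds the JSON payload without sending it anywhere.

```python
import json

# Strategy names taken from the table above; treating them as the complete
# set of accepted values is an assumption.
KNOWN_CHUNK_TYPES = {"semantic", "header", "page", "recursive"}


def build_chunking_request(file_url, chunk_types, chunk_size=None):
    """Build a request body with the chunking extension enabled (hypothetical helper)."""
    unknown = [t for t in chunk_types if t not in KNOWN_CHUNK_TYPES]
    if unknown:
        raise ValueError(f"unknown chunk types: {unknown}")
    chunking = {"chunk_types": list(chunk_types)}
    # Only include chunk_size when the caller sets it explicitly.
    if chunk_size is not None:
        chunking["chunk_size"] = chunk_size
    return {"file_url": file_url, "extensions": {"chunking": chunking}}


payload = build_chunking_request(
    "https://platform.runpulse.com/api/examples/3be15d23-d622-4f27-9843-ec2929140eec/pdf",
    ["semantic", "page"],
    chunk_size=1200,
)
print(json.dumps(payload, indent=2))
```

Validating the strategy names locally is a design choice that surfaces typos before a request is made; drop the check if the service accepts values beyond those listed here.
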
Chunk results are returned under extensions.chunking. Keep page chunking on when auditors or users must be able to reconcile an answer back to exact page boundaries.
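
Reading the chunks back might look like the sketch below. The text above only states that results live under `extensions.chunking`; everything beneath that key is an assumption here. The sketch supposes one list per requested strategy, with each chunk carrying hypothetical `text` and `page` fields, so check a real response before relying on these names.

```python
def chunks_for(response, chunk_type):
    """Return the chunk list for one strategy, or [] if it is absent.

    Assumes (not confirmed by the source) that extensions.chunking maps
    each strategy name to a list of chunk objects.
    """
    chunking = response.get("extensions", {}).get("chunking", {})
    return chunking.get(chunk_type, [])


# Hypothetical response, invented purely to exercise the helper.
sample_response = {
    "extensions": {
        "chunking": {
            "semantic": [{"text": "Intro to the topic."}],
            "page": [
                {"text": "First page text.", "page": 1},
                {"text": "Second page text.", "page": 2},
            ],
        }
    }
}

# Page chunks keep provenance, so an answer can be traced to its page.
page_numbers = [c["page"] for c in chunks_for(sample_response, "page")]
print(page_numbers)
```
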
