Use this page when an AI agent should read documents, choose the right Pulse tools, and return grounded output with minimal custom code.

Fast Path

1. Create an API key

Open the Pulse Platform, create an API key, and keep it out of source control.

2. Connect an MCP client

Use the hosted server for URL-based documents. Use the local server when the agent must read files from disk.

3. Give the agent a document

Hosted MCP needs a public or pre-signed file_url. Local MCP can use either file_url or an absolute file_path.

4. Run extract first

Most workflows start with extract. Use the returned extraction_id for schema, split, and table tools.

5. Poll when needed

If any tool returns { "status": "processing", "job_id": "..." }, call get_job until the job completes.
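The polling step can be sketched as a small loop. This is a minimal illustration, not the Pulse client API: `call_tool` is a hypothetical callable standing in for whatever your MCP client uses to invoke a tool, the `job_id` argument name for get_job is assumed from the response shape above, and any status other than "processing" is treated as finished.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def wait_for_result(
    call_tool: ToolCaller,
    result: Dict[str, Any],
    interval_s: float = 2.0,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Return `result` as-is, or poll get_job while it reports 'processing'."""
    job_id = result.get("job_id")  # remembered in case later responses omit it
    polls = 0
    while result.get("status") == "processing":
        if polls >= max_polls:
            raise TimeoutError(f"job {job_id} still processing after {max_polls} polls")
        time.sleep(interval_s)
        # Assumption: get_job accepts the job_id returned by the original tool call.
        result = call_tool("get_job", {"job_id": job_id})
        polls += 1
    return result
```

A result that never reported "processing" passes through unchanged, so the helper can wrap every tool call without special cases.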

Client Configs

~/.codex/config.toml
[mcp_servers.pulse]
url = "https://mcp.runpulse.com/mcp"
http_headers = { "x-api-key" = "YOUR_PULSE_API_KEY" }

See Connecting a client for Claude Desktop, VS Code, and more setup options.
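If a client is configured in code rather than in a config file, one way to keep the key out of source control is to read it from the environment. The sketch below is illustrative only: the `PULSE_API_KEY` variable name and the `pulse_headers` helper are assumptions, while the `x-api-key` header name comes from the config above.

```python
import os
from typing import Dict, Mapping


def pulse_headers(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the x-api-key header from an environment variable.

    PULSE_API_KEY is an assumed variable name, not one defined by Pulse.
    """
    key = env.get("PULSE_API_KEY", "").strip()
    # Reject a missing key and the literal placeholder from the sample config.
    if not key or key == "YOUR_PULSE_API_KEY":
        raise RuntimeError("Set PULSE_API_KEY in the environment; do not hard-code it.")
    return {"x-api-key": key}
```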

First Agent Prompt

Give the agent the document location, the output you want, and any evidence requirements:

Use Pulse to extract this document:
https://platform.runpulse.com/api/examples/637e5678-30b1-45fa-acc4-877f2d636419/pdf

Return account holder, statement period, ending balance, and all transactions as JSON.
Include citations or source references for each field when available. If the extraction
is still processing, poll with get_job until it is complete.

For long or mixed documents, be explicit about the plan:

Use Pulse on this filing. First extract it with page chunking. Then split it into
sections for procedural history, allegations, arguments, and requested relief.
Apply a separate schema to each split section and return the result as JSON with page
references.

Tool Paths

| Agent goal | Tool path | Notes |
| --- | --- | --- |
| Parse a document into text | extract | Start here. Returns markdown and extraction_id. |
| Get structured JSON | extract -> generate_schema -> apply_schema | Use generate_schema when the agent needs help designing fields. |
| Run a known schema | extract -> apply_schema | Pass the schema directly or use a saved schema_config_id. |
| Route a long packet | extract -> split_document -> apply_schema | Use when different pages need different schemas. |
| Pull tables | extract -> extract_tables | Turn on table merge for cross-page tables. |
| Process many URLs | batch_extract -> get_job | Use for asynchronous batches. |
| Run a saved workflow | run_pipeline -> get_job | Best when product teams already maintain a Platform pipeline. |
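The "Run a known schema" path can be sketched as two chained tool calls that reuse the extraction_id. This is a hedged illustration: `call_tool` is a hypothetical stand-in for your MCP client's tool invocation, and the `schema` argument name for apply_schema is an assumption; check the MCP Tools reference for the real parameter names. Polling is left out and a still-processing result is rejected.

```python
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def extract_then_apply(
    call_tool: ToolCaller, file_url: str, schema: Dict[str, Any]
) -> Dict[str, Any]:
    """Run extract once, then apply a known schema to the same extraction."""
    extraction = call_tool("extract", {"file_url": file_url})
    if extraction.get("status") == "processing":
        # The job must finish (via get_job) before its result is usable.
        raise RuntimeError("extract is still processing; poll get_job first")
    # Reuse the extraction_id instead of extracting the document again.
    extraction_id = extraction["extraction_id"]
    # Assumption: apply_schema takes the schema under a "schema" argument.
    return call_tool("apply_schema", {"extraction_id": extraction_id, "schema": schema})
```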

Extraction Settings Agents Should Know

Agents can pass the same high-impact extraction settings you use in the Platform:

| Setting | Use when |
| --- | --- |
| pages | The user only needs specific pages. |
| footnote_references | Footnote markers and body text need to be linked. |
| figure_descriptions | Charts, images, or diagrams need text descriptions. |
| show_images | The app needs image URLs for extracted visuals. |
| chunk_types and chunk_size | Output will feed retrieval, memory, or embeddings. |
| only_data_rows and only_data_cols | Excel files have large empty trailing ranges. |

For the broader Platform/API view of these settings, see Processing Parameters.
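As a rough sketch, an agent harness could assemble extract arguments from these settings and reject misspelled names before the call goes out. The `build_extract_args` helper is hypothetical, and the value formats shown in the usage note are assumptions; only the setting names come from the table above.

```python
from typing import Any, Dict

# Setting names taken from the table above; accepted value types are not
# specified here, so values are passed through untouched.
KNOWN_SETTINGS = {
    "pages", "footnote_references", "figure_descriptions", "show_images",
    "chunk_types", "chunk_size", "only_data_rows", "only_data_cols",
}


def build_extract_args(file_url: str, **settings: Any) -> Dict[str, Any]:
    """Merge optional extraction settings into the extract arguments."""
    unknown = set(settings) - KNOWN_SETTINGS
    if unknown:
        raise ValueError(f"unknown extraction settings: {sorted(unknown)}")
    args: Dict[str, Any] = {"file_url": file_url}
    # Skip settings left as None so server-side defaults apply.
    args.update({k: v for k, v in settings.items() if v is not None})
    return args
```

For example, `build_extract_args(url, pages="1-3", figure_descriptions=True)` yields a dict with those two settings plus file_url; whether pages accepts a range string like this is an assumption to verify against Processing Parameters.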

Agent Guardrails

  • Hosted MCP cannot read local disk paths. Use a public or pre-signed URL.
  • Local MCP can read an absolute file_path, but chat attachments are not automatically file paths.
  • Always reuse extraction_id instead of re-extracting the same document.
  • When a tool returns a processing status, poll get_job until the job completes before passing its result into the next tool.
  • Ask for a schema before writing JSON to a database or downstream system.
  • Keep API keys out of prompts, logs, screenshots, and committed config.
  • For regulated workflows, return page references, footnote links, chunks, or bounding boxes whenever users need evidence.
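The first two guardrails can be checked before any tool call is made. The helper below is a minimal sketch under stated assumptions: `document_argument` and its `server` parameter ("hosted" or "local") are hypothetical names, and only http and https locations are treated as URLs.

```python
import os
from typing import Dict
from urllib.parse import urlparse


def document_argument(location: str, server: str) -> Dict[str, str]:
    """Map a document location to a file_url or file_path argument.

    `server` is "hosted" or "local" (a hypothetical flag for this sketch).
    """
    scheme = urlparse(location).scheme.lower()
    if scheme in ("http", "https"):
        # Both hosted and local MCP accept a URL.
        return {"file_url": location}
    if server == "hosted":
        raise ValueError("Hosted MCP cannot read local paths; use a public or pre-signed URL.")
    if not os.path.isabs(location):
        raise ValueError("Local MCP needs an absolute file_path.")
    return {"file_path": location}
```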

MCP Tools

Full tool parameters and return shapes.

Sample Documents

Public documents agents can use for smoke tests.