This walkthrough builds a simple invoice extraction workflow in the Platform, then turns it into production-ready code.

Target Workflow

Steps

1. Open New Extraction

Go to the Platform and open New Extraction from the sidebar.

2. Upload a representative invoice

Use a real sample format if you can. Representative documents matter more than perfect toy examples.

3. Run Extract first

Keep the first run simple. Run Extract and inspect the Markdown and Tables tabs so you know what Pulse saw.

4. Add Schema

Add a Schema step. Start with the fields you need in production: invoice number, vendor, dates, total, and line items.

5. Review citations

Confirm the output values point to the right source locations. If a field is ambiguous, improve its description or add a schema prompt.

6. Save the schema preset

Save the schema once it works on more than one invoice. This gives you a schema_config_id for production code.

7. Show Code

Open Show Code and choose Python, TypeScript, or cURL. The generated code will match your tested configuration.
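
Step 6 suggests saving the schema only after it works on more than one invoice. One way to make that check concrete is to compare the values from several test runs against the fields you treat as required. The sketch below is a minimal, SDK-independent illustration: the `missing_required` helper and the sample dictionaries are hypothetical, and the field names simply mirror the starter schema in this walkthrough.

```python
from typing import Any, Sequence

# Field names mirror the "required" list of the starter schema in this walkthrough.
REQUIRED_FIELDS = ("invoice_number", "vendor_name", "total_amount")


def missing_required(
    values: dict[str, Any], required: Sequence[str] = REQUIRED_FIELDS
) -> list[str]:
    """Return the required fields that are absent, None, or blank strings."""
    missing = []
    for field in required:
        value = values.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Hypothetical values copied from two test runs in the Platform.
sample_runs = {
    "invoice_a.pdf": {"invoice_number": "INV-001", "vendor_name": "Acme", "total_amount": 120.5},
    "invoice_b.pdf": {"invoice_number": "", "vendor_name": "Globex", "total_amount": 89.0},
}

# Map each sample document to the list of fields that still need attention.
report = {name: missing_required(values) for name, values in sample_runs.items()}
for name, missing in report.items():
    print(name, "OK" if not missing else f"missing: {missing}")
```

A field that keeps showing up as missing across formats is usually a candidate for a better description or a schema prompt, as step 5 describes.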

Visual Walkthrough

Start with a single Extract step and keep the first run simple.
[Screenshot: Pulse Platform upload screen with Extract settings open]

Add the next step from the pipeline menu once the extraction output looks usable.
[Screenshot: Pipeline step menu with Split, Tables, and Schema options]

Configure each step in the right-side panel. For this tutorial, add Schema after Extract.
[Screenshot: Pulse pipeline configuration screen with Extract, Split, and Schema steps]

Review the source document and output side by side before you save presets or export code.
[Screenshot: Pulse Playground showing extracted markdown and table output beside a source PDF]

Use Show Code when the tested pipeline is ready to move into your app.
[Screenshot: Show Code modal with generated Python for a Pulse extraction pipeline]

Starter Schema

{
  "type": "object",
  "properties": {
    "invoice_number": {
      "type": "string",
      "description": "The invoice identifier shown by the vendor"
    },
    "vendor_name": {
      "type": "string",
      "description": "The vendor issuing the invoice"
    },
    "invoice_date": {
      "type": "string",
      "description": "The date the invoice was issued"
    },
    "due_date": {
      "type": "string",
      "description": "The payment due date"
    },
    "total_amount": {
      "type": "number",
      "description": "The final amount due"
    }
  },
  "required": ["invoice_number", "vendor_name", "total_amount"]
}
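
Step 4 lists line items among the fields to start with, but the starter schema above covers only header-level fields. The sketch below shows one possible way to add a `line_items` array using standard JSON Schema keywords, written as a Python dictionary so it can be merged and serialized with the standard library. The line-item field names (`description`, `quantity`, `unit_price`, `amount`) are assumptions for illustration, and whether the Schema step accepts nested arrays in exactly this shape should be confirmed in the Platform.

```python
import json

# Abbreviated copy of the starter schema (header-level fields only).
starter_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "The invoice identifier shown by the vendor"},
        "vendor_name": {"type": "string", "description": "The vendor issuing the invoice"},
        "total_amount": {"type": "number", "description": "The final amount due"},
    },
    "required": ["invoice_number", "vendor_name", "total_amount"],
}

# Assumed line-item shape; adjust the names to match your invoices.
line_items = {
    "type": "array",
    "description": "One entry per billed row in the invoice's line item table",
    "items": {
        "type": "object",
        "properties": {
            "description": {"type": "string", "description": "What was billed"},
            "quantity": {"type": "number", "description": "Number of units billed"},
            "unit_price": {"type": "number", "description": "Price per unit"},
            "amount": {"type": "number", "description": "Line total for this row"},
        },
    },
}

# Build a new schema instead of mutating the starter one.
extended_schema = {
    **starter_schema,
    "properties": {**starter_schema["properties"], "line_items": line_items},
}

print(json.dumps(extended_schema, indent=2))
```

Leaving `line_items` out of `required` keeps invoices without an itemized table from failing the schema; that is a design choice, not a Platform requirement.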

Improve The Pipeline

Once the basic flow works, make it more production-ready:
| Need | Add |
| --- | --- |
| Only certain pages matter | Page range on Extract |
| Line item tables need structure | Tables step |
| Many invoice formats | Better field descriptions and a schema prompt |
| Repeatable production config | Saved Extract and Schema presets |
| High volume | Batch or async processing |
| Long jobs | Webhooks or polling |
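
For the "Long jobs" row, polling usually means checking a job's status at growing intervals until it finishes or a deadline passes. The sketch below shows that pattern in plain Python. It is deliberately independent of the Pulse SDK: `get_status` is a hypothetical callable you would supply, and the status strings `"completed"` and `"failed"` are assumptions, so check the API reference for the real job states.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    timeout_s: float = 300.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() with exponential backoff until a terminal status.

    Returns the terminal status, or raises TimeoutError when the next wait
    would run past the deadline.
    """
    terminal = {"completed", "failed"}  # assumed status names
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # double the wait, up to a cap


# Usage with a fake status source standing in for a real API call.
# Passing waits.append as the sleep function records delays without waiting.
fake_statuses = iter(["pending", "processing", "completed"])
waits: list[float] = []
final = poll_until_done(lambda: next(fake_statuses), sleep=waits.append)
print(final, waits)
```

Webhooks avoid this loop entirely, so polling tends to suit scripts and environments that cannot expose a public callback URL.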

Move To Code

Generated code should become the starting point for your app integration. In production, replace sample paths and keys with your own inputs and secret management.
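import os

from pulse import Pulse

# Read the API key from the environment rather than hard-coding it.
client = Pulse(api_key=os.environ["PULSE_API_KEY"])

# Run Extract on the document; the context manager closes the file afterwards.
with open("invoice.pdf", "rb") as invoice_file:
    result = client.extract(file=invoice_file)

# Apply the schema preset saved in the Platform to the extraction result.
schema_result = client.schema(
    extraction_id=result.extraction_id,
    schema_config_id="YOUR_SAVED_SCHEMA_CONFIG_ID",
)

print(schema_result.schema_output["values"])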
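
One way to act on the advice about replacing sample paths and keys is to gather every deployment-specific value in one place and fail early when something is missing. The sketch below does this with the standard library only. `PULSE_API_KEY` comes from the example above; the `PULSE_SCHEMA_CONFIG_ID` variable name, the `PulseSettings` class, and the `load_settings` helper are assumptions made for illustration and are not part of the SDK.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class PulseSettings:
    """Deployment-specific values needed before calling the API."""

    api_key: str
    schema_config_id: str
    invoice_path: Path


def load_settings(invoice_path: str, env: Mapping[str, str] = os.environ) -> PulseSettings:
    """Read secrets from the environment and fail early if any are missing."""
    names = ("PULSE_API_KEY", "PULSE_SCHEMA_CONFIG_ID")  # second name is hypothetical
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return PulseSettings(
        api_key=env["PULSE_API_KEY"],
        schema_config_id=env["PULSE_SCHEMA_CONFIG_ID"],
        invoice_path=Path(invoice_path),
    )


# Usage with an explicit mapping so the example runs without real secrets.
demo_env = {"PULSE_API_KEY": "test-key", "PULSE_SCHEMA_CONFIG_ID": "cfg-123"}
settings = load_settings("invoices/2024-03.pdf", env=demo_env)
print(settings.schema_config_id, settings.invoice_path.name)
```

The resulting values can then be passed to the generated code in place of the literal `"invoice.pdf"` path and the `YOUR_SAVED_SCHEMA_CONFIG_ID` placeholder.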

Bank Statement To JSON

A complete Extract -> Schema API recipe with a public sample document.

Platform To Production

Use presets and Show Code effectively.