Getting Started

Learn how to quickly integrate with our API endpoints for document extraction and processing.

Prerequisites:

  • You’ll need an API key and an active subscription tier to get started. Contact our support team if you haven’t received these yet.
  • Check out our pricing page for subscription options.

Choose Your Integration Path

Quick Implementation

Using the Convert Endpoint

Upload File

Send your PDF directly to the endpoint:

curl --location 'https://api.runpulse.com/convert' \
  --header 'Content-Type: application/pdf' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data-binary '@/path/to/your/file.pdf'
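
If you prefer Python to curl, the same request could be assembled roughly as follows. This is a minimal sketch using only the standard library: the URL and both headers are taken from the curl command above, while the helper name build_convert_request is hypothetical and not part of any official SDK. The function only builds the request; it does not send it.

```python
import urllib.request

# Endpoint taken from the curl example above.
CONVERT_URL = "https://api.runpulse.com/convert"


def build_convert_request(pdf_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST that mirrors the curl command.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        CONVERT_URL,
        data=pdf_bytes,  # raw PDF body, the equivalent of curl's --data-binary
        headers={
            "Content-Type": "application/pdf",
            "x-api-key": api_key,
        },
        method="POST",
    )
```

To send it, read the file with open(path, "rb").read(), pass the bytes and your key to the helper, and hand the result to urllib.request.urlopen.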

Retrieve Results

You’ll receive a JSON response with a presigned URL and an S3 object URL:

{
  "presigned_url": "https://bucket-name.s3.amazonaws.com/path/to/file.pdf?[signed-parameters]",
  "s3_object_url": "https://bucket-name.s3.amazonaws.com/path/to/file.pdf",
  "statusCode": 200
}
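
A response like the one above could be unpacked as in the sketch below. The field names come from the sample response. Treating any statusCode other than 200 as a failure is an assumption, since error responses are not described here, and parse_convert_response is a hypothetical helper name.

```python
import json


def parse_convert_response(body: str) -> dict:
    """Return the two URLs from a /convert response body.

    Assumption: a statusCode other than 200 means the upload failed.
    """
    data = json.loads(body)
    status = data.get("statusCode")
    if status != 200:
        raise RuntimeError(f"convert failed with statusCode={status!r}")
    return {
        "presigned_url": data["presigned_url"],
        "s3_object_url": data["s3_object_url"],
    }
```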

Using the Extract Endpoint

Prepare Your Request

Send your API key in the x-api-key header and the document’s URL in the file-url field of the JSON body:

curl --location 'https://api.runpulse.com/extract' \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: YOUR_API_KEY' \
  --data-raw '{
    "file-url": "YOUR_FILE_URL",
    "chunking": "semantic,recursive",
    "return_table": true,
    "schema": {}
  }'

The chunking field takes a comma-separated list of methods. Example combinations:

  • chunking=semantic,recursive
  • chunking=page,header
  • chunking=semantic,recursive,page,header
  • chunking=semantic
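
The request body and the comma-separated chunking value could be built in Python along these lines. This is a sketch under stated assumptions: the field names and the four method names come from this page, but the helper names are hypothetical, and the client-side check for unknown methods is a convenience added here rather than documented API behavior.

```python
import json
import urllib.request

# Endpoint and method names taken from this page.
EXTRACT_URL = "https://api.runpulse.com/extract"
CHUNKING_METHODS = ("semantic", "recursive", "page", "header")


def build_extract_payload(file_url, chunking, return_table=True, schema=None):
    """Assemble the JSON body for /extract (hypothetical helper)."""
    unknown = [m for m in chunking if m not in CHUNKING_METHODS]
    if unknown:
        # Local sanity check only; the API's own validation is not documented here.
        raise ValueError(f"unknown chunking method(s): {', '.join(unknown)}")
    return {
        "file-url": file_url,
        "chunking": ",".join(chunking),  # e.g. "semantic,recursive"
        "return_table": return_table,
        "schema": schema if schema is not None else {},
    }


def build_extract_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that mirrors the curl command."""
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
```

For example, build_extract_payload("YOUR_FILE_URL", ["page", "header"]) produces a body whose chunking value is "page,header".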

Process Response

You’ll receive a JSON response containing:

  • Markdown content from the document
  • Chunked content based on your specified chunking methods
  • Schema-based extracted data (if schema was provided)
  • Tables extracted from the document (if return_table was set to true)
  • Plan information including usage metrics

Available Chunking Options

Semantic

Splits content based on meaning and context.

Recursive

Iteratively breaks down content into smaller segments.

Page

Divides content by individual pages.

Header

Splits content based on document headers.
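
To make the idea of header-based chunking concrete, here is a rough local illustration that starts a new chunk at every markdown heading. It is not the service's implementation, whose details are not described on this page; split_by_headers is a hypothetical function written only to show the concept.

```python
import re


def split_by_headers(markdown: str) -> list:
    """Start a new chunk at each markdown heading line (#, ##, ... ######).

    Illustration of the concept only, not the API's actual algorithm.
    """
    chunks, current = [], []
    for line in markdown.splitlines():
        # A heading closes the chunk collected so far, if there is one.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```

Text that appears before the first heading ends up in a chunk of its own.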

Response Format

{
  "markdown": "# Sample Purchase Order\n\n**Example Company Inc**  \n123 Main Street  \nUnit 2  \nBoston Massachusetts 02101  \nUSA  \n\n---\n\n**Purchase Order**  \n**# PO-12345**",
  "chunking": {
    "recursive": [
      {
        "chunk_number": 1,
        "content": "# Sample Purchase Order",
        "length": 29,
        "method": "recursive"
      },
      {
        "chunk_number": 2,
        "content": "**Example Company Inc**  \n123 Main Street  \nUnit 2  \nBoston Massachusetts 02101",
        "length": 95,
        "method": "recursive"
      }
    ],
    "semantic": [
      {
        "chunk_number": 1,
        "content": "Sample Purchase Order from Example Company Inc located at 123 Main Street, Unit 2, Boston Massachusetts 02101",
        "length": 112,
        "method": "semantic"
      }
    ]
  },
  "schema-json": {
    "company": "Example Company Inc",
    "purchase_order_number": "PO-12345",
    "address": {
      "street": "123 Main Street",
      "unit": "Unit 2",
      "city": "Boston",
      "state": "Massachusetts",
      "zip": "02101",
      "country": "USA"
    }
  },
  "tables": [
    {
      "table_id": 1,
      "content": "| #  | Item & Description | Qty  | Rate     | Amount   |\n|----|-------------------|------|----------|----------|\n| 1  | Setup Fee         | 1.00 | 1,000.00 | 1,000.00 |\n| 2  | Product A         | 50.00| 30.00    | 1,500.00 |"
    }
  ],
  "plan-info": {
    "pages_used": 19998,
    "tier": "foundation"
  }
}
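
In the sample above, each entry in tables carries its content as a pipe-delimited markdown table. Assuming that format holds for your documents, a small parser like the following sketch could turn it into one dictionary per row. parse_markdown_table is a hypothetical helper, and it does not handle cells that contain escaped pipe characters.

```python
def parse_markdown_table(content: str) -> list:
    """Convert a pipe-delimited markdown table into a list of row dicts.

    Assumes the layout shown in the sample: header row, separator row, data rows.
    """
    lines = [ln.strip() for ln in content.splitlines() if ln.strip()]
    if len(lines) < 2:
        return []
    # Drop the outer pipes, then split each line into trimmed cells.
    rows = [[cell.strip() for cell in ln.strip("|").split("|")] for ln in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the |----| separator line
    return [dict(zip(header, row)) for row in body]
```

All values stay as strings (for example "1,000.00"), so convert numbers yourself if you need arithmetic.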

Response Fields Explained

  • markdown: The full document text converted to markdown format
  • chunking: Contains different chunking methods and their results
  • schema-json: Data extracted according to your provided schema
  • tables: Tables extracted from the document (when return_table is true)
  • plan-info: Information about your subscription usage and limits
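
The fields listed above could be read from a response body as sketched below. The key names are taken from the sample response. The sketch assumes that optional sections (schema-json, tables) may be missing when they were not requested, and treats a missing section as empty; summarize_extract_response is a hypothetical helper name.

```python
import json


def summarize_extract_response(body: str) -> dict:
    """Collect a few headline numbers from an /extract response body.

    Assumption: sections that were not requested may be absent from the JSON.
    """
    data = json.loads(body)
    chunking = data.get("chunking", {})
    plan = data.get("plan-info", {})
    return {
        "markdown_chars": len(data.get("markdown", "")),
        # Number of chunks returned for each requested method.
        "chunks_per_method": {m: len(c) for m, c in chunking.items()},
        "schema_keys": sorted(data.get("schema-json", {})),
        "table_count": len(data.get("tables", [])),
        "pages_used": plan.get("pages_used"),
        "tier": plan.get("tier"),
    }
```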

Next Steps

Need help? Email us at founders@trypulse.ai.