Base URL
All API requests should be made to:

Authentication
All endpoints require authentication via an API key in the request header:
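As a minimal sketch of attaching the key (the `x-api-key` header name and the `DOCS_API_KEY` environment variable are assumptions for illustration — check your dashboard for the exact names):

```python
import os

# Hypothetical header name -- substitute the one shown in your dashboard.
API_KEY_HEADER = "x-api-key"

def auth_headers() -> dict:
    """Build request headers with the API key read from the environment."""
    api_key = os.environ["DOCS_API_KEY"]  # assumed variable name
    return {API_KEY_HEADER: api_key, "Content-Type": "application/json"}
```

Reading the key from the environment at call time keeps it out of client code, per the security notes below.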
Available Endpoints

Extract
POST /extract
Synchronously extract content from documents. Best for files under 50 pages.

Extract Async
POST /extract_async
Asynchronously process large documents. Returns a job ID for polling.

Convert
POST /convert
Upload files to S3 and get a URL for processing.

Get Job Status
GET /job/{job_id}
Check status and retrieve results of async jobs.

Cancel Job
POST /cancel/{job_id}
Cancel a pending or processing async job.

Configure Webhooks
POST /webhook
Get a portal link to configure webhook endpoints.
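The async flow (submit to /extract_async, then poll GET /job/{job_id}) can be sketched as below. The terminal status names are assumptions — only "pending" and "processing" are implied by this reference — and the HTTP call is injected so the loop stays library-agnostic:

```python
import time

def poll_job(job_id, fetch_status, interval=2.0, timeout=300.0):
    """Poll GET /job/{job_id} until the job leaves the pending/processing states.

    fetch_status(job_id) should perform the HTTP call and return the parsed
    JSON body; the exact status strings are assumptions about the job payload.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status(job_id)
        if job.get("status") not in ("pending", "processing"):
            return job  # e.g. completed or failed
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

For long-running jobs, configuring a webhook via POST /webhook avoids polling entirely.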
Request Format

Extract and Extract Async Endpoints

The /extract and /extract_async endpoints accept requests with the application/json content type, or file uploads via multipart/form-data:
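As a sketch of a JSON request body — the `document_url` field name is an assumption, since the exact body shape is not shown here, while `pages` and `return_html` are among the parameters documented under Common Parameters below:

```python
import json

# Hypothetical request body for POST /extract; "document_url" is an assumed
# field name. pages and return_html are documented Common Parameters.
payload = {
    "document_url": "https://example.com/report.pdf",
    "pages": "1-10",
    "return_html": False,
}
body = json.dumps(payload)
```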
Convert Endpoint
The /convert endpoint accepts files via multipart/form-data:
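A sketch of preparing such an upload, folding in the file-type validation recommended under Best Practices; the form field name "file" and the allowed extensions are assumptions, and with the `requests` library the resulting tuple would go into the `files=` argument:

```python
import mimetypes
import os

# Assumed allow-list; adjust to the formats your account supports.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".png", ".jpg"}

def prepare_upload(path):
    """Validate the file type and describe the multipart "file" field.

    Returns (filename, content_type). With requests, the upload would be:
    files={"file": (filename, open(path, "rb"), content_type)}
    """
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext}")
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    return os.path.basename(path), content_type
```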
Response Format
Successful Response
Large Document Response
For documents over 70 pages, content is delivered via an S3 URL:

Error Response
Common Parameters
Schema Parameter
Define structured data extraction. Supported field types:

- string - Text values
- integer - Whole numbers
- float - Decimal numbers
- date - Date values
- boolean - True/false
- array - Lists
- object - Nested structures
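As a sketch of a schema value built from these types — the exact schema syntax is an assumption (a flat mapping of field names to type strings, with a nested description for an array of objects):

```python
import json

# Hypothetical invoice schema: field names mapped to the types listed above.
invoice_schema = {
    "invoice_number": "string",
    "issue_date": "date",
    "total": "float",
    "paid": "boolean",
    "line_items": {
        "type": "array",
        "items": {"description": "string", "quantity": "integer"},
    },
}
schema_json = json.dumps(invoice_schema)
```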
Pages Parameter
Specify page ranges to process:

- Single page: "5"
- Range: "1-10"
- Multiple: "1,3,5-7,10"
- All pages: omit the parameter
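Client-side, a pages string in this format can be expanded for local bookkeeping; this helper is purely illustrative and not part of the API:

```python
def expand_pages(spec):
    """Expand a pages string like "1,3,5-7,10" into a sorted list of ints."""
    pages = set()
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-")
            pages.update(range(int(start), int(end) + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```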
Output Options
Control extraction output:

- return_html: Return HTML instead of markdown (default: false)
- extract_figure: Extract images and figures (default: false)
- figure_description: Generate AI descriptions for figures
- chunk_size: Custom chunk size in characters
Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid API key |
| 403 | Forbidden - Access denied |
| 404 | Not Found - Resource doesn't exist |
| 413 | Payload Too Large - File exceeds limit |
| 429 | Too Many Requests - Rate limited |
| 500 | Internal Server Error |
| 503 | Service Unavailable |
Best Practices
Use Appropriate Endpoints
- Use /extract for documents under 50 pages
- Use /extract_async for large documents
- Use /convert when you need to reuse file URLs
Handle Errors Gracefully
- Implement retry logic with exponential backoff
- Check error codes and handle specifically
- Log errors for debugging
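The retry advice above can be sketched as exponential backoff with jitter. The retryable codes (429, 500, 503) come from the status table; the operation is injected so the policy stays transport-agnostic:

```python
import random
import time

# Retryable codes per the status table: rate limiting and server errors.
RETRYABLE = {429, 500, 503}

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Run call() with exponential backoff on retryable status codes.

    call() should return (status_code, body). Non-retryable codes and the
    final attempt's result are returned immediately.
    """
    for attempt in range(max_attempts):
        status, body = call()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Double the delay each attempt, plus jitter to avoid thundering herds.
        delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
        time.sleep(delay)
```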
Optimize Performance
- Process only necessary pages
- Use schemas for structured extraction
- Cache results when possible
Security
- Never expose API keys in client code
- Use environment variables
- Rotate keys regularly
- Validate file types before upload