POST /extract

Python SDK
from pulse import Pulse
from pulse.types import (
    ExtractRequestFigureProcessing,
    ExtractRequestExtensions,
    ExtractRequestExtensionsChunking,
    ExtractRequestExtensionsAltOutputs,
)

client = Pulse(api_key="YOUR_API_KEY")

# Basic extraction from URL
response = client.extract(
    file_url="https://example.com/document.pdf"
)
print(response.markdown)
print(response.extraction_id)

# With figure processing and extensions
response = client.extract(
    file_url="https://example.com/document.pdf",
    figure_processing=ExtractRequestFigureProcessing(
        description=True,
    ),
    extensions=ExtractRequestExtensions(
        chunking=ExtractRequestExtensionsChunking(
            chunk_types=["semantic"],
            chunk_size=1000,
        ),
        alt_outputs=ExtractRequestExtensionsAltOutputs(
            return_html=True,
        ),
    ),
)
print(response.extensions.chunking)
print(response.extensions.alt_outputs.html)

# Async extraction
response = client.extract(
    file_url="https://example.com/document.pdf",
    async_=True
)
print(response.job_id)  # poll via client.jobs.get_job(job_id=...)

Example response (200):
{
  "markdown": "<string>",
  "extensions": {
    "chunking": {
      "semantic": [
        "<string>"
      ],
      "header": [
        "<string>"
      ],
      "page": [
        "<string>"
      ],
      "recursive": [
        "<string>"
      ]
    },
    "merge_tables": {},
    "footnote_references": [
      {
        "symbol": "<string>",
        "footnoteTextId": "<string>",
        "referenceTextIds": [
          "<string>"
        ]
      }
    ],
    "alt_outputs": {
      "wlbb": {
        "words": [
          {
            "id": "<string>",
            "text": "<string>",
            "page_number": 2,
            "bounding_box": [
              123
            ],
            "average_word_confidence": 123
          }
        ],
        "error": "<string>"
      },
      "html": "<string>",
      "xml": "<string>"
    }
  },
  "bounding_boxes": {},
  "extraction_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "extraction_url": "<string>",
  "page_count": 2,
  "plan_info": {
    "tier": "<string>",
    "pages_used": 123,
    "note": "<string>"
  },
  "warnings": [
    "<string>"
  ],
  "html": "<string>",
  "chunks": {
    "semantic": [
      "<string>"
    ],
    "header": [
      "<string>"
    ],
    "page": [
      "<string>"
    ],
    "recursive": [
      "<string>"
    ]
  },
  "plan-info": {
    "tier": "<string>",
    "pages_used": 123,
    "note": "<string>"
  },
  "structured_output": {
    "values": {},
    "citations": {}
  },
  "input_schema": {},
  "schema_error": "<string>"
}

Overview

Pipeline Step 1 — Extract is always the first step. After extraction, you can optionally split the document into topics, apply schema extraction to get structured data, or use tables for span-aware table extraction.
Extract content from documents. Returns markdown or HTML formatted content with optional structured data extraction. For documents over 70 pages, results are returned via S3 URL.
For large documents or batch processing workflows, set async: true to process asynchronously and poll for results via GET /job/{job_id}.
To process many files at once, use Batch Extract. It accepts an S3 prefix, local directory, or list of URLs and runs /extract on each file in parallel.

Async Mode

Set async: true to return immediately with a job ID for polling:
{
  "file_url": "https://example.com/document.pdf",
  "async": true
}
Async Response (200):
{
  "job_id": "abc123-def456",
  "status": "pending",
  "message": "Document processing started"
}
Use GET /job/{job_id} to poll for completion.
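A typical polling loop can be sketched as below. The helper takes any status callable (e.g. lambda: client.jobs.get_job(job_id=job_id)); the terminal status names other than "pending" are assumptions here, so check the jobs reference for the exact values:

```python
import time

def poll_until_done(get_status, interval=2.0, timeout=300.0):
    """Call get_status() until the job leaves the 'pending' state or
    the timeout elapses. get_status returns a dict with a 'status' key."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status()
        if job.get("status") != "pending":
            return job
        time.sleep(interval)
    raise TimeoutError("job did not finish within the timeout")
```

A fixed interval keeps the sketch simple; exponential backoff is a common refinement for long-running documents.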

Request

Document Source

Provide the document using one of these methods:
| Field | Type | Description |
| --- | --- | --- |
| file | binary | Document file to upload directly (multipart/form-data). |
| file_url | string | Public or pre-signed URL that Pulse will download and extract. |

Extraction Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| pages | string | - | Page range filter (1-indexed). Supports segments like 1-2 or mixed ranges like 1-2,5. Page 1 is the first page. |
| figure_processing | object | - | Settings that control how figures in the document are processed. These affect the markdown output directly and do not produce additional output fields. See Figure Processing. |
| extensions | object | - | Settings that enable additional processing or alternate output formats. Each enabled extension produces a corresponding result under response.extensions.*. See Extensions. |
| storage | object | - | Options for persisting extraction artifacts. See Storage Options. |
| async | boolean | false | If true, returns immediately with a job_id for polling via GET /job/{job_id}. |
| structured_output | object | - | ⚠️ Deprecated. Use the /schema endpoint after extraction instead. Still works for backward compatibility. |
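Since an invalid pages filter fails only at request time, it can be convenient to validate the value locally first. A minimal sketch, using the regex pattern published for the pages field in the request schema:

```python
import re

# Pattern from the request schema for the `pages` field
# (1-indexed segments like "3" or "1-2", comma-separated).
PAGES_PATTERN = re.compile(r"^[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*$")

def validate_pages(pages: str) -> str:
    """Raise ValueError if `pages` is not a valid page range filter."""
    if not PAGES_PATTERN.fullmatch(pages):
        raise ValueError(f"invalid pages filter: {pages!r}")
    return pages
```

For example, validate_pages("1-2,5") passes, while "pages 1-2" or an empty string raises before any API call is made.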

Figure Processing

Settings under figure_processing control how figures (images, charts, diagrams) in the document are processed. These settings affect the markdown output directly — for example, adding descriptive captions to figures or converting charts into markdown tables. They do not create additional output fields in the response.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| figure_processing.description | boolean | false | Generate descriptive captions for extracted figures. |
| figure_processing.show_images | boolean | false | Embed base64-encoded images inline in figure tags. Increases response size. |

Extensions

Settings under extensions enable additional processing passes or alternate output formats. Each enabled extension produces a corresponding output field under response.extensions.*. For example, enabling extensions.chunking produces response.extensions.chunking, and enabling extensions.alt_outputs.return_html produces response.extensions.alt_outputs.html.
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| extensions.merge_tables | boolean | false | Merge tables that span multiple pages into a single table. |
| extensions.footnote_references | boolean | false | Link footnote markers to their corresponding footnote text. |
| extensions.chunking | object | - | Chunking configuration. See below. |
| extensions.chunking.chunk_types | string[] | - | List of chunking strategies: semantic, header, page, recursive. |
| extensions.chunking.chunk_size | integer | - | Maximum characters per chunk. |
| extensions.alt_outputs | object | - | Alternate output formats. See below. |
| extensions.alt_outputs.wlbb | boolean | false | Enable word-level bounding boxes (PDF only). Results in response.extensions.alt_outputs.wlbb. |
| extensions.alt_outputs.return_html | boolean | false | Include HTML representation. response.markdown is still present; HTML is at response.extensions.alt_outputs.html. |
| extensions.alt_outputs.return_xml | boolean | false | Include XML representation (work in progress). |

Storage Options

Control whether extractions are saved to your extraction library:
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| storage.enabled | boolean | true | Whether to persist extraction artifacts. Set to false for temporary extractions. |
| storage.folder_name | string | - | Target folder name to save the extraction to. Creates the folder if it doesn't exist. |
| storage.folder_id | string (uuid) | - | Target folder ID to save the extraction to. Takes precedence over folder_name. |

Deprecated Fields

The following input fields are deprecated and will be removed in a future version. They are still accepted for backward compatibility.
| Field | Replacement |
| --- | --- |
| extract_figure | No replacement |
| figure_description | Use figure_processing.description |
| show_images | Use figure_processing.show_images |
| chunking | Use extensions.chunking.chunk_types (array instead of comma-separated string) |
| chunk_size | Use extensions.chunking.chunk_size |
| return_html | Use extensions.alt_outputs.return_html |
| structured_output | Use /schema endpoint after extraction. Pass extraction_id + schema_config. Accepts schema, schema_prompt, and effort. |
| schema | Use /schema endpoint after extraction |
| schema_prompt | Use /schema endpoint with schema_config.schema_prompt |
| custom_prompt | No replacement |
| thinking | No replacement |
When legacy input fields are used, the API returns a deprecation warning in the warnings array directing you to the updated field names. See the latest documentation for details.
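One way to surface those deprecation notices in logs or CI is to filter the warnings array after each call. The exact wording of the notices is not specified here, so this sketch matches on the substring "deprecat":

```python
def deprecation_warnings(warnings):
    """Filter an extract response's `warnings` list down to entries that
    look like deprecation notices (substring match, case-insensitive)."""
    return [w for w in warnings if "deprecat" in w.lower()]

# Usage (assuming `response` is an extract result):
#   for w in deprecation_warnings(response.warnings):
#       print("migrate:", w)
```

Failing a CI job when this list is non-empty is a simple way to catch legacy parameter usage before the fields are removed.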

Response

The response structure varies based on document size to optimize for different use cases.

Standard Response (Under 70 Pages)

For documents under 70 pages, results are returned directly in the response body:
{
  "markdown": "# Document Title\n\nExtracted content...",
  "page_count": 15,
  "extraction_id": "abc123-def456-ghi789",
  "extraction_url": "https://platform.runpulse.com/dashboard/extractions/abc123",
  "plan_info": {
    "pages_used": 15,
    "tier": "standard",
    "note": "Pulse Ultra"
  },
  "bounding_boxes": {
    "Title": [],
    "Text": [],
    "Tables": [],
    "Images": [],
    "markdown_with_ids": "<p data-bb-text-id=\"txt-1\">..."
  },
  "extensions": {
    "chunking": {
      "semantic": ["chunk 1...", "chunk 2..."],
      "header": ["section 1...", "section 2..."]
    },
    "alt_outputs": {
      "html": "<html>...</html>"
    }
  },
  "warnings": []
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| markdown | string | Clean markdown content extracted from the document. Always present. |
| page_count | integer | Total number of pages processed. |
| extraction_id | string (uuid) | Persisted extraction ID. Present when storage is enabled (default). Use with /split and /schema. |
| extraction_url | string | URL to view the extraction in the Pulse Platform. Present when storage is enabled. |
| plan_info | object | Billing information including pages used and plan tier. |
| bounding_boxes | object | Detailed bounding box data for document elements. See Bounding Boxes for details. |
| extensions | object | Output from enabled extensions. Only keys for enabled extensions are present. See below. |
| extensions.chunking | object | Chunk results by strategy (when extensions.chunking is enabled). |
| extensions.merge_tables | object | Merge tables result (when extensions.merge_tables is enabled). |
| extensions.footnote_references | array | List of detected footnotes with their in-text references (when extensions.footnote_references is enabled). See Footnote References below. |
| extensions.alt_outputs.wlbb | object | Word-level bounding boxes (when extensions.alt_outputs.wlbb is enabled). |
| extensions.alt_outputs.html | string | HTML representation (when extensions.alt_outputs.return_html is enabled). |
| extensions.alt_outputs.xml | string | XML representation (when extensions.alt_outputs.return_xml is enabled, WIP). |
| warnings | array | Non-fatal warnings generated during extraction, including deprecation notices for legacy input usage. |

Deprecated Response Fields

| Field | Replacement | Description |
| --- | --- | --- |
| html | extensions.alt_outputs.html | Present when legacy return_html input is used. |
| chunks | extensions.chunking | Present when legacy chunking input is used. |
| plan-info | plan_info | Present when only legacy inputs are used. |
| structured_output | Use /schema | Present when deprecated structured_output input was used. |
| input_schema | Use /schema | Echo of the applied schema (deprecated path only). |
| schema_error | Use /schema | Error message if schema processing failed (deprecated path only). |
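Code migrating from the legacy parameters may receive either response shape during the transition. A hedged sketch of a compatibility accessor for the HTML output, working on the raw response dict (the new location first, the deprecated top-level html as a fallback):

```python
def get_html(result: dict):
    """Return the HTML output from an extract response dict, preferring
    the new extensions.alt_outputs.html location and falling back to
    the deprecated top-level `html` field for legacy responses."""
    new = result.get("extensions", {}).get("alt_outputs", {}).get("html")
    return new if new is not None else result.get("html")
```

The same pattern applies to chunks vs extensions.chunking and plan-info vs plan_info.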

Large Document Response (70+ Pages)

For documents with 70 or more pages, the API returns a URL to fetch the complete results. This prevents timeout issues and reduces response payload size.
{
  "is_url": true,
  "url": "https://pulse-studio-api.s3.amazonaws.com/results/abc123.json",
  "plan_info": {
    "pages_used": 150,
    "tier": "standard"
  }
}

Large Document Response Fields

| Field | Type | Description |
| --- | --- | --- |
| is_url | boolean | Always true for large document responses. Use this to detect URL-based responses. |
| url | string | Pre-signed S3 URL containing the complete extraction results. Expires after 24 hours. |
| plan_info | object | Billing information including pages used and plan tier. |

Handling Large Document Responses

from pulse import Pulse

client = Pulse(api_key="YOUR_API_KEY")

# The SDK handles large document responses automatically
response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf"
)

# If the response contains is_url, fetch from S3
if hasattr(response, 'is_url') and response.is_url:
    import requests
    full_result = requests.get(response.url).json()
    print(full_result["markdown"])
else:
    print(response.markdown)
The S3 URL expires after 24 hours. If you need to access results after this period, ensure storage.enabled is true and retrieve results from your extraction library.
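When working with the raw response dicts rather than the SDK objects, the branch above can be factored into a small resolver. This is a sketch; the fetch callable is an assumption (e.g. lambda u: requests.get(u).json()) so the helper itself stays free of network dependencies:

```python
def resolve_extraction(result: dict, fetch=None):
    """Return the full extraction dict, following the pre-signed S3 URL
    for 70+ page documents. `fetch` is any callable url -> dict."""
    if result.get("is_url"):
        if fetch is None:
            raise ValueError("large-document response requires a fetch callable")
        return fetch(result["url"])
    return result
```

Small documents pass through unchanged; large-document responses are replaced by the downloaded JSON body.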

Example Usage

Basic Extraction

from pulse import Pulse
from pulse.types import (
    ExtractRequestFigureProcessing,
    ExtractRequestExtensions,
    ExtractRequestExtensionsAltOutputs,
)

client = Pulse(api_key="YOUR_API_KEY")

# Extract from URL with figure processing and HTML output
response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf",
    figure_processing=ExtractRequestFigureProcessing(
        description=True,
    ),
    extensions=ExtractRequestExtensions(
        alt_outputs=ExtractRequestExtensionsAltOutputs(
            return_html=True,
        ),
    ),
)

print(f"Markdown: {response.markdown}")
print(f"HTML: {response.extensions.alt_outputs.html}")
print(f"Extraction ID: {response.extraction_id}")

File Upload

from pulse.types import ExtractRequestFigureProcessing

# Upload and extract a local file
with open("document.pdf", "rb") as f:
    response = client.extract(
        file=f,
        figure_processing=ExtractRequestFigureProcessing(
            description=True,
        ),
    )

Structured Data (Extract → Schema)

The structured_output parameter on /extract is deprecated. Use the /schema endpoint after extraction instead. This gives you better control, re-runnability, and support for split-mode schemas.
Recommended two-step approach:
# Step 1: Extract the document
response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf"
)

extraction_id = response.extraction_id

# Step 2: Apply schema separately
schema_result = client.schema(
    extraction_id=extraction_id,
    schema_config={
        "input_schema": {
            "type": "object",
            "properties": {
                "total": {"type": "number"},
                "vendor": {"type": "string"}
            }
        },
        "schema_prompt": "Extract invoice total and vendor"
    }
)

print(schema_result.schema_output)

Page Range and Chunking

from pulse.types import (
    ExtractRequestExtensions,
    ExtractRequestExtensionsChunking,
)

response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf",
    pages="1-5,10",  # 1-indexed
    extensions=ExtractRequestExtensions(
        chunking=ExtractRequestExtensionsChunking(
            chunk_types=["semantic", "page"],
            chunk_size=1000,
        ),
    ),
)

# Chunk data is in extensions.chunking
print(response.extensions.chunking.semantic)
print(response.extensions.chunking.page)

Footnote References

Enable extensions.footnote_references to detect footnote markers (e.g. *, †, 1) in body text and link them to the footnote explanation paragraphs at the bottom of the page. Each result item includes the marker symbol, the bounding-box text ID of the footnote, and the bounding-box text IDs of all body-text paragraphs that reference it.
from pulse.types import ExtractRequestExtensions

response = client.extract(
    file_url="https://example.com/research-paper.pdf",
    extensions=ExtractRequestExtensions(
        footnote_references=True,
    ),
)

# Footnote links are in extensions.footnote_references
for ref in response.extensions.footnote_references:
    print(f"Marker: {ref.symbol}")
    print(f"  Footnote: {ref.footnote_text_id}")
    print(f"  Referenced by: {ref.reference_text_ids}")

Example Response

{
  "markdown": "...",
  "bounding_boxes": { ... },
  "extensions": {
    "footnote_references": [
      {
        "symbol": "*",
        "footnoteTextId": "txt-11",
        "referenceTextIds": ["txt-4", "txt-5", "txt-6", "txt-7", "txt-8"]
      },
      {
        "symbol": "†",
        "footnoteTextId": "txt-12",
        "referenceTextIds": ["txt-8"]
      },
      {
        "symbol": "4",
        "footnoteTextId": "txt-48",
        "referenceTextIds": ["txt-45"]
      }
    ]
  }
}

Footnote Reference Fields

| Field | Type | Description |
| --- | --- | --- |
| symbol | string | The footnote marker symbol as detected in the document (e.g. *, †, ‡, 1, #). |
| footnoteTextId | string | The bounding-box text ID (e.g. txt-11) of the footnote explanation paragraph. Cross-reference with bounding_boxes.Footer to get the footnote's content and position. |
| referenceTextIds | string[] | Bounding-box text IDs of body-text paragraphs that contain a reference to this footnote. Cross-reference with bounding_boxes.Text to get each paragraph's content and position. |
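Cross-referencing those text IDs against bounding_boxes means scanning each category list for a matching element. A minimal sketch of a lookup builder; the assumption that each bounding-box element carries an "id" key (like "txt-11") is ours, so check it against your actual bounding_boxes payload:

```python
def index_by_text_id(bounding_boxes: dict) -> dict:
    """Build a text-id -> element lookup across all bounding-box
    categories (Text, Footer, ...). Skips non-list values such as
    the markdown_with_ids string."""
    lookup = {}
    for elements in bounding_boxes.values():
        if not isinstance(elements, list):
            continue
        for el in elements:
            if isinstance(el, dict) and "id" in el:
                lookup[el["id"]] = el
    return lookup

# Usage: lookup.get(ref["footnoteTextId"]) returns the footnote's
# element (content and position), or None if the ID is unknown.
```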
Footnote reference detection uses Azure Document Intelligence for paragraph classification, supplemented by PyMuPDF native text extraction for accurate symbol identification. This handles common OCR confusion between visually similar symbols like †/+ and ‡/#. Supported marker types include numbered (1, 2, 3), symbolic (*, †, ‡, §, #), and lettered (a, b, c) footnotes.

Disable Storage

response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf",
    storage={"enabled": False}
)

Authorizations

x-api-key
string
header
required

API key for authentication

Body

Input schema for multipart/form-data requests (file upload or file_url).

file
file
required

Document to upload directly. Required unless file_url is specified.

file_url
string<uri>

Public or pre-signed URL that Pulse will download and extract.

pages
string

Page range filter (1-indexed, where page 1 is the first page). Supports segments such as 1-2 or mixed ranges like 1-2,5.

Pattern: ^[0-9]+(-[0-9]+)?(,[0-9]+(-[0-9]+)?)*$
figure_processing
object

Settings that control how figures in the document are processed. These affect the markdown output directly (e.g. figure descriptions, chart-to-table conversion, image embedding) and do not produce additional output fields in the response.

extensions
object

Settings that enable additional processing passes or alternate output formats. Each enabled extension produces a corresponding output field under response.extensions.*.

storage
object

Options for persisting extraction artifacts. When enabled (default), artifacts are saved to storage and a database record is created.

async
boolean
default:false

If true, returns immediately with a job_id for polling via GET /job/{jobId}. Otherwise processes synchronously.

chunking
string
deprecated

Deprecated -- Use extensions.chunking.chunk_types instead. Comma-separated list of chunking strategies.

chunk_size
integer
deprecated

Deprecated -- Use extensions.chunking.chunk_size instead.

Required range: x >= 1
extract_figure
boolean
default:false
deprecated

Deprecated -- No replacement.

figure_description
boolean
default:false
deprecated

Deprecated -- Use figure_processing.description instead.

show_images
boolean
default:false
deprecated

Deprecated -- Use figure_processing.show_images instead.

return_html
boolean
default:false
deprecated

Deprecated -- Use extensions.alt_outputs.return_html instead.

Response

When async=false (default): full extraction result with markdown, bounding boxes, chunks, etc. When async=true: job submission acknowledgement with job_id.

Full extraction result returned by the synchronous /extract endpoint. Contains the extracted markdown, optional extensions output, bounding boxes, and storage metadata.

markdown
string

Primary markdown content extracted from the document. Always present in the new format.

extensions
object

Output from enabled extensions. Each key corresponds to an extension that was enabled in the request under extensions.*. Only keys for enabled extensions are present.

bounding_boxes
object

Positional bounding-box data for text, titles, headers, footers, images, and tables.

extraction_id
string<uuid>

Persisted extraction ID. Present when storage is enabled (default). Use with /split and /schema endpoints.

extraction_url
string

URL to view the extraction on the Pulse platform. Present when storage is enabled.

page_count
integer

Number of pages processed.

Required range: x >= 1
plan_info
object

Billing tier and usage information.

warnings
string[]

Non-fatal warnings generated during extraction. Includes deprecation notices when legacy input parameters are used, as well as processing warnings (e.g. word-level bounding box limitations).

html
string
deprecated

Deprecated -- Use extensions.alt_outputs.html instead. Present when the legacy return_html input was used.

chunks
object
deprecated

Deprecated -- Use extensions.chunking instead. Present when the legacy chunking input was used.

plan-info
object
deprecated

Deprecated -- Use plan_info (underscore) instead. Present when only legacy input parameters are used.

structured_output
object
deprecated

Deprecated -- Only present when the deprecated structured_output input parameter was used. Use the /schema endpoint instead.

input_schema
object
deprecated

Deprecated -- Echo of the schema that was applied.

schema_error
string
deprecated

Deprecated -- Error message if schema processing failed.