POST /form/detect
Detect Form Fields
curl --request POST \
  --url https://api.runpulse.com/form/detect \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "page_range": "<string>",
  "async": false,
  "form_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "file_url": "<string>"
}
'
{
  "form_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "page_count": 2,
  "form_fields": [
    {
      "page_number": 2,
      "bounding_box": [
        0.5
      ],
      "text": "<string>",
      "type": "text",
      "row": 1,
      "col": 1,
      "table_idx": 1,
      "checkbox_details": [
        {
          "center_coord": [
            0.5
          ],
          "selected": true,
          "text": "<string>"
        }
      ]
    }
  ],
  "pdf_url": "<string>",
  "fields_filled": 1,
  "fields_cleared": 1,
  "credits_used": 123,
  "plan_info": {
    "tier": "<string>",
    "total_credits_used": 123,
    "pages_used": 1
  }
}

Overview

Detects form fields on a PDF and returns them as structured cells along with a reusable form_id. By default the call returns a FormResult synchronously; set async: true to run the detection in the background and poll GET /job/{jobId} for the result.
/form/detect is the entry point for the form-filler workflow when you want to inspect the fields Pulse identified on a PDF before filling or clearing them. Use it to preview detected fields, fix a misclassified cell, see which checkboxes are currently selected, or cache the detection result for repeated chained calls. The returned form_id references the uploaded PDF and its detected layout, and can be passed back to any of /form/detect, /form/fill, or /form/clear as the single input source. Pulse will reuse the cached layout instead of re-detecting it.
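As an illustration of the "see which checkboxes are currently selected" use case, the sketch below walks a FormResult that has already been decoded from JSON into plain dicts. The helper name selected_checkbox_labels is hypothetical, and the sample data is a trimmed copy of the response example on this page with one box flipped to selected for demonstration.

```python
from typing import Any


def selected_checkbox_labels(form_result: dict[str, Any]) -> list[tuple[int, str]]:
    """Return (page_number, label) for every checkbox option marked selected."""
    selected = []
    for cell in form_result.get("form_fields") or []:
        if cell.get("type") != "checkbox":
            continue  # only checkbox cells carry checkbox_details
        for box in cell.get("checkbox_details") or []:
            if box.get("selected"):
                selected.append((cell["page_number"], box.get("text", "")))
    return selected


# Illustrative data only: one option is set to selected for the demo.
sample = {
    "form_fields": [
        {"page_number": 1, "type": "text", "text": "Name"},
        {
            "page_number": 1,
            "type": "checkbox",
            "checkbox_details": [
                {"center_coord": [0.125, 0.232], "selected": False, "text": "Individual/sole proprietor"},
                {"center_coord": [0.300, 0.232], "selected": True, "text": "C corporation"},
            ],
        },
    ]
}

print(selected_checkbox_labels(sample))
```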

Providing the PDF

Provide the PDF in exactly one of the following ways:
  • form_id: re-detect on a previously stored PDF (returned by an earlier /form/detect, /form/fill, or /form/clear call). Useful when chaining detect calls or refreshing layout after edits.
  • file_url: public or presigned URL to a PDF.
  • file: direct PDF upload (multipart only).
Sending more than one (or none) returns 400.
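The exactly-one rule above can be checked client-side before a request is sent, which avoids a round trip that would end in a 400. This is a minimal sketch; pick_pdf_source is a hypothetical helper and not part of the Pulse SDK.

```python
def pick_pdf_source(form_id=None, file_url=None, file=None):
    """Return the single (name, value) PDF source, or raise if zero or several are given."""
    candidates = (("form_id", form_id), ("file_url", file_url), ("file", file))
    provided = [(name, value) for name, value in candidates if value is not None]
    if len(provided) != 1:
        names = ", ".join(name for name, _ in provided) or "none"
        raise ValueError(
            f"provide exactly one of form_id, file_url, or file (got: {names})"
        )
    return provided[0]


print(pick_pdf_source(file_url="https://example.com/contract.pdf"))
```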

Pricing

Billed at 1 credit per page of the PDF being processed. Every response also returns a top-level credits_used for this request and a cumulative plan_info.total_credits_used snapshot for your organization.

Request

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| form_id | string (uuid) | One of these | Re-detect on a previously stored PDF. |
| file_url | string (uri) | One of these | Public or presigned URL of a PDF. |
| file | binary | One of these | Direct PDF upload (multipart only). |
| page_range | string | No | 1-based page filter, for example "1,3-5". The alias pages is also accepted. |
| async | boolean | No | When true, returns { job_id, status: "pending" } immediately (HTTP 202) and processes the job in the background. Default false. |

Response

Sync (200): FormResult

When async is false (default), the call returns a FormResult body directly. Since /form/detect does not modify the PDF, neither fields_filled nor fields_cleared is present.
| Field | Type | Description |
| --- | --- | --- |
| form_id | string (uuid) | ID of the form record produced by this run. Pass to a subsequent /form/detect, /form/fill, or /form/clear call. |
| page_count | integer | Number of pages in the PDF. |
| pdf_url | string (uri) | URL to download the (unmodified) PDF binary. Always points at GET /results/{jobId}/pdf. Requires the same auth as the rest of the API. |
| form_fields | array of FormCell | Detected cells. Each carries a normalized bounding_box, a type (text / checkbox / signature), the current text content, and for checkbox cells a checkbox_details[] array with per-box center coordinates, selection state, and labels. |
| credits_used | number | Credits consumed by this request (1 × page_count). |
| plan_info | object | { tier, total_credits_used, pages_used } cumulative billing snapshot for your organization (post-request). |
{
  "form_id": "30fe08e1-922e-4012-9dfa-6aed0df430dc",
  "page_count": 6,
  "pdf_url": "https://api.runpulse.com/results/80690a27-ce39-4ad6-a1c7-70c7745238c3/pdf",
  "form_fields": [
    {
      "page_number": 1,
      "type": "text",
      "bounding_box": [0.044, 0.038, 0.222, 0.052],
      "text": "Name (as shown on your income tax return)"
    },
    {
      "page_number": 1,
      "type": "checkbox",
      "bounding_box": [0.118, 0.226, 0.634, 0.241],
      "text": "Individual/sole proprietor C corporation S corporation Partnership",
      "checkbox_details": [
        { "center_coord": [0.125, 0.232], "selected": false, "text": "Individual/sole proprietor" },
        { "center_coord": [0.300, 0.232], "selected": false, "text": "C corporation" },
        { "center_coord": [0.418, 0.232], "selected": false, "text": "S corporation" },
        { "center_coord": [0.535, 0.232], "selected": false, "text": "Partnership" }
      ]
    }
  ],
  "credits_used": 6.0,
  "plan_info": {
    "tier": "pulse_ultra_2",
    "total_credits_used": 1278.0,
    "pages_used": 426
  }
}
All cell coordinates (bounding_box, checkbox_details[].center_coord) are normalized to [0, 1] with a top-left origin. Multiply by your render width / height to convert to pixel coordinates.
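The conversion to pixels can be sketched as follows. The page states only that values are normalized with a top-left origin; the [x_min, y_min, x_max, y_max] ordering of bounding_box is an assumption inferred from the sample response, and both helper names are hypothetical.

```python
def bbox_to_pixels(bounding_box, width, height):
    """Scale a normalized box to pixels.

    Assumes the order [x_min, y_min, x_max, y_max]; this ordering is inferred
    from the sample response, not stated by the API reference.
    """
    x_min, y_min, x_max, y_max = bounding_box
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


def point_to_pixels(center_coord, width, height):
    """Scale a normalized [x, y] checkbox center to pixels."""
    x, y = center_coord
    return (round(x * width), round(y * height))


# A page rendered at 1000 x 2000 pixels.
print(bbox_to_pixels([0.044, 0.038, 0.222, 0.052], 1000, 2000))  # (44, 76, 222, 104)
print(point_to_pixels([0.125, 0.232], 1000, 2000))               # (125, 464)
```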

Async (202): FormJobAccepted

When async is true:
{
  "job_id": "abc123-def456-ghi789",
  "status": "pending"
}
Poll GET /job/{jobId}. The job’s result carries the same FormResult shape that the sync flow would have returned inline.
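A polling loop for the async flow might look like the sketch below. It is transport-agnostic: fetch_job is a caller-supplied function that performs GET /job/{jobId} and returns the decoded JSON. Only the "pending" status is documented on this page, so the loop treats any other status as terminal; the "completed" value in the demo is a hypothetical placeholder, as is the helper name wait_for_job.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch_job: Callable[[], dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
) -> dict[str, Any]:
    """Call fetch_job until the status is no longer "pending" or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job.get("status") != "pending":
            return job  # any non-pending status is treated as terminal
        if time.monotonic() >= deadline:
            raise TimeoutError("job still pending after timeout")
        time.sleep(interval_s)


# Demo with canned responses instead of real HTTP calls.
# "completed" is an assumed status name used only for illustration.
canned = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "completed", "result": {"page_count": 2, "form_fields": []}},
])
job = wait_for_job(lambda: next(canned), interval_s=0)
print(job["result"]["page_count"])
```

Passing the fetch step in as a callable keeps the loop independent of whichever HTTP client or SDK method actually retrieves the job.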

Status Codes

| Code | Description |
| --- | --- |
| 200 | Detected layout returned synchronously. |
| 202 | Async job accepted (async: true). Poll /job/{jobId} for the result. |
| 400 | Missing PDF or more than one PDF source provided. |
| 401 | Authentication failed or missing API key. |
| 404 | Referenced form_id not found (or belongs to a different org). |
| 500 | Internal server error. |
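One way a client could branch on these codes is sketched below. The action labels and the function name are hypothetical, and treating 5xx responses as retryable is a common convention rather than documented Pulse behavior.

```python
def next_action_for_status(status_code: int) -> str:
    """Map a /form/detect HTTP status code to a coarse client-side action."""
    if status_code == 200:
        return "use_result"   # body is a FormResult
    if status_code == 202:
        return "poll_job"     # body is { job_id, status: "pending" }
    if status_code in (400, 401, 404):
        return "fix_request"  # resending the same request will not help
    if status_code >= 500:
        return "retry_later"  # assumption: server errors may be transient
    return "unexpected"


for code in (200, 202, 400, 401, 404, 500):
    print(code, next_action_for_status(code))
```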

Example Usage

Detect From URL

from pulse import Pulse

client = Pulse(api_key="YOUR_API_KEY")

result = client.form.detect(
    file_url="https://www.irs.gov/pub/irs-pdf/fw9.pdf",
)

print(f"form_id    : {result.form_id}")
print(f"page_count : {result.page_count}")
print(f"# cells    : {len(result.form_fields or [])}")
print(f"credits    : {result.credits_used} (1 x {result.page_count} pages)")

for cell in (result.form_fields or [])[:3]:
    print(f"  [{cell.type}] {cell.bounding_box}  {cell.text!r}")

File Upload

with open("intake-form.pdf", "rb") as f:
    result = client.form.detect(file=f)

Detect, Edit, Then Fill

Detect the cells once, hand-edit any that were misclassified, and pass the edited cells back to /form/fill along with the cached form_id. The fill call reuses the cached layout instead of re-detecting it.
detect = client.form.detect(file_url="https://example.com/contract.pdf")

# Re-tag a cell the detector got wrong
edited = []
for cell in detect.form_fields or []:
    if cell.text and cell.text.strip().lower() == "signature":
        cell.type = "signature"
    edited.append(cell)

fill = client.form.fill(
    form_id=detect.form_id,
    instructions="Sign as Jane Doe, dated 2026-05-01.",
    form_fields=edited,
)

Re-detect On A Stored Form

Pass form_id (instead of file_url / file) to refresh the layout on a PDF already stored by Pulse. Useful after a /form/clear round-trip, or to grab the latest cells if you suspect drift.
fresh = client.form.detect(form_id="00e2c454-4e6f-429b-bd74-320ad94b2153")

Authorizations

x-api-key
string
header
required

API key for authentication

Body

JSON body for POST /form/detect. Provide exactly one of form_id or file_url (or use the multipart variant to upload a file).
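For callers not using the SDK, the JSON variant can be assembled with the Python standard library alone. This sketch builds the request object without sending it; the endpoint URL and header names come from the curl example on this page, while build_detect_request and its parameter names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.runpulse.com/form/detect"


def build_detect_request(api_key, *, file_url=None, form_id=None,
                         page_range=None, run_async=False):
    """Build (but do not send) a JSON POST request for /form/detect."""
    if (file_url is None) == (form_id is None):
        raise ValueError("provide exactly one of file_url or form_id")
    payload = {"async": run_async}
    if file_url is not None:
        payload["file_url"] = file_url
    if form_id is not None:
        payload["form_id"] = form_id
    if page_range is not None:
        payload["page_range"] = page_range
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


req = build_detect_request(
    "YOUR_API_KEY",
    file_url="https://www.irs.gov/pub/irs-pdf/fw9.pdf",
    page_range="1-3,5",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request would then be a matter of passing req to urllib.request.urlopen and decoding the JSON body of the response.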

page_range
string

Restrict the operation to a subset of pages. Accepts comma-separated page numbers and ranges, e.g. "1-3,5". Alias: pages.
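To make the page_range syntax concrete, the sketch below expands a string such as "1-3,5" into explicit 1-based page numbers. It is a client-side illustration only; parse_page_range is a hypothetical helper, and the server's own parsing rules (for example, how it treats whitespace or duplicates) may differ.

```python
def parse_page_range(page_range: str) -> list[int]:
    """Expand a string like "1-3,5" into a sorted list of 1-based page numbers."""
    pages = set()
    for part in page_range.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range segment: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_range("1-3,5"))  # [1, 2, 3, 5]
print(parse_page_range("1,3-5"))  # [1, 3, 4, 5]
```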

async
boolean
default:false

When true, the endpoint returns immediately with { job_id, status: "pending" } (HTTP 202) and processes the job in the background. Poll GET /job/{jobId} for the result.

form_id
string<uuid>

Re-detect cells on a previously stored PDF. Useful when chaining detect calls or refreshing layout after edits.

file_url
string<uri>

Public or presigned URL of a PDF to detect cells on.

Response

Detected layout returned synchronously.

Result body returned by /form/detect, /form/fill, and /form/clear. For async jobs (async: true) the same shape is served back under result on GET /job/{jobId}.

form_id
string<uuid>
required

ID of the form record produced by this run. Pass to a subsequent /form/detect, /form/fill, or /form/clear call as the single input source to iterate without re-uploading the PDF.

page_count
integer
required

Number of pages in the output PDF.

Required range: x >= 1
form_fields
object[]
required

Detected cells of the resulting PDF (refreshed from the filled / cleared output for fill / clear, or freshly detected for /form/detect).

pdf_url
string

URL to download the resulting PDF binary. Always points at GET /results/{jobId}/pdf for the originating job. Requires the same authentication (API key or JWT) as the rest of the API.

fields_filled
integer

Number of cells whose value actually changed during this run. Present on /form/fill responses only.

Required range: x >= 0
fields_cleared
integer

Number of cells whose value actually changed during this run (no-op clears on already-empty fields are not counted). Present on /form/clear responses only.

Required range: x >= 0
credits_used
number<float>

Credits consumed by this request. Detect charges 1 credit per page; fill and clear charge 3 credits per page.
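The per-page rates above make it straightforward to estimate cost before a call. A minimal sketch, assuming every page of the PDF is billed; estimate_credits is a hypothetical helper, and the credits_used value returned by the API remains the authoritative figure.

```python
# Per-page rates as stated in the credits_used description.
CREDITS_PER_PAGE = {"detect": 1, "fill": 3, "clear": 3}


def estimate_credits(operation: str, page_count: int) -> int:
    """Estimate credits for one request: per-page rate times page count."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return CREDITS_PER_PAGE[operation] * page_count


# A 6-page PDF: one detect call followed by one fill call.
print(estimate_credits("detect", 6))                                # 6
print(estimate_credits("detect", 6) + estimate_credits("fill", 6))  # 24
```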

plan_info
object

Cumulative billing snapshot for the calling organization. Includes the in-flight request's contribution, so every response reflects post-request state.