Bounding Boxes

Overview

When extracting content with layout information, the Pulse API returns bounding box coordinates for text, tables, and images. This spatial data lets you locate each element on the page and extract content by region.

Bounding Box Format

Each bounding box is returned as eight normalized values (0-1 range) that give the four corners of a quadrilateral, referred to here as the 8-point format:

[x1, y1, x2, y2, x3, y3, x4, y4]

Where:

(x1, y1) = Top-left corner
(x2, y2) = Top-right corner
(x3, y3) = Bottom-right corner
(x4, y4) = Bottom-left corner

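Because coordinates are normalized to the 0-1 range, they are resolution-independent. To convert to pixels, multiply each x value by the page width and each y value by the page height. The four corners need not form an axis-aligned rectangle: skewed or rotated content yields a tilted quadrilateral.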
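
When a simple enclosing rectangle is enough, the eight values can be reduced to their minimum and maximum extents. The following is a minimal sketch; `quad_to_rect` is a hypothetical helper name, not part of the Pulse API.

```python
def quad_to_rect(bbox):
    """Return (x_min, y_min, x_max, y_max) enclosing an 8-value quadrilateral."""
    if bbox is None or len(bbox) != 8:
        raise ValueError("expected 8 values: [x1, y1, x2, y2, x3, y3, x4, y4]")
    xs = bbox[0::2]  # x1, x2, x3, x4
    ys = bbox[1::2]  # y1, y2, y3, y4
    return (min(xs), min(ys), max(xs), max(ys))


# Slightly tilted text box taken from this page's example response
rect = quad_to_rect([0.0267, 0.0872, 0.0689, 0.0789, 0.0743, 0.0908, 0.0321, 0.0996])
print(rect)  # (0.0267, 0.0789, 0.0743, 0.0996)
```

The result is still normalized, so it can be scaled to pixels in the same way as the original coordinates.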

Response Structure

The bounding_boxes object in the extraction response contains:

{
  "bounding_boxes": {
    "Footer": [],
    "Header": [],
    "Images": [],
    "Tables": [],
    "Text": [],
    "Title": [],
    "Page Number": [],
    "markdown_with_ids": "..."
  }
}

Not all fields will be present in every response. The API only includes arrays for elements that were detected in the document.
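
Since any of these arrays may be absent, client code should not index them directly. A minimal sketch of a tolerant iterator follows; the helper name `iter_elements` and the sample `response` dictionary are illustrative assumptions, and only the key names come from the structure shown above.

```python
ELEMENT_KEYS = ["Title", "Header", "Text", "Tables", "Images", "Footer", "Page Number"]


def iter_elements(response):
    """Yield (category, element) pairs, tolerating missing categories."""
    boxes = response.get("bounding_boxes") or {}
    for key in ELEMENT_KEYS:
        # A missing key or an explicit null both behave like an empty list
        for element in boxes.get(key) or []:
            yield key, element


# Illustrative response in which only two categories were detected
response = {
    "bounding_boxes": {
        "Text": [{"id": "txt-2", "page_number": 1}],
        "Images": [{"id": "img-1", "page_number": 1}],
        "markdown_with_ids": "...",
    }
}

for category, element in iter_elements(response):
    print(category, element["id"])
```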

Markdown Fields

| Field | Location | Description |
| --- | --- | --- |
| `markdown` | Top-level response | Clean markdown content without any ID attributes |
| `markdown_with_ids` | Inside `bounding_boxes` | Markdown with `data-bb-*` ID attributes that link text to bounding box elements |

Use bounding_boxes.markdown_with_ids when you need to correlate text positions with bounding boxes. Use the top-level markdown for clean content display or export.
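
To correlate the two, the IDs first have to be pulled out of `markdown_with_ids`. This page does not show the exact attribute syntax, so the sketch below assumes HTML-style attributes of the form `data-bb-<kind>-id="<id>"`. Both the pattern and the sample string are assumptions and should be checked against a real response.

```python
import re

# Assumed attribute syntax: data-bb-<kind>-id="<id>" (not confirmed by this page)
BB_ID_PATTERN = re.compile(r'data-bb-([a-z]+)-id="([^"]+)"')


def find_bb_ids(markdown_with_ids):
    """Return (kind, id) pairs in the order they appear in the markdown."""
    return BB_ID_PATTERN.findall(markdown_with_ids)


# Hypothetical sample; real output may be shaped differently
sample = (
    '<h1 data-bb-text-id="txt-1">Doctor Prescription</h1>\n'
    '<p data-bb-text-id="txt-2">NCRI</p>'
)
print(find_bb_ids(sample))  # [('text', 'txt-1'), ('text', 'txt-2')]
```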

Example Response

Here’s a real example of the bounding_boxes object:

{
  "Images": [
    {
      "id": "img-1",
      "bounding_box": [0.0101, 0.0174, 0.0924, 0.0174, 0.0924, 0.1072, 0.0101, 0.1072],
      "confidence": "N/A",
      "page_number": 1
    }
  ],
  "Tables": [],
  "Text": [
    {
      "id": "txt-2",
      "content": "0a-NCRI",
      "original_content": "NCRI",
      "bounding_box": [0.0267, 0.0872, 0.0689, 0.0789, 0.0743, 0.0908, 0.0321, 0.0996],
      "page_number": 1,
      "average_word_confidence": 0.973
    }
  ],
  "Title": [
    {
      "id": "txt-1",
      "content": "0a-Doctor Prescription",
      "original_content": "Doctor Prescription",
      "bounding_box": [0.2196, 0.1225, 0.4578, 0.1348, 0.4557, 0.1537, 0.2174, 0.1417],
      "page_number": 1,
      "average_word_confidence": 0.995
    }
  ]
}

Field Descriptions

Text Array

Each text element contains:

id: Unique identifier (e.g., txt-1) that links to markdown_with_ids via data-bb-text-id
content: The extracted text with prefix (e.g., 0a-NCRI)
original_content: The clean extracted text without prefix
bounding_box: 8-point coordinate array (may be null for some document types)
page_number: Page where the text appears
average_word_confidence: OCR confidence score (0-1)
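
One common use of these fields is flagging text for manual review when OCR confidence is low. The sketch below assumes a caller-chosen threshold (0.98 is arbitrary) and reads `original_content` so that the prefix does not appear in the output. `low_confidence_text` is a hypothetical helper.

```python
def low_confidence_text(elements, threshold=0.98):
    """Return (id, original_content, score) for elements scoring below threshold."""
    flagged = []
    for el in elements:
        score = el.get("average_word_confidence")
        # Skip elements whose score is missing or not numeric
        if isinstance(score, (int, float)) and score < threshold:
            flagged.append((el["id"], el.get("original_content", ""), score))
    return flagged


# Values taken from this page's example response
elements = [
    {"id": "txt-2", "content": "0a-NCRI", "original_content": "NCRI",
     "average_word_confidence": 0.973},
    {"id": "txt-1", "content": "0a-Doctor Prescription",
     "original_content": "Doctor Prescription", "average_word_confidence": 0.995},
]
print(low_confidence_text(elements))  # [('txt-2', 'NCRI', 0.973)]
```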

Title Array

Each title element contains:

id: Unique identifier linking to markdown
content: The title text with prefix
original_content: The clean title text
bounding_box: 8-point coordinate array
page_number: Page where the title appears
average_word_confidence: OCR confidence score (0-1)

Header Array

Each header element contains:

id: Unique identifier linking to markdown
content: The header text with prefix
original_content: The clean header text
bounding_box: 8-point coordinate array
page_number: Page where the header appears
average_word_confidence: OCR confidence score (0-1)

Footer Array

Each footer element contains:

id: Unique identifier linking to markdown
content: The footer text with prefix
original_content: The clean footer text
bounding_box: 8-point coordinate array
page_number: Page where the footer appears
average_word_confidence: OCR confidence score (0-1)

Images Array

Each image element contains:

id: Unique identifier (e.g., img-1)
bounding_box: 8-point coordinate array
page_number: Page where the image appears
confidence: Detection confidence, if available (may be the string "N/A", as in the example response)

Tables Array

Each table element contains:

id: Unique identifier (e.g., tbl-1)
bounding_box: 8-point coordinate array
page_number: Page where the table appears
content: Table content (in HTML format)

Page Number Array

Each page number element contains:

id: Unique identifier
content: The page number text
original_content: The clean page number text
bounding_box: 8-point coordinate array
page_number: Page where it appears
average_word_confidence: OCR confidence score (0-1)

Tables are extracted and returned in HTML format, which preserves their row and column structure and makes them easy to parse or display.
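
As an illustration of parsing that HTML, the sketch below turns a table into a list of rows using only the Python standard library. The sample markup is hypothetical, and the parser deliberately ignores nested tables and `colspan`/`rowspan`, which real table content may contain.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of <td>/<th> cells into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_rows(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical table content
sample = ("<table><tr><th>Drug</th><th>Dose</th></tr>"
          "<tr><td>Aspirin</td><td>75 mg</td></tr></table>")
print(table_to_rows(sample))  # [['Drug', 'Dose'], ['Aspirin', '75 mg']]
```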

The id field allows you to link bounding box elements to specific locations in the markdown_with_ids field via data-bb-text-id attributes.
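
Resolving an ID found in the markdown back to its element is easiest with a lookup table. The sketch below assumes IDs are unique across categories, which the example response suggests (`txt-1` is a Title, `txt-2` is Text) but this page does not state outright. `index_by_id` is a hypothetical helper.

```python
def index_by_id(bounding_boxes):
    """Map element id -> (category, element) across all element arrays."""
    index = {}
    for category, value in bounding_boxes.items():
        if not isinstance(value, list):
            continue  # skips markdown_with_ids, which is a string
        for element in value:
            index[element["id"]] = (category, element)
    return index


# Trimmed version of this page's example response
bounding_boxes = {
    "Images": [{"id": "img-1", "page_number": 1}],
    "Tables": [],
    "Text": [{"id": "txt-2", "original_content": "NCRI", "page_number": 1}],
    "Title": [{"id": "txt-1", "original_content": "Doctor Prescription", "page_number": 1}],
    "markdown_with_ids": "...",
}

index = index_by_id(bounding_boxes)
category, element = index["txt-1"]
print(category, element["original_content"])  # Title Doctor Prescription
```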

Converting Coordinates

To convert normalized coordinates to pixel coordinates:

def normalize_to_pixels(bbox, page_width, page_height):
    """Convert normalized bounding box to pixel coordinates."""
    return [
        bbox[0] * page_width,   # x1
        bbox[1] * page_height,  # y1
        bbox[2] * page_width,   # x2
        bbox[3] * page_height,  # y2
        bbox[4] * page_width,   # x3
        bbox[5] * page_height,  # y3
        bbox[6] * page_width,   # x4
        bbox[7] * page_height   # y4
    ]

# Example: Convert for a standard letter-size page at 72 DPI
page_width = 612  # 8.5 inches * 72 DPI
page_height = 792  # 11 inches * 72 DPI

normalized_bbox = [0.1, 0.1, 0.3, 0.1, 0.3, 0.15, 0.1, 0.15]
pixel_bbox = normalize_to_pixels(normalized_bbox, page_width, page_height)
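
For cropping, an integer rectangle is usually more convenient than eight floating-point values. The sketch below also covers the case where `bounding_box` is null. `element_pixel_rect` is a hypothetical helper, and the page dimensions are assumed to be known to the caller.

```python
import math


def element_pixel_rect(element, page_width, page_height):
    """Return an integer (left, top, right, bottom) rectangle, or None."""
    bbox = element.get("bounding_box")
    if not bbox:
        return None  # bounding_box may be null for some document types
    xs = [x * page_width for x in bbox[0::2]]
    ys = [y * page_height for y in bbox[1::2]]
    # Round outward so the rectangle fully contains the quadrilateral
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


# Image element from this page's example response, on a 612 x 792 page
image = {
    "id": "img-1",
    "bounding_box": [0.0101, 0.0174, 0.0924, 0.0174, 0.0924, 0.1072, 0.0101, 0.1072],
    "page_number": 1,
}
print(element_pixel_rect(image, 612, 792))  # (6, 13, 57, 85)
```

The tuple is ordered left, top, right, bottom, so it can be passed to most image-cropping routines after rendering the page at the same dimensions.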

Next Steps

Extract Endpoint

Structured Output
