Overview
When extracting content with layout information, Pulse API returns bounding box coordinates for text, tables, and images. This spatial data enables precise document understanding and region-based extraction.
Bounding Box Format
Bounding boxes are returned as normalized coordinates (0-1 range) in an 8-point format:
- (x1, y1) = Top-left corner
- (x2, y2) = Top-right corner
- (x3, y3) = Bottom-right corner
- (x4, y4) = Bottom-left corner
Because the coordinates are normalized, they are resolution-independent. To convert to pixels, multiply each x value by the page width and each y value by the page height.
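The pixel conversion described above can be sketched as follows. This is a minimal illustration, not part of the Pulse API: the helper name `to_pixels` is hypothetical, and it assumes the bounding box arrives as a flat list of eight numbers in the order x1, y1, x2, y2, x3, y3, x4, y4.

```python
from typing import List, Sequence, Tuple


def to_pixels(
    bounding_box: Sequence[float], page_width: float, page_height: float
) -> List[Tuple[float, float]]:
    """Convert a normalized 8-value box into four (x, y) pixel corners.

    Assumes the flat order x1, y1, x2, y2, x3, y3, x4, y4 (an assumption
    about the array layout, not confirmed by the documentation).
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 values: x1, y1, ..., x4, y4")
    corners = []
    for i in range(0, 8, 2):
        x = bounding_box[i] * page_width       # x values scale with page width
        y = bounding_box[i + 1] * page_height  # y values scale with page height
        corners.append((x, y))
    return corners
```

The corners come back in the same order as the list above: top-left, top-right, bottom-right, bottom-left.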
Response Structure
The extraction response includes arrays for text, titles, headers, footers, images, tables, and page numbers.
Not all fields will be present in every response. The API only includes arrays for elements that were detected in the document.
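Because arrays may be absent, client code should not index into the response directly. The sketch below shows one defensive way to read them. The helper names are hypothetical, and the key names are supplied by the caller because the exact response keys are not listed here.

```python
from typing import Any, Dict, Iterable, List


def elements(response: Dict[str, Any], key: str) -> List[Dict[str, Any]]:
    """Return the array stored under `key`, or an empty list if it is absent."""
    return response.get(key) or []


def count_by_type(response: Dict[str, Any], keys: Iterable[str]) -> Dict[str, int]:
    """Count detected elements per array, treating missing arrays as empty."""
    return {key: len(elements(response, key)) for key in keys}
```

For example, with hypothetical keys, `count_by_type(response, ["text", "tables"])` reports a count of 0 for any array the API left out instead of raising a `KeyError`.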
Example Response
Here’s a real example of bounding box data:
Field Descriptions
Text Array
Each text element contains:
- content: The extracted text
- bounding_box: 8-point coordinate array
- page_number: Page where the text appears
- average_word_confidence: OCR confidence score (0-1)
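The fields above are enough to filter text by OCR confidence and page. The following is a minimal sketch: the function name `confident_text` and the default threshold of 0.9 are illustrative choices, not values recommended by the API.

```python
from typing import Any, Dict, List, Optional


def confident_text(
    text_elements: List[Dict[str, Any]],
    min_confidence: float = 0.9,  # illustrative threshold, tune per use case
    page_number: Optional[int] = None,
) -> List[str]:
    """Return the content of text elements that meet the confidence threshold.

    If `page_number` is given, only elements on that page are kept.
    Elements without a confidence score are treated as confidence 0.
    """
    kept = []
    for element in text_elements:
        if element.get("average_word_confidence", 0.0) < min_confidence:
            continue
        if page_number is not None and element.get("page_number") != page_number:
            continue
        kept.append(element["content"])
    return kept
```

Treating a missing score as 0 is a conservative choice; a caller who prefers to keep unscored elements could change the default passed to `get`.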
Title Array
Each title element contains:
- content: The title text
- bounding_box: 8-point coordinate array
- page_number: Page where the title appears
- average_word_confidence: OCR confidence score (0-1)
Header Array
Each header element contains:
- content: The header text
- bounding_box: 8-point coordinate array
- page_number: Page where the header appears
- average_word_confidence: OCR confidence score (0-1)
Footer Array
Each footer element contains:
- content: The footer text
- bounding_box: 8-point coordinate array
- page_number: Page where the footer appears
- average_word_confidence: OCR confidence score (0-1)
Images Array
Each image element contains:
- bounding_box: 8-point coordinate array
- page_number: Page where the image appears
- confidence: Detection confidence (if available)
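A common use of an image's bounding box is cropping that region out of a rendered page. The sketch below computes an axis-aligned pixel rectangle from the four corners. The helper name `crop_rect` is hypothetical, and it assumes the bounding box is a flat list of eight normalized values (x1, y1, ..., x4, y4).

```python
import math
from typing import Sequence, Tuple


def crop_rect(
    bounding_box: Sequence[float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in whole pixels enclosing the box.

    Takes the min and max over all four corners, so a slightly rotated
    box is still fully enclosed. Assumes a flat 8-value array.
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 values: x1, y1, ..., x4, y4")
    xs = [bounding_box[i] * page_width for i in range(0, 8, 2)]
    ys = [bounding_box[i] * page_height for i in range(1, 8, 2)]
    # Round outward so the rectangle never clips the detected region.
    return (
        math.floor(min(xs)),
        math.floor(min(ys)),
        math.ceil(max(xs)),
        math.ceil(max(ys)),
    )
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so it can be passed to that method if the page has been rendered as a Pillow image.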
Tables Array
Each table element contains:
- bounding_box: 8-point coordinate array
- page_number: Page where the table appears
- content: Table content (in HTML format)
Page Number Array
Each page number element contains:
- content: The page number text
- bounding_box: 8-point coordinate array
- page_number: Page where it appears
- average_word_confidence: OCR confidence score (0-1)
Tables are extracted and returned in HTML format, preserving the structure and making it easy to parse or display.
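As an illustration of parsing the HTML table content, the sketch below turns a table string into a list of rows using only the Python standard library. The class and function names are hypothetical, and the sketch assumes simple tables built from `tr`, `th`, and `td` tags without nested tables; merged cells (`rowspan`/`colspan`) are not expanded.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TableRows(HTMLParser):
    """Collect the text of each cell, grouped by table row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._cell: Optional[List[str]] = None  # fragments of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:  # tolerate a cell that appears outside any row
                self.rows.append([])
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def table_to_rows(html: str) -> List[List[str]]:
    """Parse a table's HTML content into a list of rows of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

For richer tables, a dedicated HTML library may be a better fit; the standard-library parser is used here only to keep the example dependency-free.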