
# Bounding Boxes

> Understanding layout information from document extraction

## Overview

When you extract content with layout information, the Pulse API returns bounding box coordinates for text, tables, and images. This spatial data lets you locate each element on its page, which supports tasks such as highlighting and region-based extraction.

## Bounding Box Format

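Bounding boxes are returned as normalized coordinates (0-1 range) in an 8-point format: eight values that list the four corners of the region as (x, y) pairs, clockwise from the top-left corner: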

```
[x1, y1, x2, y2, x3, y3, x4, y4]
```

Where:

* **(x1, y1)** = Top-left corner
* **(x2, y2)** = Top-right corner
* **(x3, y3)** = Bottom-right corner
* **(x4, y4)** = Bottom-left corner

<Note>
  Coordinates are normalized to the 0-1 range, which makes them resolution-independent. To convert to pixels, multiply each x value by the page width and each y value by the page height.
</Note>
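Because the four corners are listed individually, a box does not have to be axis-aligned; text on a scanned, slightly rotated page can produce a tilted quadrilateral. The sketch below shows one way to split the flat array into corner points and to compute the smallest upright rectangle that contains them. The helper names `corners` and `envelope` are illustrative and not part of the Pulse API.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def corners(bbox: Sequence[float]) -> List[Point]:
    """Split the flat 8-value array into four (x, y) corner points."""
    if len(bbox) != 8:
        raise ValueError(f"expected 8 values, got {len(bbox)}")
    return [(bbox[i], bbox[i + 1]) for i in range(0, 8, 2)]


def envelope(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned rectangle (left, top, right, bottom) around the box."""
    xs = bbox[0::2]  # x1, x2, x3, x4
    ys = bbox[1::2]  # y1, y2, y3, y4
    return min(xs), min(ys), max(xs), max(ys)


# A slightly tilted box: the top edge is not horizontal (y1 != y2).
tilted = [0.0267, 0.0872, 0.0689, 0.0789, 0.0743, 0.0908, 0.0321, 0.0996]
top_left, top_right, bottom_right, bottom_left = corners(tilted)
left, top, right, bottom = envelope(tilted)
```

The envelope is convenient for cropping and hit-testing, while the four corners preserve the tilt if you need to draw the exact outline.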

## Response Structure

The `bounding_boxes` object in the extraction response contains:

```json theme={null}
{
  "bounding_boxes": {
    "Footer": [],
    "Header": [],
    "Images": [],
    "Tables": [],
    "Text": [],
    "Title": [],
    "Page Number": [],
    "markdown_with_ids": "..."
  }
}
```

<Note>
  Not all fields will be present in every response. The API only includes arrays for elements that were detected in the document.
</Note>
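Since any category may be missing, client code should not index the object directly. A minimal sketch of a tolerant iterator follows; the `CATEGORIES` tuple and the `iter_elements` helper are illustrative names, and the category keys are taken from the structure shown above.

```python
from typing import Any, Dict, Iterator, Tuple

# Category keys as listed in the response structure above.
CATEGORIES = ("Title", "Header", "Text", "Tables", "Images", "Footer", "Page Number")


def iter_elements(bounding_boxes: Dict[str, Any]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (category, element) pairs, skipping categories that are absent."""
    for category in CATEGORIES:
        # .get() tolerates a missing key; "or []" tolerates an explicit null.
        for element in bounding_boxes.get(category) or []:
            yield category, element


# Sample with only two categories present.
sample = {
    "Text": [{"id": "txt-2", "page_number": 1}],
    "Title": [{"id": "txt-1", "page_number": 1}],
    "markdown_with_ids": "...",
}
found = [(category, element["id"]) for category, element in iter_elements(sample)]
```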

### Markdown Fields

| Field               | Location                | Description                                                                     |
| ------------------- | ----------------------- | ------------------------------------------------------------------------------- |
| `markdown`          | Top-level response      | Clean markdown content without any ID attributes                                |
| `markdown_with_ids` | Inside `bounding_boxes` | Markdown with `data-bb-*` ID attributes that link text to bounding box elements |

Use `bounding_boxes.markdown_with_ids` when you need to correlate text positions with bounding boxes. Use the top-level `markdown` for clean content display or export.
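As a rough illustration of that correlation, the sketch below collects the text IDs referenced in `markdown_with_ids` in document order. It assumes the IDs appear as quoted attribute values such as `data-bb-text-id="txt-1"`; the surrounding markup in the sample string is hypothetical, so adjust the pattern to match what your responses actually contain.

```python
import re
from typing import List

# Assumption: IDs appear as quoted attribute values, e.g. data-bb-text-id="txt-1".
TEXT_ID_PATTERN = re.compile(r'data-bb-text-id=["\']([^"\']+)["\']')


def referenced_text_ids(markdown_with_ids: str) -> List[str]:
    """Return the referenced text IDs in document order, without duplicates."""
    seen = set()
    ordered = []
    for match in TEXT_ID_PATTERN.finditer(markdown_with_ids):
        text_id = match.group(1)
        if text_id not in seen:
            seen.add(text_id)
            ordered.append(text_id)
    return ordered


# Hypothetical snippet; the real wrapper markup may differ.
snippet = (
    '<h1 data-bb-text-id="txt-1">Doctor Prescription</h1>\n'
    '<p data-bb-text-id="txt-2">NCRI</p>'
)
ids = referenced_text_ids(snippet)
```

Each returned ID can then be looked up in the `Text`, `Title`, `Header`, or `Footer` arrays to find the matching bounding box.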

## Example Response

Here's a real example of the `bounding_boxes` object:

```json theme={null}
{
  "Images": [
    {
      "id": "img-1",
      "bounding_box": [0.0101, 0.0174, 0.0924, 0.0174, 0.0924, 0.1072, 0.0101, 0.1072],
      "confidence": "N/A",
      "page_number": 1
    }
  ],
  "Tables": [],
  "Text": [
    {
      "id": "txt-2",
      "content": "0a-NCRI",
      "original_content": "NCRI",
      "bounding_box": [0.0267, 0.0872, 0.0689, 0.0789, 0.0743, 0.0908, 0.0321, 0.0996],
      "page_number": 1,
      "average_word_confidence": 0.973
    }
  ],
  "Title": [
    {
      "id": "txt-1",
      "content": "0a-Doctor Prescription",
      "original_content": "Doctor Prescription",
      "bounding_box": [0.2196, 0.1225, 0.4578, 0.1348, 0.4557, 0.1537, 0.2174, 0.1417],
      "page_number": 1,
      "average_word_confidence": 0.995
    }
  ]
}
```

## Field Descriptions

### Text Array

Each text element contains:

* `id`: Unique identifier (e.g., `txt-1`) that links to `markdown_with_ids` via `data-bb-text-id`
* `content`: The extracted text with prefix (e.g., `0a-NCRI`)
* `original_content`: The clean extracted text without prefix
* `bounding_box`: 8-point coordinate array (may be `null` for some document types)
* `page_number`: Page where the text appears
* `average_word_confidence`: OCR confidence score (0-1)
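One common use of `average_word_confidence` is to flag text for manual review. The sketch below is a minimal example under stated assumptions: the `0.9` threshold is arbitrary, the `low_confidence` helper is an illustrative name, and the second sample element is made up for demonstration.

```python
from typing import Any, Dict, List


def low_confidence(elements: List[Dict[str, Any]], threshold: float = 0.9) -> List[Dict[str, Any]]:
    """Return the elements whose OCR confidence is below the threshold."""
    flagged = []
    for element in elements:
        score = element.get("average_word_confidence")
        # Skip elements with a missing or non-numeric score instead of failing.
        if isinstance(score, (int, float)) and score < threshold:
            flagged.append(element)
    return flagged


text_elements = [
    {"id": "txt-2", "original_content": "NCRI", "average_word_confidence": 0.973},
    # Hypothetical low-confidence element for illustration.
    {"id": "txt-3", "original_content": "Rx 5O0mg", "average_word_confidence": 0.62},
]
needs_review = [element["id"] for element in low_confidence(text_elements)]
```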

### Title Array

Each title element contains:

* `id`: Unique identifier linking to markdown
* `content`: The title text with prefix
* `original_content`: The clean title text
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where the title appears
* `average_word_confidence`: OCR confidence score (0-1)

### Header Array

Each header element contains:

* `id`: Unique identifier linking to markdown
* `content`: The header text with prefix
* `original_content`: The clean header text
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where the header appears
* `average_word_confidence`: OCR confidence score (0-1)

### Footer Array

Each footer element contains:

* `id`: Unique identifier linking to markdown
* `content`: The footer text with prefix
* `original_content`: The clean footer text
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where the footer appears
* `average_word_confidence`: OCR confidence score (0-1)

### Images Array

Each image element contains:

* `id`: Unique identifier (e.g., `img-1`)
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where the image appears
* `confidence`: Detection confidence, if available (the example response above shows the string `"N/A"` when it is not)

### Tables Array

Each table element contains:

* `id`: Unique identifier (e.g., `tbl-1`)
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where the table appears
* `content`: Table content (in HTML format)

### Page Number Array

Each page number element contains:

* `id`: Unique identifier
* `content`: The page number text
* `original_content`: The clean page number text
* `bounding_box`: 8-point coordinate array
* `page_number`: Page where it appears
* `average_word_confidence`: OCR confidence score (0-1)

<Note>
  Tables are extracted and returned in HTML format, preserving the structure and making it easy to parse or display.
</Note>
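If you only need the cell text, the standard library is enough to read a table's `content`. The sketch below assumes the HTML uses ordinary `<tr>`, `<th>`, and `<td>` tags and does not handle nested tables or merged cells; the `TableRows` class, the `table_to_rows` helper, and the sample table are illustrative.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._cell: Optional[List[str]] = None  # text pieces of the open cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:  # cell without an enclosing <tr>
                self.rows.append([])
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_to_rows(html: str) -> List[List[str]]:
    """Parse table HTML into a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical table content for illustration.
table_html = (
    "<table><tr><th>Drug</th><th>Dose</th></tr>"
    "<tr><td>Aspirin</td><td>100 mg</td></tr></table>"
)
rows = table_to_rows(table_html)
```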

<Note>
  The `id` field allows you to link bounding box elements to specific locations in the `markdown_with_ids` field via `data-bb-text-id` attributes.
</Note>
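Looking up elements by `id` is easier with a single index across all categories. The following sketch builds one; `index_by_id` and `locate` are illustrative helper names, and the sample data is abbreviated from the example response. It also accounts for a `bounding_box` that is `null`.

```python
from typing import Any, Dict, List, Optional, Tuple


def index_by_id(bounding_boxes: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map each element id to its element, annotated with its category."""
    index: Dict[str, Dict[str, Any]] = {}
    for category, value in bounding_boxes.items():
        if not isinstance(value, list):
            continue  # skips non-array fields such as markdown_with_ids
        for element in value:
            element_id = element.get("id")
            if element_id is not None:
                index[element_id] = {"category": category, **element}
    return index


def locate(index: Dict[str, Dict[str, Any]], element_id: str) -> Optional[Tuple[int, List[float]]]:
    """Return (page_number, bounding_box), or None if unknown or without a box."""
    element = index.get(element_id)
    if element is None or element.get("bounding_box") is None:
        return None
    return element["page_number"], element["bounding_box"]


sample = {
    "Images": [{"id": "img-1", "page_number": 1,
                "bounding_box": [0.0101, 0.0174, 0.0924, 0.0174, 0.0924, 0.1072, 0.0101, 0.1072]}],
    "Text": [{"id": "txt-2", "page_number": 1, "bounding_box": None}],
    "markdown_with_ids": "...",
}
index = index_by_id(sample)
image_location = locate(index, "img-1")
text_location = locate(index, "txt-2")  # None: this element has no box
```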

## Footnote References

When you enable `extensions.footnote_references` in your extract request, the response includes an `extensions.footnoteReferences` array that uses bounding box IDs to link footnote markers to their in-text references.

Each entry contains:

* `symbol`: The footnote marker (e.g., `*`, `†`, `‡`, `1`)
* `footnoteTextId`: The `id` of the footnote explanation, typically found in the `Footer` array
* `referenceTextIds`: An array of `id` values from the `Text`, `Title`, or `Header` arrays identifying body paragraphs that contain the marker

```json theme={null}
{
  "extensions": {
    "footnoteReferences": [
      {
        "symbol": "*",
        "footnoteTextId": "txt-42",
        "referenceTextIds": ["txt-5", "txt-12"]
      }
    ]
  }
}
```

Use `footnoteTextId` to look up the footnote's position and content in `bounding_boxes.Footer` (or `bounding_boxes.Text`), and each entry in `referenceTextIds` to locate the citing paragraphs in `bounding_boxes.Text`, `bounding_boxes.Title`, or `bounding_boxes.Header`. This allows you to spatially highlight both the footnote and every place in the document that references it.

<Note>
  Footnote references are only available for PDF documents. See the [Extract endpoint](/api-reference/endpoint/extract#footnote-references) for usage examples.
</Note>
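The lookup described above can be sketched as follows. This example assumes that `bounding_boxes` and `extensions` are both top-level keys of the same response object; `resolve_footnotes` is an illustrative helper, and the element contents in the sample are hypothetical.

```python
from typing import Any, Dict, List


def resolve_footnotes(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Attach the footnote element and its citing elements to each reference."""
    # Index every element in every category by id.
    index: Dict[str, Dict[str, Any]] = {}
    for value in response.get("bounding_boxes", {}).values():
        if isinstance(value, list):
            for element in value:
                if "id" in element:
                    index[element["id"]] = element

    resolved = []
    for ref in response.get("extensions", {}).get("footnoteReferences", []):
        resolved.append({
            "symbol": ref["symbol"],
            # None if the footnote id is not present in any array.
            "footnote": index.get(ref["footnoteTextId"]),
            "references": [index[i] for i in ref.get("referenceTextIds", []) if i in index],
        })
    return resolved


# Hypothetical response fragment for illustration.
response = {
    "bounding_boxes": {
        "Footer": [{"id": "txt-42", "original_content": "* Excludes tax", "page_number": 1}],
        "Text": [
            {"id": "txt-5", "original_content": "Total*", "page_number": 1},
            {"id": "txt-12", "original_content": "Subtotal*", "page_number": 1},
        ],
    },
    "extensions": {
        "footnoteReferences": [
            {"symbol": "*", "footnoteTextId": "txt-42", "referenceTextIds": ["txt-5", "txt-12"]}
        ]
    },
}
footnotes = resolve_footnotes(response)
```

Each resolved entry carries the elements themselves, so their `bounding_box` and `page_number` fields are available for highlighting.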

## Converting Coordinates

To convert normalized coordinates to pixel coordinates:

```python theme={null}
def normalize_to_pixels(bbox, page_width, page_height):
    """Convert normalized bounding box to pixel coordinates."""
    return [
        bbox[0] * page_width,   # x1
        bbox[1] * page_height,  # y1
        bbox[2] * page_width,   # x2
        bbox[3] * page_height,  # y2
        bbox[4] * page_width,   # x3
        bbox[5] * page_height,  # y3
        bbox[6] * page_width,   # x4
        bbox[7] * page_height   # y4
    ]

# Example: Convert for a standard letter-size page at 72 DPI
page_width = 612  # 8.5 inches * 72 DPI
page_height = 792  # 11 inches * 72 DPI

normalized_bbox = [0.1, 0.1, 0.3, 0.1, 0.3, 0.15, 0.1, 0.15]
pixel_bbox = normalize_to_pixels(normalized_bbox, page_width, page_height)
```
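For cropping, image libraries usually expect an upright integer rectangle instead of four floating-point corners. The sketch below derives one from a normalized box, rounding outward so the region is never clipped and clamping to the page bounds. It assumes the origin is the top-left of the page with y increasing downward; `pixel_crop_box` is an illustrative helper.

```python
import math
from typing import Sequence, Tuple


def pixel_crop_box(
    bbox: Sequence[float], page_width: int, page_height: int
) -> Tuple[int, int, int, int]:
    """Return an integer (left, top, right, bottom) rectangle covering the box."""
    xs = [x * page_width for x in bbox[0::2]]
    ys = [y * page_height for y in bbox[1::2]]
    # Floor the minimum and ceil the maximum so the crop fully contains the box,
    # then clamp to the page in case of values slightly outside the 0-1 range.
    left = max(0, math.floor(min(xs)))
    top = max(0, math.floor(min(ys)))
    right = min(page_width, math.ceil(max(xs)))
    bottom = min(page_height, math.ceil(max(ys)))
    return left, top, right, bottom


crop_box = pixel_crop_box([0.1, 0.1, 0.3, 0.1, 0.3, 0.15, 0.1, 0.15], 612, 792)
```

The tuple order matches the `(left, upper, right, lower)` box that Pillow's `Image.crop` accepts, provided the page was rendered at the same pixel dimensions you pass in.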

## Next Steps

<CardGroup cols={2}>
  <Card title="Extract Endpoint" icon="code" href="/api-reference/endpoint/extract">
    Enable bounding box extraction
  </Card>

  <Card title="Structured Output" icon="table" href="/api-reference/structured-output-guidelines">
    Combine with structured data
  </Card>
</CardGroup>
