> ## Documentation Index
> Fetch the complete documentation index at: https://docs.runpulse.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Download Large Result

> Download the full result for a large extraction. When `/extract`
or `GET /job/{jobId}` returns `is_url: true`, fetch the complete
result from the URL provided.  The URL is single-use: after a
successful download the resource is deleted and subsequent
requests return 410 Gone.

For form jobs (`/form/detect`, `/form/fill`, `/form/clear`)
you don't need this endpoint at all — `GET /job/{jobId}`
already returns the full `FormResult` inline under `result`,
and the `pdf_url` field points at
[GET /results/{jobId}/pdf](api:GET/results/{jobId}/pdf) for the
binary.

## Overview

Large async results may be returned as a URL instead of an inline response body. Use this endpoint to download the full completed result for a job when the job response indicates a large-result URL.

<Note>
  Large-result links are intended for completed jobs. Poll `GET /job/{jobId}` first, then download the result URL if the job response says the result is URL-backed.
</Note>

## Typical Flow

```mermaid theme={null}
flowchart LR
    A["POST /extract async=true"] --> B["job_id"]
    B --> C["GET /job/{jobId}"]
    C --> D{Large result?}
    D -->|Yes| E["GET /large_results/{jobId}"]
    D -->|No| F[Use inline result]
```

## Related

<CardGroup cols={2}>
  <Card title="Working With Large Documents" icon="file-circle-info" href="/api-reference/large-documents">
    Plan long-running and large-output workflows.
  </Card>

  <Card title="Poll Job" icon="clock" href="/api-reference/endpoint/poll">
    Check async job status.
  </Card>
</CardGroup>


## OpenAPI

````yaml GET /large_results/{jobId}
openapi: 3.1.0
info:
  title: Pulse API Structure
  version: 0.1.0
  description: >-
    Canonical contract for the Pulse extraction APIs. This specification is the
    single source of truth for shared request/response models that client and
    server packages consume.
servers:
  - url: https://api.runpulse.com
    description: Default Pulse API base URL
security:
  - ApiKey: []
paths:
  /large_results/{jobId}:
    get:
      tags:
        - LargeResults
      summary: Download large extraction result
      description: |-
        Download the full result for a large extraction. When `/extract`
        or `GET /job/{jobId}` returns `is_url: true`, fetch the complete
        result from the URL provided.  The URL is single-use: after a
        successful download the resource is deleted and subsequent
        requests return 410 Gone.

        For form jobs (`/form/detect`, `/form/fill`, `/form/clear`)
        you don't need this endpoint at all — `GET /job/{jobId}`
        already returns the full `FormResult` inline under `result`,
        and the `pdf_url` field points at
        [GET /results/{jobId}/pdf](api:GET/results/{jobId}/pdf) for the
        binary.
      operationId: getLargeResult
      parameters:
        - name: jobId
          in: path
          required: true
          description: Job identifier from the extraction response.
          schema:
            type: string
      responses:
        '200':
          description: Full extraction result (streamed JSON)
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractResultCore'
        '404':
          description: Job not found
        '410':
          description: Result already downloaded or expired
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: object
                    properties:
                      code:
                        type: string
                        example: GONE
                      message:
                        type: string
                        example: Result already fetched or expired
        '500':
          description: Internal server error
components:
  schemas:
    ExtractResultCore:
      type: object
      description: >-
        Core extraction result fields shared by the synchronous `/extract`
        endpoint and the pipeline extract step.
      properties:
        markdown:
          type: string
          description: >-
            Primary markdown content extracted from the document. Always present
            in the new format.
        extensions:
          type: object
          description: >-
            Output from enabled extensions. Each key corresponds to an extension
            that was enabled in the request under `extensions.*`. Only keys for
            enabled extensions are present.
          properties:
            chunking:
              type: object
              description: >-
                Chunk results by strategy. Present when `extensions.chunking`
                was provided in the request.
              properties:
                semantic:
                  type: array
                  items:
                    type: string
                  description: Semantically-segmented chunks.
                header:
                  type: array
                  items:
                    type: string
                  description: Chunks split by document headers/headings.
                page:
                  type: array
                  items:
                    type: string
                  description: One chunk per page.
                recursive:
                  type: array
                  items:
                    type: string
                  description: Recursively-split chunks respecting size limits.
            footnoteReferences:
              type: array
              x-fern-property-name: footnote_references
              description: >-
                List of detected footnotes with their in-text references.
                Present when `extensions.footnoteReferences` was enabled. Each
                item links a footnote paragraph to the body-text paragraphs that
                reference it, using bounding-box text IDs.
              items:
                type: object
                properties:
                  symbol:
                    type: string
                    description: The footnote marker symbol (e.g. "*", "†", "1", "#").
                  footnoteTextId:
                    type: string
                    description: >-
                      The bounding-box text ID (e.g. "txt-15") of the footnote
                      explanation paragraph.
                  referenceTextIds:
                    type: array
                    description: >-
                      Bounding-box text IDs of body-text paragraphs that contain
                      a reference to this footnote marker.
                    items:
                      type: string
            altOutputs:
              type: object
              x-fern-property-name: alt_outputs
              description: >-
                Alternate output formats. Each key corresponds to an enabled alt
                output.
              properties:
                wlbb:
                  type: object
                  description: >-
                    Word-level bounding box data. Present when
                    `extensions.altOutputs.wlbb` was enabled and input is a PDF.
                  properties:
                    words:
                      type: array
                      description: List of detected words with their positions.
                      items:
                        type: object
                        properties:
                          id:
                            type: string
                            description: >-
                              Unique identifier for the word (e.g. "w-1", "w-2",
                              …).
                          text:
                            type: string
                            description: The recognised word text.
                          page_number:
                            type: integer
                            minimum: 1
                            description: 1-indexed page number in the original document.
                          bounding_box:
                            type: array
                            description: >-
                              Flat 4-corner polygon: [x1,y1, x2,y2, x3,y3,
                              x4,y4]. All coordinates normalised to 0–1 range.
                            items:
                              type: number
                            minItems: 8
                            maxItems: 8
                          average_word_confidence:
                            type: number
                            description: Recognition confidence score (0–1).
                    error:
                      type: string
                      description: Error message if word-level extraction failed.
                html:
                  type: string
                  description: >-
                    HTML representation of the document. Present when
                    `extensions.altOutputs.returnHtml` was enabled.
                xml:
                  type: string
                  description: >-
                    XML representation of the document. Present when
                    `extensions.altOutputs.returnXml` was enabled. (WIP)
        bounding_boxes:
          allOf:
            - $ref: '#/components/schemas/BoundingBoxes'
          description: >-
            Positional bounding-box data for text, titles, headers, footers,
            images, and tables. `Images` carries chart/image visuals (with
            `image_url` when `figure_processing.show_images` is enabled),
            `Tables` the detected tables, and `Text`/`Title`/`Footer` the
            paragraph/title/footer regions. Additional keys (e.g.
            `markdown_with_ids`, `defined_names`) round-trip without being
            typed.
        extraction_id:
          type: string
          format: uuid
          description: >-
            Persisted extraction ID. Present when storage is enabled (default).
            Use this ID with `/split` and `/schema` endpoints.
        extraction_url:
          type: string
          description: >-
            URL to view the extraction on the Pulse platform. Present when
            storage is enabled.
        page_count:
          type: integer
          minimum: 1
          description: Number of pages processed.
        plan_info:
          allOf:
            - $ref: '#/components/schemas/PlanInfo'
          description: >-
            Billing tier and cumulative usage information. Includes
            `total_credits_used` (primary billing metric) and `pages_used`
            (legacy compatibility).
        warnings:
          type: array
          items:
            type: string
          description: >-
            Non-fatal warnings generated during extraction. Includes deprecation
            notices when legacy input parameters are used, as well as processing
            warnings (e.g. word-level bounding box limitations).
        credits_used:
          type: number
          format: float
          nullable: true
          description: >-
            Number of credits consumed by this request. Only present when the
            organization has the credit billing system enabled.
    BoundingBoxes:
      type: object
      description: >-
        Positional bounding-box data for text, titles, headers, footers, images,
        and tables. Used by the frontend for annotation overlays and by SDK
        consumers to access detected visuals (charts and images) returned under
        `Images`. Keys not listed here are accepted for forward compatibility
        (e.g. `markdown_with_ids`, `defined_names`).
      properties:
        Images:
          type: array
          items:
            $ref: '#/components/schemas/BoundingBoxImage'
          description: >-
            Detected or embedded visuals. `image_url` is populated when
            `figure_processing.show_images` was enabled and the entry's
            `visual_type` matched the requested target.
        Tables:
          type: array
          items:
            $ref: '#/components/schemas/BoundingBoxTable'
          description: >-
            Detected tables. Each entry carries a `table_info` block plus
            optional `cell_data`.
        Text:
          type: array
          items:
            $ref: '#/components/schemas/BoundingBoxItem'
          description: >-
            Body-text paragraphs and detected text regions. For spreadsheets,
            includes free-form text outside table regions (titles, captions,
            instructions).
        Title:
          type: array
          items:
            $ref: '#/components/schemas/BoundingBoxItem'
          description: >-
            Detected title regions. Spreadsheets emit one `Sheet: <name>` title
            per processed sheet.
        Footer:
          type: array
          items:
            $ref: '#/components/schemas/BoundingBoxItem'
          description: >-
            Detected footer regions (e.g. spreadsheet "Totals" rows, PDF page
            footers).
        markdown_with_ids:
          type: string
          description: >-
            Markdown variant with stable `data-bb-*-id` attributes in figure /
            table / text tags. Lets clients join rendered HTML back to
            bounding-box entries by id.
      additionalProperties: true
    PlanInfo:
      type: object
      description: >-
        Cumulative billing snapshot for the calling organization. Sourced from
        the `pulse-org-stats` aggregate table maintained asynchronously by the
        org-stats Lambda; the in-flight request's contribution is added on top
        so every response reflects post-request state. Returned by every
        endpoint that consumes credits (extract, schema, tables, split, form,
        and their batch / pipeline equivalents).
      properties:
        tier:
          type: string
          description: Billing tier, e.g. `"trial"`, `"growth"`, `"pulse_ultra_2"`.
        total_credits_used:
          type: number
          format: float
          description: >-
            Total credits consumed by the organization to date, including this
            request. The primary billing metric going forward.
        pages_used:
          type: integer
          minimum: 0
          description: >-
            Total pages processed by the organization to date, including this
            request. Kept for backward compatibility with clients that haven't
            migrated to `total_credits_used`.
        note:
          type: string
          description: >-
            Optional human-readable note about billing state for this response
            (e.g. trial credits remaining). Omitted when no note applies.
    BoundingBoxImage:
      type: object
      description: >-
        Detected or embedded visual (chart or image) returned under
        `bounding_boxes.Images`. For PDFs/images, populated when figure
        detection is enabled. For spreadsheets, populated for embedded charts
        and images. `image_url` is set when `figure_processing.show_images` is
        enabled.
      required:
        - id
        - visual_type
      properties:
        id:
          type: string
          description: >-
            Stable visual identifier (e.g. `excel_image_1_1`, `fig-3`). Use this
            to join with the `data-bb-image-id` attribute in the markdown
            output.
        content:
          type: string
          description: >-
            Short caption for the visual (e.g. `Chart: Revenue`). Populated by
            the spreadsheet parser; PDF figures may leave this empty when no
            caption is detected.
        visual_type:
          allOf:
            - $ref: '#/components/schemas/VisualType'
          description: >-
            Visual class. Drives whether `figure_processing.description` /
            `figure_processing.show_images` apply to this entry.
        page_number:
          type: integer
          minimum: 1
          description: 1-indexed page or sheet index.
        bounding_box:
          type: array
          items:
            type: number
          description: >-
            Document coordinate polygon when available. Spreadsheet visuals
            typically use an empty array and rely on `excel_range` for
            positioning.
        image_url:
          type: string
          description: >-
            Pulse-hosted URL for the visual image bytes. Present when
            `figure_processing.show_images` was enabled for this visual_type.
            Spreadsheet visuals proxy through `GET
            /results/{jobId}/images/{filename}`; inline figure flows may use a
            data URI.
        description:
          type: string
          description: >-
            Generated visual description. Present when
            `figure_processing.description` was enabled for this visual_type.
        classification:
          type: object
          description: >-
            Visual classification metadata when classification was run. Includes
            confidence, model name, and any non-fatal classification error.
          additionalProperties: true
          properties:
            confidence:
              type: number
              minimum: 0
              maximum: 1
            model:
              type: string
            error:
              type: string
        sheet_name:
          type: string
          description: Spreadsheet-only sheet name.
        sheet_index:
          type: integer
          description: Spreadsheet-only parsed sheet index after hidden-sheet filtering.
        workbook_sheet_index:
          type: integer
          description: Spreadsheet-only original workbook sheet index.
        excel_range:
          type: string
          description: Spreadsheet-only anchor or covered cell range for the visual.
        chart_type:
          type: string
          description: >-
            Spreadsheet chart class name (e.g. `BarChart`, `LineChart`) when
            `visual_type` is `chart`.
        chart_title:
          type: string
          description: Spreadsheet chart title when available.
        source_ranges:
          type: array
          items:
            type: string
          description: >-
            Spreadsheet chart source ranges when available, e.g.
            `["Charts!$B$1:$B$3"]`.
        render_error:
          type: string
          description: >-
            Optional non-fatal rendering error for spreadsheet visuals. When
            set, the visual entry is still returned but `image_url` may be
            omitted.
        description_error:
          type: string
          description: Optional non-fatal description-generation error.
      additionalProperties: true
    BoundingBoxTable:
      type: object
      description: >-
        Detected table region returned under `bounding_boxes.Tables`. Carries a
        structured `table_info` block plus optional `cell_data`. Kept loose with
        `additionalProperties: true` to round-trip future fields the server may
        add (e.g. layout metadata).
      properties:
        table_info:
          type: object
          additionalProperties: true
          properties:
            id:
              type: string
            dimensions:
              type: array
              items:
                type: integer
            excel_range:
              type: string
            sheet_name:
              type: string
            sheet_index:
              type: integer
            workbook_sheet_index:
              type: integer
            section_index:
              type: integer
            section_type:
              type: string
            section_name:
              type: string
            table_name:
              type: string
            layout_type:
              type: string
            is_chart:
              type: boolean
            chart_type:
              type: string
            chart_title:
              type: string
            source_ranges:
              type: array
              items:
                type: string
            location:
              type: object
              additionalProperties: true
        cell_data:
          type: array
          description: >-
            Table cell payload when available. Shape varies by extraction path;
            SDK consumers should treat as opaque.
          items:
            type: object
            additionalProperties: true
      additionalProperties: true
    BoundingBoxItem:
      type: object
      description: >-
        Common base shape used for `Text`, `Title`, and `Footer` bounding-box
        entries. Spreadsheet rows may carry additional sheet-scoped fields such
        as `excel_range` and `sheet_name`; forward-compatible extra properties
        are accepted.
      properties:
        id:
          type: string
          description: >-
            Stable bounding-box identifier (e.g. `txt-1`, `excel_title_0`). Used
            by extensions like `footnoteReferences` to cross-reference
            paragraphs.
        content:
          type: string
          description: Plain-text content of the region.
        page_number:
          type: integer
          minimum: 1
          description: >-
            1-indexed page number. For spreadsheets this matches the parsed
            sheet index after hidden-sheet filtering.
        bounding_box:
          type: array
          items:
            type: number
          description: >-
            Document coordinate polygon when available. Spreadsheet visuals may
            use an empty array and `excel_range` instead.
        excel_range:
          type: string
          description: Spreadsheet-only anchor or covered cell range.
        sheet_name:
          type: string
          description: Spreadsheet-only sheet name.
        sheet_index:
          type: integer
          description: Spreadsheet-only parsed sheet index after hidden-sheet filtering.
        workbook_sheet_index:
          type: integer
          description: Spreadsheet-only original workbook sheet index.
      additionalProperties: true
    VisualType:
      type: string
      enum:
        - chart
        - image
      description: >-
        Visual class of a bounding-box image. `chart` covers data visualizations
        (bar/line/pie etc., including spreadsheet chart objects). `image` covers
        non-chart embedded or detected visuals (logos, photos, screenshots).
  securitySchemes:
    ApiKey:
      type: apiKey
      in: header
      name: x-api-key
      x-fern-header:
        name: apiKey
        env: PULSE_API_KEY

````