GET /job/{jobId}
Python SDK
from pulse import Pulse

client = Pulse(api_key="YOUR_API_KEY")

status = client.jobs.get_job(job_id="your-job-id")
print(status.status)   # "pending" | "processing" | "completed" | "failed" | "canceled" | "expired"
if status.status == "completed":
    print(status.result)
{
  "job_id": "<string>",
  "status": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "updated_at": "2023-11-07T05:31:56Z",
  "result": {
    "is_url": true,
    "url": "<string>"
  },
  "error": "<string>"
}

Overview

Check the status and retrieve results of an asynchronous job (e.g., submitted via /extract with async: true). Poll this endpoint periodically until the job reaches a terminal state (completed, failed, canceled, or expired).

Response

The response includes job metadata and, once the job is completed, the extraction results (inline, or as a URL pointer for large outputs; see Large Results).
{
  "job_id": "abc123-def456-ghi789",
  "status": "completed",
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:31:45Z",
  "result": {
    "markdown": "# Document Title\n\nExtracted content...",
    "page_count": 15,
    "bounding_boxes": { ... },
    "plan-info": { ... }
  }
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| job_id | string | Unique identifier for the extraction job. |
| status | string | Current job status: pending, processing, completed, failed, canceled, or expired. |
| created_at | string | ISO 8601 timestamp when the job was submitted. |
| updated_at | string | ISO 8601 timestamp of the last status update. |
| result | object | Job output. Present when status is completed, and on expired as a retention stub. Large outputs return a { is_url, url } pointer instead of inline data (see Large Results). See Extract for result structure. |
| error | string | Error message (only present when status is failed). |
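
As a small illustration of working with the timestamp fields, the sketch below parses the created_at and updated_at values from the example response and computes how long the job took. The helper name `parse_timestamp` is hypothetical and not part of the Pulse SDK, and the sketch assumes timestamps end in `Z` or carry an explicit UTC offset.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2025-01-15T10:30:00Z"."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Timestamps copied from the example response
created = parse_timestamp("2025-01-15T10:30:00Z")
updated = parse_timestamp("2025-01-15T10:31:45Z")

elapsed_seconds = (updated - created).total_seconds()
print(f"Job took {elapsed_seconds:.0f} seconds")  # Job took 105 seconds
```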

Job Status Values

| Status | Description |
| --- | --- |
| pending | Job is queued and waiting to be processed. |
| processing | Job is currently being processed. |
| completed | Job finished successfully. Results are available in the result field. |
| failed | Job encountered an error. See error field for details. |
| canceled | Job was canceled before completion. |
| expired | Job’s retention window has passed and its stored output was purged. Run a new extraction to regenerate it. |
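
One way to use the status values listed here in client code is to separate terminal states from in-progress ones, so a polling loop knows when to stop. This is a minimal sketch; the constant and function names are hypothetical and not part of the Pulse SDK.

```python
# Status values taken from the list above
TERMINAL_STATUSES = frozenset({"completed", "failed", "canceled", "expired"})
IN_PROGRESS_STATUSES = frozenset({"pending", "processing"})


def is_terminal(status: str) -> bool:
    """Return True if the job will not change state again."""
    if status in TERMINAL_STATUSES:
        return True
    if status in IN_PROGRESS_STATUSES:
        return False
    # Fail loudly on an unrecognized value rather than polling forever
    raise ValueError(f"Unknown job status: {status!r}")


print(is_terminal("processing"))  # False
print(is_terminal("expired"))     # True
```

Raising on unknown values is a design choice: it surfaces any status this list does not cover instead of silently treating it as "keep polling".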

Large Results (is_url)

When the output is 5 MB or larger, result contains a pointer instead of inline data. Fetch url to download the full result JSON.
{
  "result": {
    "is_url": true,
    "url": "https://api.runpulse.com/large_results/abc123-def456-ghi789"
  }
}
| Field | Type | Description |
| --- | --- | --- |
| is_url | boolean | true when result is a pointer rather than inline data. |
| url | string | Location of the full result JSON (presigned S3 URL or Pulse proxy URL). |
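
The pointer handling described above can be sketched as a small helper that returns inline results unchanged and downloads the JSON when is_url is true. The names `resolve_result` and `_http_get_json` are hypothetical, not part of the Pulse SDK. The sketch also assumes the URL can be fetched with a plain GET; a Pulse proxy URL may need additional headers, which this page does not specify.

```python
import json
from typing import Any, Callable
from urllib.request import urlopen


def _http_get_json(url: str) -> Any:
    """Download and decode a JSON document (assumes no auth headers are needed)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def resolve_result(result: dict, fetch: Callable[[str], Any] = _http_get_json) -> Any:
    """Return the full result, following the { is_url, url } pointer if present."""
    if result.get("is_url") is True:
        return fetch(result["url"])
    return result  # Already inline data


# Inline results pass through untouched
inline = {"markdown": "# Document Title", "page_count": 15}
print(resolve_result(inline)["page_count"])  # 15

# Pointer results are fetched; a stub stands in for the network call here
pointer = {"is_url": True, "url": "https://api.runpulse.com/large_results/abc123-def456-ghi789"}
full = resolve_result(pointer, fetch=lambda url: {"markdown": "...", "source": url})
print(full["source"] == pointer["url"])  # True
```

Passing the fetch function as a parameter keeps the helper easy to test and lets callers substitute their own HTTP client.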

Polling Strategy

We recommend polling with exponential backoff:
from pulse import Pulse
import time

client = Pulse(api_key="YOUR_API_KEY")

def poll_job(job_id: str, max_attempts: int = 60):
    """
    Poll for job completion with exponential backoff.
    
    Args:
        job_id: The job ID returned from /extract with async: true
        max_attempts: Maximum number of polling attempts
        
    Returns:
        The extraction result when job completes
    """
    delay = 1  # Start with 1 second
    
    for attempt in range(max_attempts):
        # Get job status using the SDK
        response = client.jobs.get_job(job_id=job_id)
        
        if response.status == "completed":
            return response.result
        elif response.status == "failed":
            raise Exception(f"Job failed: {response.error}")
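        elif response.status == "canceled":
            raise Exception("Job was canceled")
        elif response.status == "expired":
            # Terminal state: the stored output was purged, so stop polling
            raise Exception("Job expired; run a new extraction")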
        
        # Still pending or processing - wait and retry
        print(f"Status: {response.status}, waiting {delay}s...")
        time.sleep(delay)
        delay = min(delay * 1.5, 10)  # Cap at 10 seconds
    
    raise Exception("Polling timeout")

# Example usage
job_id = "abc123-def456-ghi789"
result = poll_job(job_id)
print(f"Extraction complete! Markdown: {result['markdown'][:100]}...")
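
To see what the backoff parameters in poll_job imply, the sketch below reproduces its delay schedule (start at 1 second, multiply by 1.5, cap at 10 seconds) and totals the worst-case wait. The function `backoff_delays` is a hypothetical helper written for this illustration only.

```python
def backoff_delays(max_attempts: int = 60, initial: float = 1.0,
                   factor: float = 1.5, cap: float = 10.0) -> list[float]:
    """Return the sleep duration used after each polling attempt."""
    delays = []
    delay = initial
    for _ in range(max_attempts):
        delays.append(delay)
        delay = min(delay * factor, cap)  # Same update rule as poll_job
    return delays


delays = backoff_delays()
print(delays[:7])   # [1.0, 1.5, 2.25, 3.375, 5.0625, 7.59375, 10.0]
print(sum(delays))  # 560.78125
```

With the defaults, the cap is reached on the seventh attempt, and a job that never finishes raises "Polling timeout" after roughly 561 seconds of sleeping (a little over 9 minutes), plus request time. Raise max_attempts for extractions expected to run longer.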

Example Usage

Check Job Status

from pulse import Pulse

client = Pulse(api_key="YOUR_API_KEY")

# Check job status
job_id = "abc123-def456-ghi789"
response = client.jobs.get_job(job_id=job_id)

print(f"Job ID: {response.job_id}")
print(f"Status: {response.status}")
print(f"Created: {response.created_at}")

if response.status == "completed":
    print(f"Markdown: {response.result['markdown'][:100]}...")
elif response.status == "failed":
    print(f"Error: {response.error}")

Complete Async Workflow

from pulse import Pulse
import time
import json

client = Pulse(api_key="YOUR_API_KEY")

# Step 1: Submit async extraction
print("Submitting async extraction...")
submit_response = client.extract(
    file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf",
    async_=True
)

job_id = submit_response.job_id
print(f"Job submitted: {job_id}")

# Step 2: Poll for completion
delay = 1
extraction_id = None  # Set only if the job completes successfully
while True:
    status_response = client.jobs.get_job(job_id=job_id)

    if status_response.status == "completed":
        print("Extraction complete!")
        extraction_id = status_response.result["extraction_id"]
        print(f"Extraction ID: {extraction_id}")
        break
    elif status_response.status in ["failed", "canceled", "expired"]:
        print(f"Job ended with status: {status_response.status}")
        break

    print(f"Status: {status_response.status}")
    time.sleep(delay)
    delay = min(delay * 1.5, 10)

# Step 3 (optional): Apply schema via /schema endpoint
# Skipped when the job did not complete, since there is no extraction to reference
if extraction_id is not None:
    schema_result = client.schema(
        extraction_id=extraction_id,
        schema_config={
            "input_schema": {
                "type": "object",
                "properties": {
                    "account_holder": {"type": "string"},
                    "balance": {"type": "number"}
                }
            }
        }
    )
    print(f"Schema output: {schema_result.schema_output}")
For webhook-based notifications instead of polling, see the Webhooks documentation.

Authorizations

x-api-key (string, header, required)

API key for authentication

Path Parameters

jobId (string, required)

Identifier returned from an async job submission.
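
For callers not using the SDK, the header and path parameter above map onto a plain HTTP request. The sketch below only builds the request object, using the Python standard library. The base URL `https://api.runpulse.com` is an assumption inferred from the large-result URL shown earlier, and `build_job_request` is a hypothetical helper name.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: base URL inferred from the large_results example, not confirmed for this endpoint
BASE_URL = "https://api.runpulse.com"


def build_job_request(job_id: str, api_key: str, base_url: str = BASE_URL) -> Request:
    """Build a GET /job/{jobId} request authenticated with the x-api-key header."""
    # Percent-encode the job ID so it is safe as a single path segment
    url = f"{base_url}/job/{quote(job_id, safe='')}"
    return Request(url, headers={"x-api-key": api_key}, method="GET")


req = build_job_request("abc123-def456-ghi789", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# GET https://api.runpulse.com/job/abc123-def456-ghi789
```

Sending it is then a matter of `urllib.request.urlopen(req)` and decoding the JSON body. Note that urllib stores the header name as `X-api-key`; HTTP header names are case-insensitive, so this does not change the request's meaning.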

Response

Current job status payload

Current status and metadata for an asynchronous job.

job_id (string, required)

Identifier assigned to the asynchronous job.

status (enum<string>, required)

Lifecycle status for an asynchronous job.

Available options: pending, processing, completed, failed, canceled, expired

created_at (string<date-time>, required)

Timestamp when the job was accepted.

updated_at (string<date-time>)

Timestamp of the last status update, if available.

result (object)

Job output. Present when completed, and on expired as a retention stub. Large outputs return a { is_url, url } pointer (see LargeResultStub) instead of inline data.

error (string)

Error message describing why the job failed, if applicable.
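
When handling the raw JSON rather than the SDK's response object, the schema above can be mirrored in a small typed model that enforces the required fields and the status enum. This is an illustrative sketch only: the `JobStatus` class below is hypothetical and is not the type the Pulse SDK returns.

```python
from dataclasses import dataclass
from typing import Optional

VALID_STATUSES = {"pending", "processing", "completed", "failed", "canceled", "expired"}
REQUIRED_FIELDS = ("job_id", "status", "created_at")


@dataclass
class JobStatus:
    job_id: str
    status: str
    created_at: str
    updated_at: Optional[str] = None  # Optional per the schema
    result: Optional[dict] = None     # Inline data or a { is_url, url } pointer
    error: Optional[str] = None       # Set when the job failed

    @classmethod
    def from_dict(cls, payload: dict) -> "JobStatus":
        """Validate a decoded response body and build a JobStatus."""
        missing = [name for name in REQUIRED_FIELDS if name not in payload]
        if missing:
            raise ValueError(f"Missing required fields: {missing}")
        if payload["status"] not in VALID_STATUSES:
            raise ValueError(f"Unexpected status: {payload['status']!r}")
        return cls(
            job_id=payload["job_id"],
            status=payload["status"],
            created_at=payload["created_at"],
            updated_at=payload.get("updated_at"),
            result=payload.get("result"),
            error=payload.get("error"),
        )


job = JobStatus.from_dict({
    "job_id": "abc123-def456-ghi789",
    "status": "completed",
    "created_at": "2025-01-15T10:30:00Z",
    "result": {"markdown": "# Document Title", "page_count": 15},
})
print(job.status, job.result["page_count"])  # completed 15
```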