Error Handling - Pulse API

Overview

Proper error handling is crucial for building reliable document processing applications. This guide covers all Pulse API error codes, retry strategies, and best practices for graceful error recovery.

Error Response Format

All errors follow a consistent JSON structure:

{
  "error": {
    "code": "FILE_001",
    "message": "Invalid file type",
    "details": {
      "supported_types": ["PDF", "JPG", "PNG", "DOCX", "PPTX", "XLSX", "HTML"],
      "received_type": "DOC"
    }
  }
}
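As a minimal sketch of reading this structure in client code, the snippet below pulls out the three fields and tolerates a missing or malformed `error` object. `PulseError` and `parse_error` are hypothetical names used for illustration; they are not part of the Pulse SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class PulseError:
    """Hypothetical container for the fields of an error response."""
    code: str
    message: str
    details: Dict[str, Any] = field(default_factory=dict)


def parse_error(body: Any) -> PulseError:
    """Extract code, message, and details, falling back to defaults."""
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        error = {}
    return PulseError(
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", "Unknown error"),
        details=error.get("details") or {},
    )


sample = {
    "error": {
        "code": "FILE_001",
        "message": "Invalid file type",
        "details": {"received_type": "DOC"},
    }
}
err = parse_error(sample)
print(err.code, err.details["received_type"])
```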

Error Categories

Category	Prefix	Covers
Authentication	`AUTH_XXX`	API key issues
Request	`REQ_XXX`	Invalid parameters
File	`FILE_XXX`	File format/size issues
Schema	`SCHEMA_XXX`	Schema validation
Billing	`BILLING_XXX`	Usage limit issues
Processing	`PROC_XXX`	Processing failures
Job	`JOB_XXX`	Async job issues
Storage	`STORAGE_XXX`	Storage issues
General	`GENERAL_XXX`	Server errors
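Because every code carries its category as a prefix, client code can route on the prefix rather than listing each code. The sketch below shows one way to do that; `CATEGORY_BY_PREFIX` and `category_of` are illustrative names, not SDK functions.

```python
# Prefixes taken from the category list above.
CATEGORY_BY_PREFIX = {
    "AUTH": "Authentication",
    "REQ": "Request",
    "FILE": "File",
    "SCHEMA": "Schema",
    "BILLING": "Billing",
    "PROC": "Processing",
    "JOB": "Job",
    "STORAGE": "Storage",
    "GENERAL": "General",
}


def category_of(code: str) -> str:
    """Return the category name for a code such as 'FILE_001'."""
    prefix, _, _ = code.partition("_")
    return CATEGORY_BY_PREFIX.get(prefix, "Unknown")


print(category_of("BILLING_012"))
```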

Complete Error Code Reference

Authentication Errors (AUTH_XXX)

Code	HTTP	Description	Solution
`AUTH_001`	401	API key is required	Include `x-api-key` header in request
`AUTH_002`	401	Invalid API key	Verify key in Console
`AUTH_003`	401	API key expired	Generate new key in Console
`AUTH_004`	403	Insufficient permissions	Check API key permissions
`AUTH_005`	401	Organization not found	Verify your organization exists
`AUTH_006`	403	Access restricted to whitelisted domains	Contact support for domain whitelisting

Request Errors (REQ_XXX)

Code	HTTP	Description	Solution
`REQ_001`	400	No file or URL provided	Include either `file` or `file_url` parameter
`REQ_002`	400	Invalid request body	Validate JSON syntax and structure
`REQ_003`	400	Invalid parameter value	Check parameter format and allowed values
`REQ_004`	400	Missing required parameter	Check API documentation for requirements
`REQ_005`	400	Invalid chunk size	Use chunk size between 100-10000
`REQ_006`	400	Invalid page range	Use format like `1-5` or `1,3,5`
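`REQ_005` and `REQ_006` can usually be avoided by checking parameters before sending the request. The following is a sketch under stated assumptions: the 100-10000 chunk size range is treated as inclusive, pages are treated as 1-based, and mixed forms such as `1-3,7` are accepted. None of these details are confirmed by the table above.

```python
import re

# Comma-separated page numbers or ranges, e.g. "1-5" or "1,3,5".
_PAGE_RANGE = re.compile(r"\d+(-\d+)?(,\d+(-\d+)?)*")


def is_valid_page_range(pages: str) -> bool:
    """Check the syntax and that every range is ascending and starts at 1 or later."""
    if not _PAGE_RANGE.fullmatch(pages):
        return False
    for part in pages.split(","):
        bounds = [int(n) for n in part.split("-")]
        if bounds[0] < 1 or bounds[-1] < bounds[0]:
            return False
    return True


def is_valid_chunk_size(size: int) -> bool:
    """Assumes the documented 100-10000 range is inclusive."""
    return 100 <= size <= 10000
```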

File Errors (FILE_XXX)

Code	HTTP	Description	Solution
`FILE_001`	400	Invalid file type	Use supported formats: PDF, JPG/JPEG, PNG, DOCX, PPTX, XLSX, HTML
`FILE_002`	413	File too large	Maximum file size is 100MB
`FILE_003`	400	File corrupted	Verify file integrity, re-save if needed
`FILE_004`	400	Empty file	Ensure file has content
`FILE_005`	400	Failed to download from URL	Check URL accessibility and permissions
`FILE_006`	408	Timeout downloading file	Use a faster hosting service or upload directly
`FILE_007`	400	Invalid file URL	Provide a valid, accessible URL
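`FILE_002` and `FILE_004` can be caught locally before any bytes are uploaded. This sketch assumes "100MB" means 100 MiB (the table does not say whether the limit is decimal or binary); `precheck_file_size` is a hypothetical helper.

```python
import os
import tempfile
from typing import Optional

MAX_FILE_BYTES = 100 * 1024 * 1024  # Assumption: "100MB" read as 100 MiB.


def precheck_file_size(path: str) -> Optional[str]:
    """Return the error code the API would likely report, or None if the size is fine."""
    size = os.path.getsize(path)
    if size == 0:
        return "FILE_004"
    if size > MAX_FILE_BYTES:
        return "FILE_002"
    return None


# Usage: an empty temporary file is flagged before any upload is attempted.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    empty_path = tmp.name
print(precheck_file_size(empty_path))
os.remove(empty_path)
```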

Schema Errors (SCHEMA_XXX)

Code	HTTP	Description	Solution
`SCHEMA_001`	400	Invalid schema format	Ensure schema conforms to JSON Schema spec
`SCHEMA_002`	400	Schema processing failed	Simplify schema or check document compatibility
`SCHEMA_003`	400	Schema too complex	Reduce nesting depth (max 5 levels)
`SCHEMA_004`	400	Unsupported schema type	Use supported data types only
`SCHEMA_005`	400	Schema validation timeout	Simplify schema or reduce document size
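To catch `SCHEMA_003` before submitting, nesting depth can be estimated locally. How Pulse counts levels is not specified here, so the definition below (one level per object or array that contains child schemas via `properties` or `items`) is an assumption, and `schema_depth` is an illustrative helper.

```python
from typing import Any

MAX_SCHEMA_DEPTH = 5  # Limit quoted for SCHEMA_003.


def schema_depth(schema: Any) -> int:
    """Count nested levels reachable through 'properties' and 'items'."""
    if not isinstance(schema, dict):
        return 0
    children = []
    props = schema.get("properties")
    if isinstance(props, dict):
        children.extend(props.values())
    items = schema.get("items")
    if isinstance(items, dict):
        children.append(items)
    if not children:
        return 0
    return 1 + max(schema_depth(child) for child in children)


invoice_schema = {
    "type": "object",
    "properties": {
        "lines": {
            "type": "array",
            "items": {"type": "object", "properties": {"total": {"type": "number"}}},
        }
    },
}
print(schema_depth(invoice_schema) <= MAX_SCHEMA_DEPTH)
```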

Billing Errors (BILLING_XXX)

Code	HTTP	Description	Solution
`BILLING_001`	403	Trial expired	Upgrade to a paid plan
`BILLING_002`	403	Page limit exceeded	Upgrade plan or wait for monthly reset
`BILLING_003`	402	Payment required	Add payment method in Console
`BILLING_004`	403	Plan limit reached	Upgrade to a higher tier
`BILLING_005`	402	Billing status unknown	Contact support
`BILLING_006`	403	Account suspended	Resolve billing issues in Console
`BILLING_007`	402	Payment failed	Update payment method
`BILLING_008`	402	Payment requires authentication	Complete 3D Secure verification
`BILLING_009`	403	Subscription canceled	Resubscribe to continue
`BILLING_010`	402	Subscription past due	Update payment method
`BILLING_011`	403	No active subscription	Subscribe to a plan
`BILLING_012`	402	Free tier limit exceeded	Upgrade required to continue
`BILLING_013`	402	Storage settings restricted on free tier	Upgrade to use storage features

Processing Errors (PROC_XXX)

Code	HTTP	Description	Solution
`PROC_001`	500	Document processing failed	Retry or contact support if persistent
`PROC_002`	408	Processing timeout	Use async endpoint or process fewer pages
`PROC_003`	500	Service temporarily unavailable	Retry after a few minutes
`PROC_004`	503	Rate limit exceeded	Implement exponential backoff
`PROC_005`	500	Partial extraction failure	Some elements couldn’t be processed

Job Errors (JOB_XXX)

Code	HTTP	Description	Solution
`JOB_001`	404	Job not found	Verify job ID; jobs expire after 48 hours
`JOB_002`	409	Job already cancelled	Job cannot be modified
`JOB_003`	410	Job expired	Resubmit the extraction request
`JOB_004`	409	Job still processing	Wait and poll again
`JOB_005`	500	Job failed	Check error details; retry if transient
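Since jobs expire after 48 hours, a client that stores its own submission timestamps can skip polling for jobs that are almost certainly gone. The sketch assumes the 48 hours are measured from submission, which the table does not state; `is_probably_expired` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

JOB_TTL = timedelta(hours=48)  # Expiry window quoted for JOB_001.


def is_probably_expired(submitted_at: datetime, now: Optional[datetime] = None) -> bool:
    """True if more than 48 hours have passed since submission.

    Both datetimes must be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    return now - submitted_at > JOB_TTL


submitted = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(is_probably_expired(submitted, now=submitted + timedelta(hours=49)))  # True
```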

Storage Errors (STORAGE_XXX)

Code	HTTP	Description	Solution
`STORAGE_001`	500	Failed to store results	Retry the extraction
`STORAGE_002`	404	Results not found	Results may have expired
`STORAGE_003`	500	Storage service unavailable	Retry after a few minutes
`STORAGE_004`	507	Storage limit exceeded	Delete old extractions or upgrade
`STORAGE_005`	404	Extraction expired	Reprocess the document

General Errors (GENERAL_XXX)

Code	HTTP	Description	Solution
`GENERAL_001`	500	Internal server error	Retry; contact support if persistent
`GENERAL_002`	503	Service unavailable	Check status page; retry later
`GENERAL_003`	504	Gateway timeout	Use async endpoint for large documents
`GENERAL_004`	429	Too many requests	Implement rate limiting with backoff
`GENERAL_005`	501	Feature not implemented	Feature not yet available

Handling Errors in Code

Basic Error Handling

from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

try:
    response = client.extract(
        file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf"
    )
    print(f"Success! Job ID: {response.job_id}")
    
except ApiError as e:
    error_code = e.body.get("error", {}).get("code", "UNKNOWN")
    error_message = e.body.get("error", {}).get("message", "Unknown error")
    
    if error_code == "AUTH_002":
        print("Invalid API key. Check your credentials.")
    elif error_code == "FILE_001":
        print("Unsupported file type.")
    elif error_code == "FILE_002":
        print("File too large. Maximum is 100MB.")
    elif error_code.startswith("BILLING_"):
        print(f"Billing issue: {error_message}")
    elif error_code.startswith("PROC_"):
        print(f"Processing error (retry may help): {error_message}")
    else:
        print(f"API Error {error_code}: {error_message}")
        
except Exception as e:
    print(f"Unexpected error: {e}")

Comprehensive Error Handler

class PulseAPIError(Exception):
    """Custom exception for Pulse API errors."""
    
    def __init__(self, code, message, details=None):
        self.code = code
        self.message = message
        self.details = details or {}
        super().__init__(f"{code}: {message}")

class ErrorHandler:
    """Centralized error handling for Pulse API."""
    
    # Retryable error codes
    RETRYABLE_CODES = {
        "FILE_005",     # Download failed
        "FILE_006",     # Download timeout
        "PROC_001",     # Processing failed
        "PROC_002",     # Processing timeout
        "PROC_003",     # Service unavailable
        "PROC_004",     # Rate limit
        "JOB_003",      # Job expired (retry by resubmitting the extraction)
        "STORAGE_003",  # Storage unavailable
        "GENERAL_001",  # Server error
        "GENERAL_002",  # Service unavailable
        "GENERAL_003",  # Gateway timeout
        "GENERAL_004",  # Rate limit
    }
    
    @staticmethod
    def handle_response(response):
        """Process API response and raise appropriate errors."""
        
        # Treat any 2xx status as success, not only 200
        if 200 <= response.status_code < 300:
            return response.json()
        
        # Parse error response
        try:
            error_data = response.json().get("error", {})
            code = error_data.get("code", str(response.status_code))
            message = error_data.get("message", "Unknown error")
            details = error_data.get("details", {})
        except (ValueError, AttributeError):
            # Body is not valid JSON, or is JSON but not an object
            code = str(response.status_code)
            message = response.text or "Unknown error"
            details = {}
        
        # Determine if retryable
        is_retryable = code in ErrorHandler.RETRYABLE_CODES
        
        # Create appropriate exception
        error = PulseAPIError(code, message, details)
        error.is_retryable = is_retryable
        
        raise error
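When the body cannot be parsed, the handler above falls back to the HTTP status as the code, and a bare status string never matches `RETRYABLE_CODES`. One possible fallback, sketched below, treats the statuses that the reference tables associate with transient failures as retryable. `RETRYABLE_STATUSES` and `is_retryable` are illustrative names, and the choice of statuses is an assumption rather than documented Pulse behavior.

```python
# Assumption: these statuses indicate transient failures when no Pulse
# error code is available in the response body.
RETRYABLE_STATUSES = {408, 429, 500, 503, 504}


def is_retryable(code, status, retryable_codes):
    """Decide retryability from the Pulse code, else from the HTTP status."""
    if code in retryable_codes:
        return True
    # Fall back to the status only when the code is a bare status string.
    return code.isdigit() and status in RETRYABLE_STATUSES


print(is_retryable("503", 503, {"PROC_001"}))
```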

Retry Strategies

Exponential Backoff

import time
import random
from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

RETRYABLE_CODES = {"PROC_001", "PROC_002", "PROC_003", "PROC_004", 
                   "GENERAL_001", "GENERAL_002", "GENERAL_004"}

def extract_with_retry(file_url, max_retries=3, base_delay=1):
    """Extract with exponential backoff retry."""
    
    for attempt in range(max_retries):
        try:
            return client.extract(file_url=file_url)
        except ApiError as e:
            error_code = e.body.get("error", {}).get("code", "")
            
            if error_code not in RETRYABLE_CODES or attempt == max_retries - 1:
                raise
            
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Retry {attempt + 1}/{max_retries} after {delay:.1f}s")
            time.sleep(delay)
    
    raise Exception("Max retries exceeded")

# Usage
result = extract_with_retry("https://www.impact-bank.com/user/file/dummy_statement.pdf")
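The delay formula above grows without bound as `max_retries` increases. A common variation caps the exponential term; the sketch below isolates the delay calculation so the schedule can be inspected. The `max_delay` parameter and both function names are additions for illustration and do not appear in the example above.

```python
import random


def capped_exponential(attempt, base_delay=1.0, max_delay=30.0):
    """Delay before retry number `attempt` (zero-based), without jitter."""
    return min(base_delay * (2 ** attempt), max_delay)


def backoff_delay(attempt, base_delay=1.0, max_delay=30.0):
    """Capped exponential delay plus up to one second of random jitter."""
    return capped_exponential(attempt, base_delay, max_delay) + random.uniform(0, 1)


schedule = [capped_exponential(n) for n in range(6)]
print(schedule)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```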

Circuit Breaker Pattern

from datetime import datetime, timedelta

class CircuitBreaker:
    """Prevent cascading failures with circuit breaker."""
    
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "closed"  # closed, open, half-open
    
    def call(self, func):
        """Execute function with circuit breaker protection."""

        if self.state == "open":
            if datetime.now() - self.last_failure_time > timedelta(seconds=self.recovery_timeout):
                # Recovery window elapsed: allow one trial call
                self.state = "half-open"
            else:
                raise Exception("Circuit breaker is open")

        try:
            result = func()
        except Exception:
            self.failure_count += 1
            self.last_failure_time = datetime.now()

            # A failed trial call reopens the circuit immediately
            if self.state == "half-open" or self.failure_count >= self.failure_threshold:
                self.state = "open"
                print(f"Circuit breaker opened after {self.failure_count} failures")

            raise

        # Any success closes the circuit and clears the failure count
        self.state = "closed"
        self.failure_count = 0
        return result

# Usage
breaker = CircuitBreaker()
try:
    result = breaker.call(lambda: client.extract(file_path="document.pdf"))
except Exception as e:
    print(f"Failed: {e}")

Intelligent Retry Logic

import time

class SmartRetry:
    """Intelligent retry with different strategies per error type."""
    
    def __init__(self):
        self.strategies = {
            "FILE_005": self.retry_with_backoff,       # Download failed
            "FILE_006": self.retry_with_backoff,       # Download timeout
            "PROC_002": self.retry_with_smaller_chunk, # Processing timeout
            "PROC_004": self.handle_rate_limit,        # Rate limit exceeded
            "JOB_003": self.retry_with_smaller_chunk,  # Job expired
            "GENERAL_004": self.handle_rate_limit,     # Too many requests
            "GENERAL_002": self.retry_with_backoff,    # Service unavailable
        }
    
    def execute(self, func, context=None):
        """Execute with smart retry logic."""
        
        max_attempts = 3
        
        for attempt in range(max_attempts):
            try:
                return func()
            except PulseAPIError as e:
                if attempt == max_attempts - 1:
                    raise
                
                strategy = self.strategies.get(e.code, self.retry_with_backoff)
                strategy(e, attempt, context)
    
    def retry_with_backoff(self, error, attempt, context):
        """Standard exponential backoff."""
        delay = 2 ** attempt
        print(f"Retrying after {delay}s due to {error.code}")
        time.sleep(delay)
    
    def handle_rate_limit(self, error, attempt, context):
        """Handle rate limiting with longer delay."""
        print("Rate limited. Waiting 60 seconds...")
        time.sleep(60)
    
    def retry_with_smaller_chunk(self, error, attempt, context):
        """Retry with smaller page range for timeouts."""
        if context and 'pages' in context:
            # Reduce page range
            current_pages = context['pages']
            # Logic to split page range goes here
            print(f"Retrying with a smaller page range than {current_pages}")
        time.sleep(5)
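The page-splitting step is left open in `retry_with_smaller_chunk`. One way to fill it, sketched here, halves a `start-end` range so that each half can be retried separately. `split_page_range` is a hypothetical helper; it assumes the input is a single page or a single `start-end` range.

```python
from typing import List


def _format_range(start: int, end: int) -> str:
    """Render a single page as 'N' and a span as 'N-M'."""
    return str(start) if start == end else f"{start}-{end}"


def split_page_range(pages: str) -> List[str]:
    """Split 'start-end' into two halves; return [pages] if it cannot be split."""
    start_text, sep, end_text = pages.partition("-")
    if not sep:
        return [pages]  # A single page cannot be split further.
    start, end = int(start_text), int(end_text)
    if end <= start:
        return [pages]
    middle = (start + end) // 2
    return [_format_range(start, middle), _format_range(middle + 1, end)]


print(split_page_range("1-10"))  # ['1-5', '6-10']
```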

Error Recovery Patterns

Graceful Degradation

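def extract_with_fallback(file_path, preferred_mode="full"):
    """Extract with graceful degradation.

    `complex_schema` and `simple_schema` must be defined by the caller.
    `preferred_mode` is currently unused.
    """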
    
    strategies = [
        # Try full extraction with schema
        lambda: client.extract(
            file_path=file_path,
            schema=complex_schema,
            extract_figure=True
        ),
        # Fallback to simple extraction
        lambda: client.extract(
            file_path=file_path,
            schema=simple_schema
        ),
        # Last resort: text only
        lambda: client.extract(
            file_path=file_path
        )
    ]
    
    for i, strategy in enumerate(strategies):
        try:
            print(f"Attempting strategy {i + 1}/{len(strategies)}")
            return strategy()
        except PulseAPIError as e:
            if i == len(strategies) - 1:
                raise
            print(f"Strategy {i + 1} failed: {e.code}, trying next...")

Partial Success Handling

def process_large_document_with_recovery(file_path, total_pages=100):
    """Process document in chunks with partial success."""
    
    chunk_size = 10
    results = []
    failed_chunks = []
    
    for start in range(0, total_pages, chunk_size):
        end = min(start + chunk_size - 1, total_pages - 1)
        page_range = f"{start + 1}-{end + 1}"
        
        try:
            result = client.extract(
                file_path=file_path,
                pages=page_range
            )
            results.append({
                "pages": page_range,
                "content": result
            })
        except PulseAPIError as e:
            print(f"Failed to process pages {page_range}: {e}")
            failed_chunks.append(page_range)
    
    # Retry failed chunks with different strategy
    for chunk in failed_chunks:
        try:
            # Try with smaller chunks or different parameters
            result = client.extract(
                file_path=file_path,
                pages=chunk
            )
            results.append({
                "pages": chunk,
                "content": result,
                "recovered": True
            })
        except PulseAPIError:
            print(f"Permanently failed: {chunk}")
    
    return results

Best Practices

Always Handle Errors

Never assume API calls will succeed
Catch and handle specific error codes
Provide meaningful error messages to users
Log errors for debugging

Implement Retry Logic

Use exponential backoff for transient errors
Set reasonable retry limits
Only retry retryable errors
Add jitter to prevent thundering herd

Monitor Error Patterns

Track error frequencies
Alert on error spikes
Analyze patterns for optimization
Review logs regularly
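A minimal sketch of tracking error frequencies in process, using only the standard library. `ErrorStats` is an illustrative class; a production setup would more likely forward these counts to an existing metrics system.

```python
from collections import Counter


class ErrorStats:
    """Count calls and failures per error code."""

    def __init__(self):
        self.by_code = Counter()
        self.total_calls = 0

    def record(self, code=None):
        """Record one call; pass the error code if the call failed."""
        self.total_calls += 1
        if code is not None:
            self.by_code[code] += 1

    def error_rate(self):
        """Fraction of recorded calls that failed."""
        if self.total_calls == 0:
            return 0.0
        return sum(self.by_code.values()) / self.total_calls

    def top(self, n=3):
        """Most frequent error codes as (code, count) pairs."""
        return self.by_code.most_common(n)


stats = ErrorStats()
for outcome in [None, "PROC_004", "PROC_004", "FILE_001"]:
    stats.record(outcome)
print(stats.error_rate(), stats.top(1))
```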

Graceful Degradation

Have fallback strategies
Accept partial success
Inform users of degraded functionality
Queue for later retry if appropriate

Common Error Scenarios

Scenario 1: File Upload Issues

import os

def upload_with_validation(file_path):
    """Upload file with pre-validation."""
    
    # Check file extension
    valid_extensions = ['.pdf', '.jpg', '.jpeg', '.png', '.docx', '.pptx', '.xlsx', '.html']
    file_ext = os.path.splitext(file_path)[1].lower()
    
    if file_ext not in valid_extensions:
        raise ValueError(f"Unsupported file type: {file_ext}")
    
    # Attempt upload with retry. `exponential_backoff_retry` is not defined in
    # this guide; it stands for any helper that retries a callable with backoff.
    return exponential_backoff_retry(
        lambda: client.upload_file(file_path)
    )

Scenario 2: Async Job Management

import time

import requests

def manage_async_job(job_id):
    """Robustly manage async job lifecycle."""
    
    max_poll_time = 600  # 10 minutes
    poll_interval = 5
    start_time = time.time()
    
    while time.time() - start_time < max_poll_time:
        try:
            status = client.get_job_status(job_id)
            
            if status['status'] == 'completed':
                return status['result']
            elif status['status'] == 'failed':
                # JOB_005 is "Job failed"; JOB_004 means "Job still processing"
                raise PulseAPIError(
                    "JOB_005",
                    f"Job failed: {status.get('error', 'Unknown error')}"
                )
            elif status['status'] == 'cancelled':
                raise PulseAPIError("JOB_002", "Job was cancelled")
            
            time.sleep(poll_interval)
            
        except PulseAPIError as e:
            if e.code == "JOB_001":
                # Job not found - might be eventual consistency issue
                time.sleep(10)
                continue
            raise
        except requests.exceptions.RequestException:
            # Network error - retry
            time.sleep(poll_interval)
            continue
    
    # Timeout - attempt to cancel
    try:
        client.cancel_job(job_id)
    except Exception:
        # Cancellation is best effort; the timeout below is raised regardless
        pass
    
    raise TimeoutError(f"Job {job_id} timed out after {max_poll_time}s")

Next Steps

API Reference

See endpoint details
