
Overview

Proper error handling is crucial for building reliable document processing applications. This guide covers all Pulse API error codes, retry strategies, and best practices for graceful error recovery.

Error Response Format

All errors follow a consistent JSON structure:
{
  "error": {
    "code": "FILE_001",
    "message": "Invalid file type",
    "details": {
      "supported_types": ["PDF", "JPG", "PNG", "DOCX", "PPTX", "XLSX", "HTML"],
      "received_type": "DOC"
    }
  }
}
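
As a minimal sketch, the envelope above could be parsed into a small Python object as follows. The names PulseError and parse_error are illustrative and not part of the Pulse SDK; the fallback values for missing fields are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class PulseError:
    """Plain container for the fields of the error envelope (illustrative name)."""
    code: str
    message: str
    details: Dict[str, Any] = field(default_factory=dict)


def parse_error(body: str) -> PulseError:
    """Parse an error response body, tolerating missing fields and non-JSON text."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (for example, an HTML page from a proxy): keep the raw text.
        return PulseError(code="UNKNOWN", message=body or "Unknown error")

    error = payload.get("error", {}) if isinstance(payload, dict) else {}
    if not isinstance(error, dict):
        error = {}
    return PulseError(
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", "Unknown error"),
        details=error.get("details") or {},
    )


sample = (
    '{"error": {"code": "FILE_001", "message": "Invalid file type", '
    '"details": {"received_type": "DOC"}}}'
)
err = parse_error(sample)
print(err.code, err.message, err.details.get("received_type"))
```

Falling back to "UNKNOWN" rather than raising keeps the caller's error path simple when a gateway returns a body that is not in the documented format.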

Error Categories

  • Authentication (AUTH_XXX): API key issues
  • Request (REQ_XXX): Invalid parameters
  • File (FILE_XXX): File format/size issues
  • Schema (SCHEMA_XXX): Schema validation
  • Billing (BILLING_XXX): Usage limit issues
  • Processing (PROC_XXX): Processing failures
  • Job (JOB_XXX): Async job issues
  • Storage (STORAGE_XXX): Storage issues
  • General (GENERAL_XXX): Server errors
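
Because every code starts with its category prefix, errors can be routed by category with a simple lookup. The sketch below assumes only the prefixes listed above; CATEGORY_BY_PREFIX and error_category are illustrative names.

```python
# Prefix-to-category mapping taken from the list of error categories.
CATEGORY_BY_PREFIX = {
    "AUTH": "Authentication",
    "REQ": "Request",
    "FILE": "File",
    "SCHEMA": "Schema",
    "BILLING": "Billing",
    "PROC": "Processing",
    "JOB": "Job",
    "STORAGE": "Storage",
    "GENERAL": "General",
}


def error_category(code: str) -> str:
    """Return the category name for a code such as 'FILE_001', or 'Unknown'."""
    prefix, _, _ = code.partition("_")
    return CATEGORY_BY_PREFIX.get(prefix, "Unknown")


print(error_category("BILLING_003"))
```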

Complete Error Code Reference

Authentication Errors (AUTH_XXX)

Code | HTTP | Description | Solution
AUTH_001 | 401 | API key is required | Include x-api-key header in request
AUTH_002 | 401 | Invalid API key | Verify key in Console
AUTH_003 | 401 | API key expired | Generate new key in Console
AUTH_004 | 403 | Insufficient permissions | Check API key permissions
AUTH_005 | 401 | Organization not found | Verify your organization exists
AUTH_006 | 403 | Access restricted to whitelisted domains | Contact support for domain whitelisting

Request Errors (REQ_XXX)

Code | HTTP | Description | Solution
REQ_001 | 400 | No file or URL provided | Include either file or file_url parameter
REQ_002 | 400 | Invalid request body | Validate JSON syntax and structure
REQ_003 | 400 | Invalid parameter value | Check parameter format and allowed values
REQ_004 | 400 | Missing required parameter | Check API documentation for requirements
REQ_005 | 400 | Invalid chunk size | Use chunk size between 100-10000
REQ_006 | 400 | Invalid page range | Use format like 1-5 or 1,3,5
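
REQ_005 and REQ_006 can often be avoided by validating parameters before sending the request. The sketch below accepts only the two page-range forms shown in the table and assumes that pages are 1-based and that the chunk size bounds are inclusive; neither detail is stated above, so treat both as assumptions. The function names are illustrative.

```python
import re

_RANGE = re.compile(r"(\d+)-(\d+)")   # for example "1-5"
_LIST = re.compile(r"\d+(,\d+)*")     # for example "1,3,5" or "7"


def is_valid_page_range(pages: str) -> bool:
    """Check a page range against the documented forms '1-5' and '1,3,5'."""
    match = _RANGE.fullmatch(pages)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        return 1 <= start <= end  # assumption: 1-based, ascending
    if _LIST.fullmatch(pages):
        return all(int(page) >= 1 for page in pages.split(","))
    return False


def is_valid_chunk_size(size: int) -> bool:
    """Check a chunk size against the 100-10000 range (assumed inclusive)."""
    return 100 <= size <= 10000


print(is_valid_page_range("1-5"), is_valid_page_range("5-1"))
```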

File Errors (FILE_XXX)

Code | HTTP | Description | Solution
FILE_001 | 400 | Invalid file type | Use supported formats: PDF, JPG/JPEG, PNG, DOCX, PPTX, XLSX, HTML
FILE_002 | 413 | File too large | Maximum file size is 100MB
FILE_003 | 400 | File corrupted | Verify file integrity, re-save if needed
FILE_004 | 400 | Empty file | Ensure file has content
FILE_005 | 400 | Failed to download from URL | Check URL accessibility and permissions
FILE_006 | 408 | Timeout downloading file | Use a faster hosting service or upload directly
FILE_007 | 400 | Invalid file URL | Provide a valid, accessible URL
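
The conditions behind FILE_001, FILE_002, and FILE_004 can be checked locally before a file is sent. This is a rough sketch: precheck_file is an illustrative name, and interpreting "100MB" as 100 * 1024 * 1024 bytes is an assumption, since the exact byte limit is not stated.

```python
import os
from typing import Optional

SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".docx", ".pptx", ".xlsx", ".html"}
MAX_FILE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" means 100 MiB


def precheck_file(name: str, size_bytes: int) -> Optional[str]:
    """Return the error code an upload would likely trigger, or None if it looks fine."""
    extension = os.path.splitext(name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return "FILE_001"  # invalid file type
    if size_bytes == 0:
        return "FILE_004"  # empty file
    if size_bytes > MAX_FILE_BYTES:
        return "FILE_002"  # file too large
    return None


print(precheck_file("report.doc", 2048))
```

For a file on disk, the size can be obtained with os.path.getsize(path). A passing precheck does not guarantee acceptance; corruption (FILE_003), for instance, is only detected server-side.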

Schema Errors (SCHEMA_XXX)

Code | HTTP | Description | Solution
SCHEMA_001 | 400 | Invalid schema format | Ensure schema conforms to JSON Schema spec
SCHEMA_002 | 400 | Schema processing failed | Simplify schema or check document compatibility
SCHEMA_003 | 400 | Schema too complex | Reduce nesting depth (max 5 levels)
SCHEMA_004 | 400 | Unsupported schema type | Use supported data types only
SCHEMA_005 | 400 | Schema validation timeout | Simplify schema or reduce document size
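
To catch SCHEMA_003 early, the nesting depth of a schema can be estimated before it is submitted. How the API counts levels is not specified, so the sketch below makes an assumption: each object (via "properties") or array (via "items") adds one level, and scalar fields add none. schema_depth is an illustrative helper.

```python
MAX_SCHEMA_DEPTH = 5  # from the SCHEMA_003 guidance


def schema_depth(schema: dict) -> int:
    """Estimate nesting depth: each 'properties' or 'items' layer counts as one level."""
    children = []
    properties = schema.get("properties")
    if isinstance(properties, dict):
        children.extend(properties.values())
    items = schema.get("items")
    if isinstance(items, dict):
        children.append(items)

    children = [child for child in children if isinstance(child, dict)]
    if not children:
        return 0  # scalar schema: no further nesting
    return 1 + max(schema_depth(child) for child in children)


invoice_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"amount": {"type": "number"}},
            },
        },
    },
}

depth = schema_depth(invoice_schema)
print(depth, depth <= MAX_SCHEMA_DEPTH)
```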

Billing Errors (BILLING_XXX)

Code | HTTP | Description | Solution
BILLING_001 | 403 | Trial expired | Upgrade to a paid plan
BILLING_002 | 403 | Page limit exceeded | Upgrade plan or wait for monthly reset
BILLING_003 | 402 | Payment required | Add payment method in Console
BILLING_004 | 403 | Plan limit reached | Upgrade to a higher tier
BILLING_005 | 402 | Billing status unknown | Contact support
BILLING_006 | 403 | Account suspended | Resolve billing issues in Console
BILLING_007 | 402 | Payment failed | Update payment method
BILLING_008 | 402 | Payment requires authentication | Complete 3D Secure verification
BILLING_009 | 403 | Subscription canceled | Resubscribe to continue
BILLING_010 | 402 | Subscription past due | Update payment method
BILLING_011 | 403 | No active subscription | Subscribe to a plan
BILLING_012 | 402 | Free tier limit exceeded | Upgrade required to continue
BILLING_013 | 402 | Storage settings restricted on free tier | Upgrade to use storage features

Processing Errors (PROC_XXX)

Code | HTTP | Description | Solution
PROC_001 | 500 | Document processing failed | Retry or contact support if persistent
PROC_002 | 408 | Processing timeout | Use async endpoint or process fewer pages
PROC_003 | 500 | Service temporarily unavailable | Retry after a few minutes
PROC_004 | 503 | Rate limit exceeded | Implement exponential backoff
PROC_005 | 500 | Partial extraction failure | Some elements couldn’t be processed

Job Errors (JOB_XXX)

Code | HTTP | Description | Solution
JOB_001 | 404 | Job not found | Verify job ID; jobs expire after 48 hours
JOB_002 | 409 | Job already cancelled | Job cannot be modified
JOB_003 | 410 | Job expired | Resubmit the extraction request
JOB_004 | 409 | Job still processing | Wait and poll again
JOB_005 | 500 | Job failed | Check error details; retry if transient

Storage Errors (STORAGE_XXX)

Code | HTTP | Description | Solution
STORAGE_001 | 500 | Failed to store results | Retry the extraction
STORAGE_002 | 404 | Results not found | Results may have expired
STORAGE_003 | 500 | Storage service unavailable | Retry after a few minutes
STORAGE_004 | 507 | Storage limit exceeded | Delete old extractions or upgrade
STORAGE_005 | 404 | Extraction expired | Reprocess the document

General Errors (GENERAL_XXX)

Code | HTTP | Description | Solution
GENERAL_001 | 500 | Internal server error | Retry; contact support if persistent
GENERAL_002 | 503 | Service unavailable | Check status page; retry later
GENERAL_003 | 504 | Gateway timeout | Use async endpoint for large documents
GENERAL_004 | 429 | Too many requests | Implement rate limiting with backoff
GENERAL_005 | 501 | Feature not implemented | Feature not yet available

Handling Errors in Code

Basic Error Handling

from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

try:
    response = client.extract(
        file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf"
    )
    print(f"Success! Job ID: {response.job_id}")
    
except ApiError as e:
    error_code = e.body.get("error", {}).get("code", "UNKNOWN")
    error_message = e.body.get("error", {}).get("message", "Unknown error")
    
    if error_code == "AUTH_002":
        print("Invalid API key. Check your credentials.")
    elif error_code == "FILE_001":
        print("Unsupported file type.")
    elif error_code == "FILE_002":
        print("File too large. Maximum is 100MB.")
    elif error_code.startswith("BILLING_"):
        print(f"Billing issue: {error_message}")
    elif error_code.startswith("PROC_"):
        print(f"Processing error (retry may help): {error_message}")
    else:
        print(f"API Error {error_code}: {error_message}")
        
except Exception as e:
    print(f"Unexpected error: {e}")

Comprehensive Error Handler

class PulseAPIError(Exception):
    """Custom exception for Pulse API errors."""
    
    def __init__(self, code, message, details=None):
        self.code = code
        self.message = message
        self.details = details or {}
        super().__init__(f"{code}: {message}")

class ErrorHandler:
    """Centralized error handling for Pulse API."""
    
    # Retryable error codes
    RETRYABLE_CODES = {
        "FILE_005",     # Download failed
        "FILE_006",     # Download timeout
        "PROC_001",     # Processing failed
        "PROC_002",     # Processing timeout
        "PROC_003",     # Service unavailable
        "PROC_004",     # Rate limit
        "JOB_003",      # Job expired
        "STORAGE_003",  # Storage unavailable
        "GENERAL_001",  # Server error
        "GENERAL_002",  # Service unavailable
        "GENERAL_003",  # Gateway timeout
        "GENERAL_004",  # Rate limit
    }
    
    @staticmethod
    def handle_response(response):
        """Process API response and raise appropriate errors."""
        
        if response.status_code == 200:
            return response.json()
        
        # Parse error response
        try:
            error_data = response.json().get("error", {})
            code = error_data.get("code", str(response.status_code))
            message = error_data.get("message", "Unknown error")
            details = error_data.get("details", {})
        except ValueError:  # response body was not valid JSON
            code = str(response.status_code)
            message = response.text or "Unknown error"
            details = {}
        
        # Determine if retryable
        is_retryable = code in ErrorHandler.RETRYABLE_CODES
        
        # Create appropriate exception
        error = PulseAPIError(code, message, details)
        error.is_retryable = is_retryable
        
        raise error

Retry Strategies

Exponential Backoff

import time
import random
from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

RETRYABLE_CODES = {"PROC_001", "PROC_002", "PROC_003", "PROC_004", 
                   "GENERAL_001", "GENERAL_002", "GENERAL_004"}

def extract_with_retry(file_url, max_retries=3, base_delay=1):
    """Extract with exponential backoff retry."""
    
    for attempt in range(max_retries):
        try:
            return client.extract(file_url=file_url)
        except ApiError as e:
            error_code = e.body.get("error", {}).get("code", "")
            
            if error_code not in RETRYABLE_CODES or attempt == max_retries - 1:
                raise
            
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Retry {attempt + 1}/{max_retries} after {delay:.1f}s")
            time.sleep(delay)
    
    raise Exception("Max retries exceeded")

# Usage
result = extract_with_retry("https://www.impact-bank.com/user/file/dummy_statement.pdf")

Circuit Breaker Pattern

from datetime import datetime, timedelta

class CircuitBreaker:
    """Prevent cascading failures with circuit breaker."""
    
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "closed"  # closed, open, half-open
    
    def call(self, func):
        """Execute function with circuit breaker protection."""
        
        if self.state == "open":
            if datetime.now() - self.last_failure_time > timedelta(seconds=self.recovery_timeout):
                self.state = "half-open"
                self.failure_count = 0
            else:
                raise Exception("Circuit breaker is open")
        
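        try:
            result = func()
        except Exception:
            self.failure_count += 1
            self.last_failure_time = datetime.now()
            
            # A failed trial call in half-open state reopens the circuit immediately.
            if self.state == "half-open" or self.failure_count >= self.failure_threshold:
                self.state = "open"
                print(f"Circuit breaker opened after {self.failure_count} failures")
            
            raise
        
        # Success: close the circuit and reset the consecutive-failure count.
        self.state = "closed"
        self.failure_count = 0
        return result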

# Usage
breaker = CircuitBreaker()
try:
    result = breaker.call(lambda: client.extract(file_path="document.pdf"))
except Exception as e:
    print(f"Failed: {e}")

Intelligent Retry Logic

import time

# Uses PulseAPIError from the Comprehensive Error Handler section.
class SmartRetry:
    """Intelligent retry with different strategies per error type."""
    
    def __init__(self):
        self.strategies = {
            "FILE_005": self.retry_with_backoff,       # Download failed
            "FILE_006": self.retry_with_backoff,       # Download timeout
            "PROC_002": self.retry_with_smaller_chunk, # Processing timeout
            "PROC_004": self.handle_rate_limit,        # Rate limit exceeded
            "JOB_003": self.retry_with_smaller_chunk,  # Job expired
            "GENERAL_004": self.handle_rate_limit,     # Too many requests
            "GENERAL_002": self.retry_with_backoff,    # Service unavailable
        }
    
    def execute(self, func, context=None):
        """Execute with smart retry logic."""
        
        max_attempts = 3
        
        for attempt in range(max_attempts):
            try:
                return func()
            except PulseAPIError as e:
                if attempt == max_attempts - 1:
                    raise
                
                strategy = self.strategies.get(e.code, self.retry_with_backoff)
                strategy(e, attempt, context)
    
    def retry_with_backoff(self, error, attempt, context):
        """Standard exponential backoff."""
        delay = 2 ** attempt
        print(f"Retrying after {delay}s due to {error.code}")
        time.sleep(delay)
    
    def handle_rate_limit(self, error, attempt, context):
        """Handle rate limiting with longer delay."""
        print("Rate limited. Waiting 60 seconds...")
        time.sleep(60)
    
    def retry_with_smaller_chunk(self, error, attempt, context):
        """Retry with smaller page range for timeouts."""
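        if context and 'pages' in context:
            current_pages = context['pages']
            # Placeholder: narrow context['pages'] here. The retried function
            # must read its page range from context for this to take effect.
            print(f"Retrying with a smaller page range than {current_pages}")
        time.sleep(5)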
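
One possible way to fill in the page-splitting step is to halve an N-M range and retry each half separately. The helper below is a hypothetical sketch, not part of the Pulse SDK; it only handles the single-range form and returns any other input unchanged.

```python
from typing import List


def split_page_range(pages: str) -> List[str]:
    """Split an 'N-M' range into two halves; return it unchanged if it cannot be split."""
    start_text, separator, end_text = pages.partition("-")
    if not separator:
        return [pages]  # a single page or comma list: nothing to halve
    start, end = int(start_text), int(end_text)
    if start >= end:
        return [pages]
    middle = (start + end) // 2
    return [f"{start}-{middle}", f"{middle + 1}-{end}"]


print(split_page_range("1-10"))
```

Halving keeps each retry at roughly half the original workload, which is a reasonable first response to a processing timeout; repeated halving bottoms out at single-page ranges.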

Error Recovery Patterns

Graceful Degradation

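# Assumes `client` from Basic Error Handling, plus two JSON Schemas that you
# define yourself: complex_schema and simple_schema.
def extract_with_fallback(file_path, preferred_mode="full"):
    """Extract with graceful degradation."""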
    
    strategies = [
        # Try full extraction with schema
        lambda: client.extract(
            file_path=file_path,
            schema=complex_schema,
            extract_figure=True
        ),
        # Fallback to simple extraction
        lambda: client.extract(
            file_path=file_path,
            schema=simple_schema
        ),
        # Last resort: text only
        lambda: client.extract(
            file_path=file_path
        )
    ]
    
    for i, strategy in enumerate(strategies):
        try:
            print(f"Attempting strategy {i + 1}/{len(strategies)}")
            return strategy()
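        except ApiError as e:  # SDK exception: from pulse.core.api_error import ApiError
            if i == len(strategies) - 1:
                raise
            code = e.body.get("error", {}).get("code", "UNKNOWN")
            print(f"Strategy {i + 1} failed: {code}, trying next...")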

Partial Success Handling

def process_large_document_with_recovery(file_path, total_pages=100):
    """Process document in chunks with partial success."""
    
    chunk_size = 10
    results = []
    failed_chunks = []
    
    for start in range(0, total_pages, chunk_size):
        end = min(start + chunk_size - 1, total_pages - 1)
        page_range = f"{start + 1}-{end + 1}"
        
        try:
            result = client.extract(
                file_path=file_path,
                pages=page_range
            )
            results.append({
                "pages": page_range,
                "content": result
            })
        except ApiError as e:  # SDK exception: from pulse.core.api_error import ApiError
            print(f"Failed to process pages {page_range}: {e}")
            failed_chunks.append(page_range)
    
    # Retry failed chunks with different strategy
    for chunk in failed_chunks:
        try:
            # Try with smaller chunks or different parameters
            result = client.extract(
                file_path=file_path,
                pages=chunk
            )
            results.append({
                "pages": chunk,
                "content": result,
                "recovered": True
            })
        except Exception:  # avoid a bare except, which also swallows KeyboardInterrupt
            print(f"Permanently failed: {chunk}")
    
    return results

Best Practices

  • Never assume API calls will succeed
  • Catch and handle specific error codes
  • Provide meaningful error messages to users
  • Log errors for debugging
  • Use exponential backoff for transient errors
  • Set reasonable retry limits
  • Only retry retryable errors
  • Add jitter to prevent thundering herd
  • Track error frequencies
  • Alert on error spikes
  • Analyze patterns for optimization
  • Review logs regularly
  • Have fallback strategies
  • Accept partial success
  • Inform users of degraded functionality
  • Queue for later retry if appropriate
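
Tracking error frequencies and alerting on spikes can start as a small in-process counter before moving to a full monitoring stack. The sketch below is one way to do it; ErrorMonitor is a hypothetical name, and the window and threshold values are arbitrary placeholders.

```python
import time
from collections import Counter, deque


class ErrorMonitor:
    """Count errors per code and flag spikes within a sliding time window."""

    def __init__(self, window_seconds=300, spike_threshold=10):
        self.window_seconds = window_seconds
        self.spike_threshold = spike_threshold
        self.totals = Counter()   # lifetime count per error code
        self._recent = deque()    # (timestamp, code) pairs inside the window

    def _trim(self, now):
        # Drop entries that have aged out of the window.
        while self._recent and now - self._recent[0][0] > self.window_seconds:
            self._recent.popleft()

    def record(self, code, now=None):
        """Record one error occurrence; `now` can be injected for testing."""
        now = time.monotonic() if now is None else now
        self.totals[code] += 1
        self._recent.append((now, code))
        self._trim(now)

    def is_spiking(self, now=None):
        """Return True if the window holds at least spike_threshold errors."""
        now = time.monotonic() if now is None else now
        self._trim(now)
        return len(self._recent) >= self.spike_threshold


monitor = ErrorMonitor(window_seconds=60, spike_threshold=3)
for _ in range(3):
    monitor.record("PROC_004")
print(monitor.totals.most_common(1), monitor.is_spiking())
```

Calling record from the except branch of each API call keeps the bookkeeping in one place, and is_spiking can gate an alert or a switch to a fallback strategy.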

Common Error Scenarios

Scenario 1: File Upload Issues

import os

def upload_with_validation(file_path):
    """Upload file with pre-validation."""
    
    # Check file extension
    valid_extensions = ['.pdf', '.jpg', '.jpeg', '.png', '.docx', '.pptx', '.xlsx', '.html']
    file_ext = os.path.splitext(file_path)[1].lower()
    
    if file_ext not in valid_extensions:
        raise ValueError(f"Unsupported file type: {file_ext}")
    
    # Attempt upload with retry. exponential_backoff_retry is not defined in
    # this guide; supply your own helper that calls the function and retries
    # transient failures (see Exponential Backoff above).
    return exponential_backoff_retry(
        lambda: client.upload_file(file_path)
    )

Scenario 2: Async Job Management

import time

import requests

def manage_async_job(job_id):
    """Robustly manage async job lifecycle."""
    
    max_poll_time = 600  # 10 minutes
    poll_interval = 5
    start_time = time.time()
    
    while time.time() - start_time < max_poll_time:
        try:
            status = client.get_job_status(job_id)
            
            if status['status'] == 'completed':
                return status['result']
            elif status['status'] == 'failed':
                raise PulseAPIError(
                    "JOB_005",  # JOB_005 = job failed (JOB_004 means still processing)
                    f"Job failed: {status.get('error', 'Unknown error')}"
                )
            elif status['status'] == 'cancelled':
                raise PulseAPIError("JOB_002", "Job was cancelled")
            
            time.sleep(poll_interval)
            
        except PulseAPIError as e:
            if e.code == "JOB_001":
                # Job not found - might be eventual consistency issue
                time.sleep(10)
                continue
            raise
        except requests.exceptions.RequestException:
            # Network error - retry
            time.sleep(poll_interval)
            continue
    
    # Timeout - attempt to cancel
    try:
        client.cancel_job(job_id)
    except Exception:
        pass  # best-effort cancel; the timeout below is still raised
    
    raise TimeoutError(f"Job {job_id} timed out after {max_poll_time}s")

Next Steps

  • API Reference: See endpoint details