Overview
Proper error handling is crucial for building reliable document processing applications. This guide covers the Pulse API error codes, retry strategies, and best practices for graceful error recovery.
All errors follow a consistent JSON structure:
{
  "error": {
    "code": "FILE_001",
    "message": "Invalid file type",
    "details": {
      "supported_types": ["PDF", "JPG", "PNG", "DOCX", "PPTX", "XLSX", "HTML"],
      "received_type": "DOC"
    }
  }
}
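Because every error uses this envelope, a small parsing helper keeps handling code defensive. A minimal sketch (field names are taken from the example above; the fallback defaults are assumptions):

```python
def parse_error(payload):
    """Extract code, message, and details from a Pulse error envelope,
    falling back to safe defaults when fields are missing."""
    err = payload.get("error") or {}
    return (
        err.get("code", "UNKNOWN"),
        err.get("message", "Unknown error"),
        err.get("details") or {},
    )

code, message, details = parse_error({
    "error": {
        "code": "FILE_001",
        "message": "Invalid file type",
        "details": {"received_type": "DOC"},
    }
})
```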
Error Categories
| Category | Prefix | Covers |
|----------|--------|--------|
| Authentication | AUTH_XXX | API key issues |
| Request | REQ_XXX | Invalid parameters |
| File | FILE_XXX | File format/size issues |
| Schema | SCHEMA_XXX | Schema validation |
| Billing | BILLING_XXX | Usage limit issues |
| Processing | PROC_XXX | Processing failures |
| Job | JOB_XXX | Async job issues |
| Storage | STORAGE_XXX | Storage issues |
| General | GENERAL_XXX | Server errors |
Complete Error Code Reference
Authentication Errors (AUTH_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| AUTH_001 | 401 | API key is required | Include `x-api-key` header in request |
| AUTH_002 | 401 | Invalid API key | Verify key in Platform |
| AUTH_003 | 401 | API key expired | Generate new key in Platform |
| AUTH_004 | 403 | Insufficient permissions | Check API key permissions |
| AUTH_005 | 401 | Organization not found | Verify your organization exists |
| AUTH_006 | 403 | Access restricted to whitelisted domains | Contact support for domain whitelisting |
Request Errors (REQ_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| REQ_001 | 400 | No file or URL provided | Include either `file` or `file_url` parameter |
| REQ_002 | 400 | Invalid request body | Validate JSON syntax and structure |
| REQ_003 | 400 | Invalid parameter value | Check parameter format and allowed values |
| REQ_004 | 400 | Missing required parameter | Check API documentation for requirements |
| REQ_005 | 400 | Invalid chunk size | Use chunk size between 100 and 10000 |
| REQ_006 | 400 | Invalid page range | Use format like `1-5` or `1,3,5` |
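REQ_005 and REQ_006 can be caught before a request ever leaves the client. A hedged sketch enforcing the limits listed above (the page-range regex is an assumption based on the `1-5` / `1,3,5` formats):

```python
import re

def validate_request(chunk_size=None, pages=None):
    """Raise ValueError for values that would trigger REQ_005 or REQ_006."""
    if chunk_size is not None and not 100 <= chunk_size <= 10000:
        raise ValueError("chunk_size must be between 100 and 10000 (REQ_005)")
    if pages is not None and not re.fullmatch(r"\d+(-\d+)?(,\d+(-\d+)?)*", pages):
        raise ValueError("pages must look like '1-5' or '1,3,5' (REQ_006)")

validate_request(chunk_size=500, pages="1,3,5")  # passes silently
```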
File Errors (FILE_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| FILE_001 | 400 | Invalid file type | Use supported formats: PDF, JPG/JPEG, PNG, DOCX, PPTX, XLSX, HTML |
| FILE_002 | 413 | File too large | Maximum file size is 100MB |
| FILE_003 | 400 | File corrupted | Verify file integrity, re-save if needed |
| FILE_004 | 400 | Empty file | Ensure file has content |
| FILE_005 | 400 | Failed to download from URL | Check URL accessibility and permissions |
| FILE_006 | 408 | Timeout downloading file | Use a faster hosting service or upload directly |
| FILE_007 | 400 | Invalid file URL | Provide a valid, accessible URL |
Schema Errors (SCHEMA_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| SCHEMA_001 | 400 | Invalid schema format | Ensure schema conforms to JSON Schema spec |
| SCHEMA_002 | 400 | Schema processing failed | Simplify schema or check document compatibility |
| SCHEMA_003 | 400 | Schema too complex | Reduce nesting depth (max 5 levels) |
| SCHEMA_004 | 400 | Unsupported schema type | Use supported data types only |
| SCHEMA_005 | 400 | Schema validation timeout | Simplify schema or reduce document size |
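SCHEMA_003 caps nesting at 5 levels, which is cheap to check locally before submitting. A sketch; exactly how the API counts depth is an assumption, so treat this as an approximation:

```python
def schema_depth(node):
    """Maximum nesting depth of a JSON-like structure (dicts and lists)."""
    if isinstance(node, dict):
        return 1 + max((schema_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((schema_depth(v) for v in node), default=0)
    return 0

def check_schema_depth(schema, max_depth=5):
    """Raise ValueError for schemas likely to trigger SCHEMA_003."""
    depth = schema_depth(schema)
    if depth > max_depth:
        raise ValueError(f"Schema nests {depth} levels; SCHEMA_003 allows {max_depth}")

check_schema_depth({"invoice": {"items": [{"name": "string"}]}})  # ok: depth 4
```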
Billing Errors (BILLING_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| BILLING_001 | 403 | Trial expired | Upgrade to a paid plan |
| BILLING_002 | 403 | Page limit exceeded | Upgrade plan or wait for monthly reset |
| BILLING_003 | 402 | Payment required | Add payment method in Console |
| BILLING_004 | 403 | Plan limit reached | Upgrade to a higher tier |
| BILLING_005 | 402 | Billing status unknown | Contact support |
| BILLING_006 | 403 | Account suspended | Resolve billing issues in Console |
| BILLING_007 | 402 | Payment failed | Update payment method |
| BILLING_008 | 402 | Payment requires authentication | Complete 3D Secure verification |
| BILLING_009 | 403 | Subscription canceled | Resubscribe to continue |
| BILLING_010 | 402 | Subscription past due | Update payment method |
| BILLING_011 | 403 | No active subscription | Subscribe to a plan |
| BILLING_012 | 402 | Free tier limit exceeded | Upgrade required to continue |
| BILLING_013 | 402 | Storage settings restricted on free tier | Upgrade to use storage features |
Processing Errors (PROC_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| PROC_001 | 500 | Document processing failed | Retry or contact support if persistent |
| PROC_002 | 408 | Processing timeout | Use async endpoint or process fewer pages |
| PROC_003 | 500 | Service temporarily unavailable | Retry after a few minutes |
| PROC_004 | 503 | Rate limit exceeded | Implement exponential backoff |
| PROC_005 | 500 | Partial extraction failure | Some elements couldn't be processed |
Job Errors (JOB_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| JOB_001 | 404 | Job not found | Verify job ID; jobs expire after 48 hours |
| JOB_002 | 409 | Job already cancelled | Job cannot be modified |
| JOB_003 | 410 | Job expired | Resubmit the extraction request |
| JOB_004 | 409 | Job still processing | Wait and poll again |
| JOB_005 | 500 | Job failed | Check error details; retry if transient |
Storage Errors (STORAGE_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| STORAGE_001 | 500 | Failed to store results | Retry the extraction |
| STORAGE_002 | 404 | Results not found | Results may have expired |
| STORAGE_003 | 500 | Storage service unavailable | Retry after a few minutes |
| STORAGE_004 | 507 | Storage limit exceeded | Delete old extractions or upgrade |
| STORAGE_005 | 404 | Extraction expired | Reprocess the document |
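STORAGE_002 and STORAGE_005 both mean stored results are gone and the document must be reprocessed. One way to fold that into a fetch path (`fetch_results` and `reprocess` are hypothetical caller-supplied callables, not part of the SDK):

```python
EXPIRED_CODES = {"STORAGE_002", "STORAGE_005"}

def get_or_reprocess(extraction_id, fetch_results, reprocess):
    """Try stored results first; on an expiry error, fall back to reprocessing.
    Assumes errors expose the Pulse code on a `.code` attribute."""
    try:
        return fetch_results(extraction_id)
    except Exception as e:
        if getattr(e, "code", None) in EXPIRED_CODES:
            return reprocess(extraction_id)
        raise
```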
General Errors (GENERAL_XXX)
| Code | HTTP | Description | Solution |
|------|------|-------------|----------|
| GENERAL_001 | 500 | Internal server error | Retry; contact support if persistent |
| GENERAL_002 | 503 | Service unavailable | Check status page; retry later |
| GENERAL_003 | 504 | Gateway timeout | Use async endpoint for large documents |
| GENERAL_004 | 429 | Too many requests | Implement rate limiting with backoff |
| GENERAL_005 | 501 | Feature not implemented | Feature not yet available |
Handling Errors in Code
Basic Error Handling
from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

try:
    response = client.extract(
        file_url="https://www.impact-bank.com/user/file/dummy_statement.pdf"
    )
    print(f"Success! Job ID: {response.job_id}")
except ApiError as e:
    error_code = e.body.get("error", {}).get("code", "UNKNOWN")
    error_message = e.body.get("error", {}).get("message", "Unknown error")

    if error_code == "AUTH_002":
        print("Invalid API key. Check your credentials.")
    elif error_code == "FILE_001":
        print("Unsupported file type.")
    elif error_code == "FILE_002":
        print("File too large. Maximum is 100MB.")
    elif error_code.startswith("BILLING_"):
        print(f"Billing issue: {error_message}")
    elif error_code.startswith("PROC_"):
        print(f"Processing error (retry may help): {error_message}")
    else:
        print(f"API Error {error_code}: {error_message}")
except Exception as e:
    print(f"Unexpected error: {e}")
Comprehensive Error Handler
class PulseAPIError(Exception):
    """Custom exception for Pulse API errors."""
    def __init__(self, code, message, details=None):
        self.code = code
        self.message = message
        self.details = details or {}
        super().__init__(f"{code}: {message}")


class ErrorHandler:
    """Centralized error handling for Pulse API."""

    # Error codes that are safe to retry
    RETRYABLE_CODES = {
        "FILE_005",    # Download failed
        "FILE_006",    # Download timeout
        "PROC_001",    # Processing failed
        "PROC_002",    # Processing timeout
        "PROC_003",    # Service unavailable
        "PROC_004",    # Rate limit
        "JOB_003",     # Job expired
        "STORAGE_003", # Storage unavailable
        "GENERAL_001", # Server error
        "GENERAL_002", # Service unavailable
        "GENERAL_003", # Gateway timeout
        "GENERAL_004", # Rate limit
    }

    @staticmethod
    def handle_response(response):
        """Process an API response and raise an appropriate error."""
        if response.status_code == 200:
            return response.json()

        # Parse the error response; fall back to raw text if it isn't JSON
        try:
            error_data = response.json().get("error", {})
            code = error_data.get("code", str(response.status_code))
            message = error_data.get("message", "Unknown error")
            details = error_data.get("details", {})
        except ValueError:
            code = str(response.status_code)
            message = response.text or "Unknown error"
            details = {}

        # Flag whether the caller should retry, then raise
        error = PulseAPIError(code, message, details)
        error.is_retryable = code in ErrorHandler.RETRYABLE_CODES
        raise error
Retry Strategies
Exponential Backoff
import time
import random
from pulse import Pulse
from pulse.core.api_error import ApiError

client = Pulse(api_key="YOUR_API_KEY")

RETRYABLE_CODES = {"PROC_001", "PROC_002", "PROC_003", "PROC_004",
                   "GENERAL_001", "GENERAL_002", "GENERAL_004"}

def extract_with_retry(file_url, max_retries=3, base_delay=1):
    """Extract with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return client.extract(file_url=file_url)
        except ApiError as e:
            error_code = e.body.get("error", {}).get("code", "")
            if error_code not in RETRYABLE_CODES or attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Retry {attempt + 1}/{max_retries} after {delay:.1f}s")
            time.sleep(delay)
    raise Exception("Max retries exceeded")

# Usage
result = extract_with_retry("https://www.impact-bank.com/user/file/dummy_statement.pdf")
Circuit Breaker Pattern
from datetime import datetime, timedelta

class CircuitBreaker:
    """Prevent cascading failures with a circuit breaker."""

    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = "closed"  # closed, open, half-open

    def call(self, func):
        """Execute a function with circuit breaker protection."""
        if self.state == "open":
            if datetime.now() - self.last_failure_time > timedelta(seconds=self.recovery_timeout):
                self.state = "half-open"
                self.failure_count = 0
            else:
                raise Exception("Circuit breaker is open")
        try:
            result = func()
            if self.state == "half-open":
                self.state = "closed"
            return result
        except Exception:
            self.failure_count += 1
            self.last_failure_time = datetime.now()
            # A failure in half-open state reopens the circuit immediately
            if self.state == "half-open" or self.failure_count >= self.failure_threshold:
                self.state = "open"
                print(f"Circuit breaker opened after {self.failure_count} failures")
            raise

# Usage
breaker = CircuitBreaker()
try:
    result = breaker.call(lambda: client.extract(file_url="document.pdf"))
except Exception as e:
    print(f"Failed: {e}")
Intelligent Retry Logic
import time

class SmartRetry:
    """Intelligent retry with different strategies per error type."""

    def __init__(self):
        self.strategies = {
            "FILE_005": self.retry_with_backoff,       # Download failed
            "FILE_006": self.retry_with_backoff,       # Download timeout
            "PROC_002": self.retry_with_smaller_chunk, # Processing timeout
            "PROC_004": self.handle_rate_limit,        # Rate limit exceeded
            "JOB_003": self.retry_with_smaller_chunk,  # Job expired
            "GENERAL_004": self.handle_rate_limit,     # Too many requests
            "GENERAL_002": self.retry_with_backoff,    # Service unavailable
        }

    def execute(self, func, context=None):
        """Execute with smart retry logic."""
        max_attempts = 3
        for attempt in range(max_attempts):
            try:
                return func()
            except PulseAPIError as e:
                if attempt == max_attempts - 1:
                    raise
                strategy = self.strategies.get(e.code, self.retry_with_backoff)
                strategy(e, attempt, context)

    def retry_with_backoff(self, error, attempt, context):
        """Standard exponential backoff."""
        delay = 2 ** attempt
        print(f"Retrying after {delay}s due to {error.code}")
        time.sleep(delay)

    def handle_rate_limit(self, error, attempt, context):
        """Handle rate limiting with a longer delay."""
        print("Rate limited. Waiting 60 seconds...")
        time.sleep(60)

    def retry_with_smaller_chunk(self, error, attempt, context):
        """Retry with a smaller page range after a timeout."""
        if context and 'pages' in context:
            # Split the current page range into smaller chunks here
            print("Retrying with a smaller page range")
        time.sleep(5)
Error Recovery Patterns
Graceful Degradation
def extract_with_fallback(file_path, preferred_mode="full"):
    """Extract with graceful degradation."""
    strategies = [
        # Try full extraction with a detailed schema
        lambda: client.extract(file_url=file_path, schema=complex_schema),
        # Fall back to a simpler schema
        lambda: client.extract(file_url=file_path, schema=simple_schema),
        # Last resort: plain extraction, no schema
        lambda: client.extract(file_url=file_path),
    ]
    for i, strategy in enumerate(strategies):
        try:
            print(f"Attempting strategy {i + 1}/{len(strategies)}")
            return strategy()
        except PulseAPIError as e:
            if i == len(strategies) - 1:
                raise
            print(f"Strategy {i + 1} failed: {e.code}, trying next...")
Partial Success Handling
def process_large_document_with_recovery(file_path, total_pages=100):
    """Process a document in chunks, tolerating partial failure."""
    chunk_size = 10
    results = []
    failed_chunks = []

    for start in range(0, total_pages, chunk_size):
        end = min(start + chunk_size - 1, total_pages - 1)
        page_range = f"{start + 1}-{end + 1}"
        try:
            result = client.extract(file_url=file_path, pages=page_range)
            results.append({"pages": page_range, "content": result})
        except PulseAPIError as e:
            print(f"Failed to process pages {page_range}: {e}")
            failed_chunks.append(page_range)

    # Retry failed chunks once more
    for chunk in failed_chunks:
        try:
            result = client.extract(file_url=file_path, pages=chunk)
            results.append({"pages": chunk, "content": result, "recovered": True})
        except PulseAPIError:
            print(f"Permanently failed: {chunk}")

    return results
Best Practices
Handle errors explicitly
- Never assume API calls will succeed
- Catch and handle specific error codes
- Provide meaningful error messages to users
- Log errors for debugging

Retry carefully
- Use exponential backoff for transient errors
- Set reasonable retry limits
- Only retry retryable errors
- Add jitter to prevent thundering herd

Monitor errors
- Track error frequencies
- Alert on error spikes
- Analyze patterns for optimization
- Review logs regularly

Degrade gracefully
- Have fallback strategies
- Accept partial success
- Inform users of degraded functionality
- Queue for later retry if appropriate
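The last point, queuing failed work for a later retry pass, can be as simple as an in-memory deque. A minimal sketch (a production system would persist the queue rather than hold it in memory):

```python
import time
from collections import deque

class RetryQueue:
    """Hold failed extractions for a later retry pass."""

    def __init__(self):
        self._items = deque()

    def defer(self, file_url, error_code):
        """Record a failed extraction for retry."""
        self._items.append({"file_url": file_url,
                            "error_code": error_code,
                            "queued_at": time.time()})

    def drain(self, handler):
        """Re-run each deferred item through handler; re-queue what still fails."""
        still_failing = []
        results = []
        while self._items:
            item = self._items.popleft()
            try:
                results.append(handler(item["file_url"]))
            except Exception:
                still_failing.append(item)
        self._items.extend(still_failing)
        return results
```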
Common Error Scenarios
Scenario 1: File Upload Issues
import os

def upload_with_validation(file_path):
    """Upload file with pre-validation."""
    # Check file extension before spending a request on it
    valid_extensions = ['.pdf', '.jpg', '.jpeg', '.png', '.docx', '.pptx', '.xlsx', '.html']
    file_ext = os.path.splitext(file_path)[1].lower()
    if file_ext not in valid_extensions:
        raise ValueError(f"Unsupported file type: {file_ext}")

    # Attempt extraction with retry (extract_with_retry is defined above)
    return extract_with_retry(file_path)
Scenario 2: Async Job Management
import time
import requests

def manage_async_job(job_id):
    """Robustly manage the async job lifecycle."""
    max_poll_time = 600  # 10 minutes
    poll_interval = 5
    start_time = time.time()

    while time.time() - start_time < max_poll_time:
        try:
            status = client.jobs.get_job(job_id=job_id)
            if status['status'] == 'completed':
                return status['result']
            elif status['status'] == 'failed':
                raise PulseAPIError(
                    "JOB_005",
                    f"Job failed: {status.get('error', 'Unknown error')}"
                )
            elif status['status'] == 'cancelled':
                raise PulseAPIError("JOB_002", "Job was cancelled")
            time.sleep(poll_interval)
        except PulseAPIError as e:
            if e.code == "JOB_001":
                # Job not found -- might be an eventual-consistency issue
                time.sleep(10)
                continue
            raise
        except requests.exceptions.RequestException:
            # Network error -- retry
            time.sleep(poll_interval)
            continue

    # Timed out -- attempt to cancel the job
    try:
        client.jobs.cancel_job(job_id=job_id)
    except Exception:
        pass
    raise TimeoutError(f"Job {job_id} timed out after {max_poll_time}s")
Next Steps
API Reference: see endpoint details for request and response formats.