Synchronously extract content from documents. Best for files under 50 pages. Returns markdown or HTML formatted content with optional structured data extraction.
For documents over 70 pages, results are returned via S3 URL.
"is_url": true
API key for authentication
URL of the file to process
JSON schema for structured data extraction
{
"invoice_number": "string",
"total": "float"
}
Page range to process (e.g., "1-5", "1,3,5")
"1-10"
Custom chunk size in characters
5000
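The page range strings shown above ("1-5", "1,3,5", "1-10") can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own parser: the helper name `parse_page_range` is hypothetical, and it assumes pages are 1-based and that ranges and single pages may be mixed in one string (for example "1-3,7"), which the reference does not confirm.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a page range string such as "1-5" or "1,3,5" into page numbers.

    Assumptions (not stated in the API reference): pages are 1-based,
    ranges are inclusive, and comma-separated parts may mix both forms.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page range: {spec!r}")
        if "-" in part:
            # Inclusive range, e.g. "1-5"
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            # Single page, e.g. "3"
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)
```

Malformed input such as "5-1" or "1,,3" raises `ValueError`, so a bad range is caught locally instead of costing a round trip to the API.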
Extract figures and images
Generate AI descriptions for figures
Return HTML instead of markdown
Custom prompt for schema extraction
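As a rough illustration of how the options listed above could be combined into one request body, here is a sketch that only assembles a dictionary and sends nothing. Apart from `is_url`, every field name (`file`, `schema`, `page_range`, `chunk_size`, `extract_figures`, `figure_descriptions`, `return_html`, `schema_prompt`) is a placeholder chosen for this example, since the reference text here does not give the real parameter names. How the API key is passed is also not stated, so it is left out.

```python
import json
from typing import Any, Optional


def build_extract_payload(
    file_url: str,
    schema: Optional[dict[str, str]] = None,
    page_range: Optional[str] = None,
    chunk_size: Optional[int] = None,
    extract_figures: Optional[bool] = None,
    figure_descriptions: Optional[bool] = None,
    return_html: Optional[bool] = None,
    schema_prompt: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a request body; all keys except "is_url" are hypothetical."""
    # "is_url": true appears in the reference; "file" is an assumed key name.
    payload: dict[str, Any] = {"file": file_url, "is_url": True}
    optional: dict[str, Any] = {
        "schema": schema,                          # JSON schema for structured extraction
        "page_range": page_range,                  # e.g. "1-10"
        "chunk_size": chunk_size,                  # characters, e.g. 5000
        "extract_figures": extract_figures,
        "figure_descriptions": figure_descriptions,
        "return_html": return_html,                # HTML instead of markdown
        "schema_prompt": schema_prompt,
    }
    # Send only the options the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


example = build_extract_payload(
    "https://example.com/invoice.pdf",  # placeholder URL
    schema={"invoice_number": "string", "total": "float"},
    page_range="1-10",
    chunk_size=5000,
)
body = json.dumps(example)  # JSON text ready to use as a request body
```

Leaving unset options out of the body lets the service apply its own defaults, which seems safer than guessing at them on the client.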