# S3 Storage Pipeline

> Process documents from S3 and keep extraction artifacts in your storage boundary.

## Goal

Run a document ingestion workflow where source files and extraction results live in your cloud storage path instead of being copied through one-off local scripts.

## Use This Workflow

```mermaid
flowchart LR
  A["S3 source prefix"] --> B["Pulse batch extract"]
  B --> C["S3 results prefix"]
  C --> D["Your database or review app"]
```

Use this pattern for regulated intake queues, nightly backfills, and customer-controlled storage workflows.

## Batch Extract From S3

```python
from pulse import Pulse
from pulse.types.batch_input_source import BatchInputSource
from pulse.types.batch_output_destination import BatchOutputDestination

client = Pulse(api_key="YOUR_API_KEY")

# Source and results prefixes both live in the customer-owned bucket.
job = client.batch.extract(
    input=BatchInputSource(s_3_prefix="s3://customer-docs/intake/"),
    output=BatchOutputDestination(s_3_prefix="s3://customer-docs/pulse-results/"),
    extract_options={
        "extensions": {
            "chunking": {
                "chunk_types": ["page"],
                "chunk_size": 1200
            }
        },
        "storage": {"enabled": True}
    },
    workers=8,
)

# Persist this ID so the batch can be tracked later.
print(job.batch_job_id)
```

## Single File With A Presigned URL

```python
from pulse import Pulse

client = Pulse(api_key="YOUR_API_KEY")

# Pass a presigned URL for a single object; keep its expiry short.
result = client.extract(
    file_url="https://customer-bucket.s3.amazonaws.com/path/file.pdf?X-Amz-Signature=...",
    async_=True,
    storage={"enabled": True},
)

# Keep the returned job ID alongside the source object key.
print(result.job_id)
```

## Checks

* Use short-lived presigned URLs for single-file extraction.
* Use batch S3 prefixes for high-volume queues.
* Enable Bring Your Own Storage when artifacts must stay in your cloud account.
* Persist `batch_job_id`, child job IDs, source object keys, and output prefixes.
* Make downstream writes idempotent by source object key and extraction ID.

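
The last two checks can be illustrated with a small sketch. It assumes extraction metadata is recorded in a SQLite table keyed by source object key and extraction ID; the table name, column names, helper functions, and sample IDs below are hypothetical and are not part of the Pulse SDK. Any store with a unique constraint on that pair would work the same way.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical tracking table for extraction results."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS extractions (
            source_key    TEXT NOT NULL,
            extraction_id TEXT NOT NULL,
            batch_job_id  TEXT NOT NULL,
            output_prefix TEXT NOT NULL,
            PRIMARY KEY (source_key, extraction_id)
        )
        """
    )
    return conn


def record_extraction(conn, source_key, extraction_id, batch_job_id, output_prefix):
    """Write the row once; return True if it was new, False if already recorded."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO extractions VALUES (?, ?, ?, ?)",
            (source_key, extraction_id, batch_job_id, output_prefix),
        )
    return cur.rowcount == 1


conn = open_store()
# Sample values are placeholders for illustration only.
row = ("intake/invoice-001.pdf", "ext-123", "batch-abc", "s3://customer-docs/pulse-results/")

first = record_extraction(conn, *row)   # new row is written
second = record_extraction(conn, *row)  # replayed event is ignored
print(first, second)
```

Because the primary key covers both the source object key and the extraction ID, a retried or duplicated completion event does not create a second row, so downstream consumers can safely replay events.
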
## Related

* Configure Pulse access to your bucket.
* Full batch endpoint reference.
* Receive completion events.