Bring Your Own Storage: Google Cloud Storage

Store your extraction artifacts in your own Google Cloud Storage bucket to keep full control over where your data resides and to meet compliance requirements.
Custom storage is available for Enterprise customers. Email support@trypulse.ai to enable this feature.

Overview

Pulse connects to your Google Cloud Storage bucket using a service account with a JSON key. This provides:
  • Fine-grained access control - Grant only the permissions Pulse needs
  • Full audit trail - All access is logged in Cloud Audit Logs
  • Data residency - Choose your bucket location for compliance

Setup Steps

Step 1: Create a GCS Bucket

Create a new bucket in your GCP project:
gcloud storage buckets create gs://my-company-pulse-data \
  --location=us-central1 \
  --uniform-bucket-level-access
Recommended settings:
  • Enable uniform bucket-level access
  • Choose a location that meets your compliance requirements
  • Consider using dual-region or multi-region for high availability
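To confirm that the recommended settings took effect, the bucket metadata can be inspected after creation. This is a minimal sketch: `show_bucket_settings` is a hypothetical helper name, and the field names passed to `--format` are assumed to match what recent gcloud releases print for this command. If they differ in your version, drop `--format` to see the full output.

```shell
# Hypothetical helper: print the location and access-control mode of a bucket.
# Usage: show_bucket_settings BUCKET_NAME
show_bucket_settings() {
  bucket="$1"
  # Field names are an assumption; remove --format to see every field.
  gcloud storage buckets describe "gs://${bucket}" \
    --format="yaml(location,location_type,uniform_bucket_level_access)"
}
```

For example, `show_bucket_settings my-company-pulse-data` describes the bucket created above.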

Step 2: Create a Service Account

Create a dedicated service account for Pulse:
gcloud iam service-accounts create pulse-storage-access \
  --display-name="Pulse Storage Access" \
  --description="Service account for Pulse document extraction storage"

Step 3: Grant Bucket Permissions

Grant the service account access to your bucket:
gcloud storage buckets add-iam-policy-binding gs://my-company-pulse-data \
  --member="serviceAccount:pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
Required role: roles/storage.objectAdmin. It allows Pulse to:
  • Create, read, update, and delete objects
  • List objects in the bucket
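After granting the role, it can help to check which roles the service account actually holds on the bucket. The sketch below keeps the project, bucket, and account name in shell variables so the member string is built in one place. The helper names (`pulse_member`, `show_pulse_roles`) are hypothetical, the variable values are placeholders, and the `--flatten`/`--filter` pattern is the one commonly used with `get-iam-policy` commands.

```shell
# Placeholder values; replace with your own project and bucket.
PROJECT_ID="your-project-id"
BUCKET="my-company-pulse-data"
SA_NAME="pulse-storage-access"

# Build the IAM member string for the service account.
pulse_member() {
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com\n' "$SA_NAME" "$PROJECT_ID"
}

# Print the roles bound to the service account on the bucket.
show_pulse_roles() {
  gcloud storage buckets get-iam-policy "gs://${BUCKET}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:$(pulse_member)" \
    --format="table(bindings.role)"
}
```

Running `show_pulse_roles` after Step 3 should list roles/storage.objectAdmin. An empty result suggests the binding was made on a different bucket or for a different account.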

Step 4: Create Service Account Key

Generate a JSON key for the service account:
gcloud iam service-accounts keys create pulse-storage-key.json \
  --iam-account=pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com
Store this key securely. Anyone who holds it can create, read, update, and delete objects in your bucket.
The key file looks like:
{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...",
  "client_email": "pulse-storage-access@your-project.iam.gserviceaccount.com",
  "client_id": "...",
  ...
}
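Before pasting the key into the Pulse Platform, a quick local check can catch a truncated or wrong file. The following is a rough sketch, not a full JSON validator: `check_key_file` is a hypothetical helper that only looks for the fields shown in the sample above and then restricts the file permissions to the current user.

```shell
# Hypothetical helper: basic structural checks on a service account key file.
# Usage: check_key_file PATH_TO_KEY_JSON
check_key_file() {
  key_file="$1"
  [ -r "$key_file" ] || { echo "cannot read ${key_file}" >&2; return 1; }
  # The file must be a service account key, not another credential type.
  grep -q '"type": *"service_account"' "$key_file" \
    || { echo "not a service_account key" >&2; return 1; }
  # Look for the fields shown in the sample key file.
  for field in project_id private_key_id private_key client_email client_id; do
    grep -q "\"${field}\":" "$key_file" \
      || { echo "missing field: ${field}" >&2; return 1; }
  done
  # Restrict the file to the current user.
  chmod 600 "$key_file"
}
```

For example, `check_key_file pulse-storage-key.json` returns a non-zero status and names the first missing field if the file is incomplete.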

Step 5: Configure in Pulse Platform

  1. Navigate to Settings > Storage in the Pulse Platform
  2. Select Google Cloud Storage as your storage provider
  3. Enter the following details:
    • Project ID: your-project-id
    • Bucket Name: my-company-pulse-data
    • Service Account JSON: Paste the contents of your key file
    • Region (optional): us-central1
    • Base Path (optional): extractions/ - prefix for all stored objects
  4. Click Save Configuration

Step 6: Test the Connection

Click Test Connection to verify that Pulse can access your bucket. The test will:
  1. Authenticate using the service account
  2. Verify bucket access
  3. Report success or any configuration issues
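If the test fails, the same key can be exercised locally to separate GCP-side problems from Pulse-side ones. The sketch below is an assumption about what a useful check looks like, not a description of what Pulse's Test Connection does internally. `preflight_bucket_access` is a hypothetical helper that authenticates with the key in a throwaway gcloud configuration directory, so the active account on your machine is not changed, and then lists, writes, reads, and deletes a small probe object.

```shell
# Hypothetical preflight: exercise the key the way a storage client would.
# Usage: preflight_bucket_access KEY_FILE BUCKET_NAME
# The function body runs in a subshell, so exported variables do not leak out.
preflight_bucket_access() (
  key_file="$1"
  bucket="$2"
  # Isolated gcloud config so the active account on this machine is untouched.
  CLOUDSDK_CONFIG="$(mktemp -d)" || exit 1
  export CLOUDSDK_CONFIG
  trap 'rm -rf "$CLOUDSDK_CONFIG"' EXIT
  probe="gs://${bucket}/pulse-preflight-$$.txt"

  gcloud auth activate-service-account --key-file="$key_file" || exit 1
  gcloud storage ls "gs://${bucket}" >/dev/null || exit 1     # list objects
  echo "preflight" | gcloud storage cp - "$probe" || exit 1   # create an object
  gcloud storage cat "$probe" >/dev/null || exit 1            # read it back
  gcloud storage rm "$probe" || exit 1                        # delete it
  echo "preflight OK"
)
```

For example: `preflight_bucket_access pulse-storage-key.json my-company-pulse-data`. If the bucket has a retention policy, the final delete is expected to be rejected and the probe object will remain until the retention period ends.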

Step 7: Enable Custom Storage

Once the connection test passes, toggle Enabled to start using your custom storage.

Storage Structure

Pulse organizes artifacts in your bucket using this structure:
{base_path}/orgs/{org_id}/extractions/{job_id}/artifacts/
├── result.json          # Extraction results
├── original_file.pdf    # Original uploaded document
└── ...                  # Other artifacts
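To locate the artifacts for a particular job, the prefix can be assembled from the layout above. A minimal sketch: `artifact_prefix` is a hypothetical helper, the org and job IDs in the example are made up, and dropping a trailing slash from the base path is an assumption about how a value such as `extractions/` combines with the rest of the path.

```shell
# Hypothetical helper: build the gs:// prefix for one job's artifacts.
# Usage: artifact_prefix BUCKET BASE_PATH ORG_ID JOB_ID
artifact_prefix() {
  bucket="$1"
  base="${2%/}"   # drop one trailing slash: "extractions/" -> "extractions"
  org_id="$3"
  job_id="$4"
  # An empty base path contributes no leading segment.
  printf 'gs://%s/%sorgs/%s/extractions/%s/artifacts/\n' \
    "$bucket" "${base:+${base}/}" "$org_id" "$job_id"
}
```

The result can be passed straight to a listing command, for example `gcloud storage ls "$(artifact_prefix my-company-pulse-data extractions/ org_123 job_456)"`.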

Security Best Practices

Create a separate bucket specifically for Pulse extractions rather than using an existing bucket with other data.
Enable object versioning to protect against accidental deletion:
gcloud storage buckets update gs://my-company-pulse-data --versioning
Configure retention policies for compliance:
gcloud storage buckets update gs://my-company-pulse-data \
  --retention-period=30d
Ensure Cloud Audit Logs are enabled for your project to track all data access.
For enhanced security, configure VPC Service Controls to restrict data exfiltration.
Regularly rotate service account keys. GCP recommends rotation every 90 days.
# Create new key
gcloud iam service-accounts keys create new-key.json \
  --iam-account=pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com

# Update in Pulse Platform, then delete old key
gcloud iam service-accounts keys delete OLD_KEY_ID \
  --iam-account=pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com
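The delete command needs the ID of the old key. One way to find it is to list the user-managed keys on the service account; the listing shows each key's ID together with its creation and expiry times. `list_user_keys` is a hypothetical wrapper name used here for illustration.

```shell
# Hypothetical helper: list user-managed keys for a service account.
# Usage: list_user_keys SERVICE_ACCOUNT_EMAIL
list_user_keys() {
  sa_email="$1"
  # --managed-by=user hides the system-managed keys that Google rotates itself.
  gcloud iam service-accounts keys list \
    --iam-account="$sa_email" \
    --managed-by=user
}
```

For example, `list_user_keys pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com`. The key with the older creation time is the one to delete once the new key is saved in the Pulse Platform and the connection test passes.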

Troubleshooting

Permission or access errors:
  • Verify the service account has roles/storage.objectAdmin on the bucket
  • Check that the service account email matches the key file
  • Ensure uniform bucket-level access is enabled
Invalid service account JSON:
  • Ensure you’ve copied the entire JSON key file content
  • Check for any trailing whitespace or formatting issues
  • Verify the key file is valid JSON
Bucket not found:
  • Verify the bucket name is spelled correctly
  • Ensure the bucket exists in the specified project
  • Check that the project ID matches
Authentication failures:
  • The service account key may have been disabled or deleted
  • Generate a new key and update the configuration
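A mismatch between the key file and the account that holds the bucket role is easy to check from the command line. The sketch below uses two hypothetical helpers: `key_client_email` extracts the `client_email` field with `sed`, which is enough for the key format shown in Step 4 but is not a general JSON parser, and `key_matches_account` compares it with the expected address.

```shell
# Hypothetical helper: print the client_email recorded in a key file.
# Usage: key_client_email PATH_TO_KEY_JSON
key_client_email() {
  sed -n 's/.*"client_email": *"\([^"]*\)".*/\1/p' "$1"
}

# Succeeds only if the key belongs to the expected service account.
# Usage: key_matches_account PATH_TO_KEY_JSON EXPECTED_EMAIL
key_matches_account() {
  [ "$(key_client_email "$1")" = "$2" ]
}
```

For example, `key_matches_account pulse-storage-key.json pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com || echo "key belongs to a different account"`.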

Reverting to Pulse Default Storage

To stop using custom storage and revert to Pulse’s managed storage:
  1. Toggle Enabled to off
  2. Click Reset to Default if you want to remove the configuration entirely
Reverting does not migrate existing artifacts. Data in your GCS bucket remains there.

Cleanup

To revoke Pulse access after reverting:
# Remove bucket IAM binding
gcloud storage buckets remove-iam-policy-binding gs://my-company-pulse-data \
  --member="serviceAccount:pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"

# Delete service account (optional)
gcloud iam service-accounts delete pulse-storage-access@YOUR_PROJECT.iam.gserviceaccount.com