Event-Driven Data Ingestion: Architecting S3 to Snowflake with Snowpipe
Stop scheduling batch jobs. Learn how to build a real-time, event-driven ingestion layer using AWS S3, SQS, and Snowflake Snowpipe.
The era of the “nightly batch” is over. Modern businesses demand data as it happens. Yet many data teams are still stuck writing cron jobs to poll API endpoints or check S3 buckets.
There is a better way. By leveraging AWS S3 events, SQS, and Snowflake Snowpipe, we can build a pipeline that ingests data moments after it lands.
The Architecture
- Source: A file (JSON/Parquet/CSV) lands in an AWS S3 bucket.
- Trigger: S3 publishes an ObjectCreated event.
- Queue: An Amazon SQS queue captures this event.
- Ingest: Snowpipe polls the queue, sees the new file, and loads it into a Snowflake raw table.
This architecture is serverless, scalable (it handles one file or one million), and cheap (you pay only for the compute Snowpipe actually uses).
Setting it Up
1. The Storage Integration
First, Snowflake needs permission to read your S3 bucket. We create a STORAGE INTEGRATION.
CREATE STORAGE INTEGRATION s3_int
TYPE = EXTERNAL_STAGE
STORAGE_PROVIDER = S3
ENABLED = TRUE
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my_snowflake_role'
STORAGE_ALLOWED_LOCATIONS = ('s3://my-raw-data-bucket/');
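The integration alone is not enough: the IAM role also has to trust the IAM user Snowflake generates for it. A quick way to pull the values that trust policy needs, using the integration created above:
-- The output includes STORAGE_AWS_IAM_USER_ARN and STORAGE_AWS_EXTERNAL_ID;
-- add both to the trust relationship of my_snowflake_role in AWS IAM.
DESC INTEGRATION s3_int;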
2. The Pipe
Instead of a COPY INTO command run by a scheduler, we wrap it in a PIPE.
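The pipe below reads from a named external stage. If you don't already have one, here is a minimal sketch that ties the stage to the integration from step 1 (the stage name and bucket path simply reuse the placeholders from this post):
-- A named external stage over the raw bucket, authenticated via the storage integration.
CREATE STAGE my_db.raw.my_s3_stage
  URL = 's3://my-raw-data-bucket/'
  STORAGE_INTEGRATION = s3_int;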
CREATE PIPE my_db.raw.daily_sales_pipe
AUTO_INGEST = TRUE
AS
COPY INTO my_db.raw.sales_table
FROM @my_s3_stage
FILE_FORMAT = (TYPE = 'JSON');
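With AUTO_INGEST = TRUE, Snowflake manages the SQS queue for you; the one remaining step is pointing the bucket's ObjectCreated event notifications at that queue. Its ARN is exposed on the pipe itself:
-- The notification_channel column is the ARN of the Snowflake-managed SQS queue.
-- Use it as the destination for the bucket's s3:ObjectCreated:* event notification in AWS.
SHOW PIPES LIKE 'daily_sales_pipe' IN SCHEMA my_db.raw;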
Why this changes everything
Once this pipe is active, engineering effort shifts. You no longer debug “why the 2 AM job failed.” You assume data is always arriving.
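When you do want to confirm that files are flowing, you ask Snowflake rather than a scheduler. For example, using the pipe and table created above:
-- Current state of the pipe, including any pending files.
SELECT SYSTEM$PIPE_STATUS('my_db.raw.daily_sales_pipe');

-- Files loaded into the target table over the last hour.
SELECT *
FROM TABLE(information_schema.copy_history(
  TABLE_NAME => 'my_db.raw.sales_table',
  START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())
));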
Your focus moves downstream: transforming that raw data into insights.
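What that looks like depends on your model, but with JSON landing in a VARIANT column it can start as something as small as a view over the raw table. The column, field, and schema names below are purely illustrative:
-- Assumes sales_table stores each JSON record in a single VARIANT column (here called raw_record).
CREATE OR REPLACE VIEW my_db.analytics.daily_sales AS
SELECT
  raw_record:order_id::STRING        AS order_id,
  raw_record:amount::NUMBER(10, 2)   AS amount,
  raw_record:sold_at::TIMESTAMP_NTZ  AS sold_at
FROM my_db.raw.sales_table;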
Need help moving to real-time? Contact our engineering team to audit your current pipelines.
