Load jaffle-shop Parquet files from cloud storage into Snowflake.
API:

```python
from data import load_from_gcs, load_from_s3

results = load_from_gcs(session, schema_name="RAW")
results = load_from_s3(session, bucket="s3://bucket/path/", schema_name="RAW")
```

Files:
- `data/ingestion.py`
- `data/sql/ingestion/*.sql`
Behavior:
- GCS: Download → internal stage → COPY INTO (sketched below)
- S3: External stage → COPY INTO (sketched after the SQL Templates table)
- Schema inferred from Parquet (INFER_SCHEMA)
- Idempotent (safe to re-run)
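
A minimal sketch of the GCS flow, assuming a Snowpark `Session` plus `google-cloud-storage`; the bucket name, prefix, and inline SQL are placeholders standing in for the real `data/sql/ingestion/*.sql` templates, not the repo's actual code:

```python
# Sketch only: assumes snowflake-snowpark-python and google-cloud-storage are
# installed. The bucket name, prefix, and inline SQL are placeholders for the
# templates under data/sql/ingestion/.
import tempfile
from pathlib import Path

from google.cloud import storage
from snowflake.snowpark import Session


def load_from_gcs(session: Session, schema_name: str = "RAW") -> dict:
    """Download Parquet files from GCS, PUT them to an internal stage,
    then create tables and COPY INTO them. Safe to re-run: DDL uses
    IF NOT EXISTS and COPY INTO skips files it has already loaded."""
    results = {}
    # Idempotent setup (create_parquet_file_format.sql, create_internal_stage.sql).
    session.sql(f"CREATE SCHEMA IF NOT EXISTS {schema_name}").collect()
    session.sql(
        f"CREATE FILE FORMAT IF NOT EXISTS {schema_name}.PARQUET_FORMAT TYPE = PARQUET"
    ).collect()
    session.sql(f"CREATE STAGE IF NOT EXISTS {schema_name}.JAFFLE_STAGE").collect()

    # Hypothetical public bucket; the real bucket/prefix would come from config.
    client = storage.Client.create_anonymous_client()
    bucket = client.bucket("example-jaffle-shop")  # placeholder name

    with tempfile.TemporaryDirectory() as tmp:
        for blob in client.list_blobs(bucket, prefix="parquet/"):
            if not blob.name.endswith(".parquet"):
                continue
            local = Path(tmp) / Path(blob.name).name
            blob.download_to_filename(str(local))

            table = local.stem.upper()
            stage_path = f"@{schema_name}.JAFFLE_STAGE/{table}/"
            session.file.put(str(local), stage_path, auto_compress=False, overwrite=True)

            # create_table_from_parquet.sql: INFER_SCHEMA + USING TEMPLATE.
            session.sql(f"""
                CREATE TABLE IF NOT EXISTS {schema_name}.{table}
                USING TEMPLATE (
                    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
                    FROM TABLE(INFER_SCHEMA(
                        LOCATION => '{stage_path}',
                        FILE_FORMAT => '{schema_name}.PARQUET_FORMAT'))
                )
            """).collect()

            # copy_into_table.sql: match on column names rather than position.
            results[table] = session.sql(f"""
                COPY INTO {schema_name}.{table}
                FROM {stage_path}
                FILE_FORMAT = (FORMAT_NAME = '{schema_name}.PARQUET_FORMAT')
                MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
            """).collect()
    return results
```

Idempotency here comes from `IF NOT EXISTS` DDL plus COPY INTO's load history, which skips files that were already loaded on a previous run.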
SQL Templates:

| File | Purpose |
|---|---|
| `create_parquet_file_format.sql` | Parquet format definition |
| `create_internal_stage.sql` | Internal stage for GCS downloads |
| `create_stage_s3_public.sql` | External S3 stage |
| `create_table_from_parquet.sql` | Table creation with INFER_SCHEMA |
| `copy_into_table.sql` | COPY INTO with MATCH_BY_COLUMN_NAME |
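
And a hedged sketch of the S3 path, showing how the external-stage templates above (`create_stage_s3_public.sql`, `create_table_from_parquet.sql`, `copy_into_table.sql`) might render and execute; the stage name and table list are illustrative placeholders, not the repo's actual template text:

```python
# Sketch of the S3 path: external stage + COPY INTO, no local download.
# Stage name, table list, and inline SQL are placeholders for the templates
# under data/sql/ingestion/.
from snowflake.snowpark import Session


def load_from_s3(session: Session, bucket: str, schema_name: str = "RAW") -> dict:
    """Point an external stage at a public S3 prefix and COPY directly from it."""
    session.sql(f"CREATE SCHEMA IF NOT EXISTS {schema_name}").collect()
    session.sql(
        f"CREATE FILE FORMAT IF NOT EXISTS {schema_name}.PARQUET_FORMAT TYPE = PARQUET"
    ).collect()
    # create_stage_s3_public.sql: a public bucket needs no credentials; a private
    # bucket would need a storage integration instead.
    session.sql(f"""
        CREATE STAGE IF NOT EXISTS {schema_name}.JAFFLE_S3_STAGE
        URL = '{bucket}'
    """).collect()

    results = {}
    # Hypothetical table list; the real code could derive it by listing the stage.
    for table in ("CUSTOMERS", "ORDERS", "PRODUCTS"):
        prefix = f"@{schema_name}.JAFFLE_S3_STAGE/{table.lower()}/"
        session.sql(f"""
            CREATE TABLE IF NOT EXISTS {schema_name}.{table}
            USING TEMPLATE (
                SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
                FROM TABLE(INFER_SCHEMA(
                    LOCATION => '{prefix}',
                    FILE_FORMAT => '{schema_name}.PARQUET_FORMAT'))
            )
        """).collect()
        results[table] = session.sql(f"""
            COPY INTO {schema_name}.{table}
            FROM {prefix}
            FILE_FORMAT = (FORMAT_NAME = '{schema_name}.PARQUET_FORMAT')
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
        """).collect()
    return results
```

Unlike the GCS flow, nothing is downloaded locally: Snowflake reads the Parquet files straight from the external stage during COPY INTO.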