Unable to export big files with GDAL to S3 #715

@drzippie

What happens?

The export fails in both the CLI and the Node.js client with:

 IO Error:
 Unknown part number

To Reproduce

 INSTALL spatial;
 LOAD spatial;
 INSTALL httpfs;
 LOAD httpfs;

 DROP SECRET IF EXISTS s3_secret;
 CREATE SECRET s3_secret (
   TYPE S3,
   KEY_ID '***',
   SECRET '**',
   SESSION_TOKEN '***',
   REGION 'us-east-1'
 );

     COPY (
       SELECT * FROM read_parquet([ ... list of urls of big parquet files ...])
     )
     TO 's3://bucket-name/output/file.geojson'
     (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES', SRS 'EPSG:4326');

Changing the COPY options to either

     (FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=NO', SRS 'EPSG:4326');

or

     (FORMAT GDAL, DRIVER 'GeoJSON', SRS 'EPSG:4326');

works, so the failure appears to be tied to WRITE_BBOX=YES.
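To help narrow down whether the error comes from the S3 write path rather than from the GeoJSON driver itself, the same options could be tried against a local file. This is only a sketch and was not run as part of this report: the output path `/tmp/file.geojson` and the small synthetic point table are hypothetical stand-ins for the real data.

```sql
INSTALL spatial;
LOAD spatial;

-- Hypothetical local target; same GDAL options as the failing S3 export.
COPY (
  -- Synthetic points standing in for the real parquet input (assumption).
  SELECT i AS id, ST_Point(i % 360 - 180, i % 180 - 90) AS geom
  FROM range(1000) t(i)
)
TO '/tmp/file.geojson'
(FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES', SRS 'EPSG:4326');
```

If this succeeds locally while the S3 target fails, that would point at the interaction between WRITE_BBOX=YES and the S3 upload rather than at the option alone.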

OS:

aarch64

DuckDB Version:

v1.4.2 (Andium) 68d7555f68

DuckDB Client:

CLI and node

Hardware:

Mac Silicon Max PRO 32GB

Full Name:

Antonio Cortés

Affiliation:

Carto

Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?

  • Yes, I have

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant data sets for reproducing the issue?

No - I cannot easily share my data sets due to their large size
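Since the original data cannot be shared, a synthetic input might serve as a stand-in. The following is a sketch under assumptions: the row count of 50 million is an arbitrary guess at "big", the output key `synthetic.geojson` is hypothetical, and it has not been verified that this triggers the same error. It assumes the `s3_secret` from the reproduction steps is already configured.

```sql
INSTALL spatial;
LOAD spatial;
INSTALL httpfs;
LOAD httpfs;

-- Generate random points instead of reading the large parquet files.
-- The row count is an assumption; raise it if the export succeeds.
COPY (
  SELECT i AS id,
         ST_Point(random() * 360 - 180, random() * 180 - 90) AS geom
  FROM range(50000000) t(i)
)
TO 's3://bucket-name/output/synthetic.geojson'
(FORMAT GDAL, DRIVER 'GeoJSON', LAYER_CREATION_OPTIONS 'WRITE_BBOX=YES', SRS 'EPSG:4326');
```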
