ST_Read fails on non-UTF-8 Shapefile (macOS only?) #744

@yutannihilation

I found that the ENCODING= GDAL open option doesn't work in ST_Read on macOS. I don't remember whether it ever worked on macOS, but it does work on Windows, so maybe something is wrong with the macOS build?

FROM ST_Read('/path/to/data.shp', open_options = ["ENCODING=CP932"])

The query above doesn't fail; it silently returns garbled results like the following.

┌────────────┬────────────┬────────────┐
│ A22_000001 │ A22_000002 │ A22_000003 │
│  varchar   │  varchar   │  varchar   │
├────────────┼────────────┼────────────┤
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
│ ????INVA…  │ ????INVA…  │ ????INVA…  │
├────────────┴────────────┴────────────┤
│ 10 rows                    3 columns │
└──────────────────────────────────────┘
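For illustration, here is a minimal Python sketch of why CP932-encoded attribute bytes cannot pass as UTF-8 unless they are re-encoded first. It is not DuckDB or GDAL code, the helper names `is_valid_utf8` and `decode_attribute` are made up for this example, and it only assumes that the garbled values above come from CP932 bytes being treated as UTF-8.

```python
# Illustrative sketch only: not part of DuckDB, the spatial extension, or GDAL.
# Assumption: the attribute values in the Shapefile's .dbf are CP932-encoded.

text = "あああ"
cp932_bytes = text.encode("cp932")  # the bytes a CP932 .dbf would store


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def decode_attribute(data: bytes, encoding: str = "cp932") -> str:
    """Decode raw attribute bytes using the encoding the file was written in."""
    return data.decode(encoding)


if __name__ == "__main__":
    print(cp932_bytes)                    # raw CP932 bytes
    print(is_valid_utf8(cp932_bytes))     # the bytes are not valid UTF-8
    print(decode_attribute(cp932_bytes))  # round-trips to the original text
```

If the ENCODING open option is honored, the reader should end up with the equivalent of `decode_attribute(...)` applied to each string field; the output above looks more like the raw bytes being passed through unconverted.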

Also, COPY TO fails if the ENCODING option is specified.

COPY (
  SELECT ST_Point(1, 1), 'あああ' as text
) TO 'tmp'
WITH (FORMAT GDAL, DRIVER 'ESRI Shapefile', LAYER_CREATION_OPTIONS 'ENCODING=CP932');
IO Error:
GDAL Error (1): Failed to create field name 'text': cannot convert to CP932

Note that the COPY TO case isn't serious, as we rarely write non-UTF-8 files. However, I really wish ST_Read worked, because unfortunately there is a lot of non-UTF-8 geospatial data.
