
Unable to use AWS S3 locations with quotes in path #1545

Open
dimas-b opened this issue May 8, 2025 · 4 comments · May be fixed by #1586

@dimas-b
Contributor

dimas-b commented May 8, 2025

Describe the bug

Polaris fails to accept S3 locations that are valid in AWS, but contain " in the path.

For example: s3://example/pol"3/

To Reproduce

  1. Start Polaris
  2. Try to create catalog:
$ ./polaris --client-id *** --client-secret *** catalogs create polaris2 \
  --storage-type S3 \
  --default-base-location 's3://example-bucket-42/pol"2/' \
  --role-arn arn:aws:iam::11111111111:role/test --external-id ***

Actual Behavior

Exception when communicating with the Polaris server. IllegalArgumentException: Illegal character in path at index 26: s3://example-bucket-42/pol"2/
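
The message above has the shape that `java.net.URI` produces when it rejects a string, and a later comment in this thread traces the failure to a `URI.create` call. The following is a minimal standalone sketch, not Polaris code, that reproduces the same message outside the server; the class and method names are made up for illustration.

```java
import java.net.URI;

final class QuoteInUriDemo {
    // Returns the parse error message, or null if the location parses cleanly.
    static String parseError(String location) {
        try {
            URI.create(location);
            return null;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException
            // in an IllegalArgumentException with the same message.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The double quote is not a legal character in a URI path.
        System.out.println(parseError("s3://example-bucket-42/pol\"2/"));
        // The same location without the quote parses, so this prints "null".
        System.out.println(parseError("s3://example-bucket-42/pol2/"));
    }
}
```

The first call reports the illegal character at index 26, which is the position of the `"` in the location string.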

Expected Behavior

No exception

Additional context

The example location is considered valid in AWS UI. It is possible to create it and copy its S3 location in AWS UI.

System information

Local Polaris Server build as of commit 04c4f91

@dimas-b dimas-b added the bug Something isn't working label May 8, 2025
@dimas-b
Contributor Author

dimas-b commented May 8, 2025

@adnanhemani
Collaborator

adnanhemani commented May 14, 2025

This is odd: without an external ID, this works as expected, but it fails once the external ID is added.

./polaris --client-id *** --client-secret *** catalogs create polaris2 \
  --storage-type S3 \
  --default-base-location 's3://***/pol"2/' \
  --role-arn arn:aws:iam::***:role/***

yields the following logs:

2025-05-13 21:17:10,515 INFO  [org.apa.pol.ser.adm.PolarisServiceImpl] [,POLARIS] [,,,] (executor-thread-1) Created new catalog class PolarisCatalog {
    class Catalog {
        type: INTERNAL
        name: polaris2
        properties: class CatalogProperties {
            {default-base-location=s3://***/pol"2/}
            defaultBaseLocation: s3://***/pol"2/
        }
        createTimestamp: 1747196230343
        lastUpdateTimestamp: 1747196230343
        entityVersion: 1
        storageConfigInfo: class AwsStorageConfigInfo {
            class StorageConfigInfo {
                storageType: S3
                allowedLocations: [s3://***/pol"2/]
            }
            roleArn: arn:aws:iam::***:role/***
            externalId: null
            userArn: null
            region: null
        }
    }
}
2025-05-13 21:17:10,516 INFO  [io.qua.htt.access-log] [,POLARIS] [,,,] (executor-thread-1) 127.0.0.1 - root [13/May/2025:21:17:10 -0700] "POST /api/management/v1/catalogs HTTP/1.1" 201 -

But:

./polaris --client-id *** --client-secret *** catalogs create polaris3 \
  --storage-type S3 \
  --default-base-location 's3://***/pol"3/' \
  --role-arn arn:aws:iam::***:role/*** --external-id abcd1234

yields:

2025-05-13 21:22:55,619 INFO  [org.apa.pol.ser.exc.IcebergExceptionMapper] [,POLARIS] [,,,] (executor-thread-1) Handling runtimeException Illegal character in path at index 39: s3://polaris-quickstart-s3-z90ts1o5/pol"2/
2025-05-13 21:22:55,627 INFO  [io.qua.htt.access-log] [,POLARIS] [,,,] (executor-thread-1) 127.0.0.1 - root [13/May/2025:21:22:55 -0700] "POST /api/management/v1/catalogs HTTP/1.1" 400 151

Seems from the exception that this is coming back from Iceberg? Let me investigate further.

EDIT: I only saw this difference because there were no other catalogs in my DB when I started. As I will explain in the next message, the exception is thrown while parsing the locations of catalogs that already exist. The issue is indeed reproducible, just not with only the steps provided. One must create two catalogs whose locations contain the quote character: starting from a state where no existing catalog has a quote in its location, the first call succeeds and the second fails.
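
To make the "first call succeeds, second fails" behavior concrete, here is a rough sketch of the mechanism as described in this thread. It is not the Polaris implementation: `CatalogOverlapSketch` and `createCatalog` are hypothetical names, and the assumption is that only the locations of already stored catalogs are parsed with `URI.create` during the child-location check.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

final class CatalogOverlapSketch {
    private final List<String> existingLocations = new ArrayList<>();

    // Hypothetical create call: checks the new location against existing
    // catalogs, then stores the raw string without parsing it.
    void createCatalog(String newLocation) {
        for (String existing : existingLocations) {
            // Assumed failure point: each stored location is re-parsed as a URI.
            String normalized = URI.create(existing).toString();
            if (newLocation.startsWith(normalized)) {
                throw new IllegalStateException("Location is inside " + normalized);
            }
        }
        existingLocations.add(newLocation);
    }

    public static void main(String[] args) {
        CatalogOverlapSketch catalogs = new CatalogOverlapSketch();
        // Succeeds: there is no existing location to parse yet.
        catalogs.createCatalog("s3://example-bucket-42/pol\"2/");
        try {
            // Fails while parsing the FIRST catalog's location.
            catalogs.createCatalog("s3://example-bucket-42/pol\"3/");
        } catch (IllegalArgumentException e) {
            System.out.println("second create failed: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the error message for the second call names the first catalog's location, which is consistent with the log above mentioning `pol"2/` while `polaris3` was being created with `pol"3/`.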

@adnanhemani
Collaborator

adnanhemani commented May 14, 2025

The bug comes from the line of code below: while checking whether the new catalog location is a child of any other catalog's location, we attempt to convert every existing catalog's location to a Java URI:

this.location = URI.create(location).toString();

Will research tomorrow if there's an easy way to modify this line to not use URIs.
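
For illustration only, one way a child-location check could avoid `java.net.URI` is a plain string prefix comparison. This sketch is an assumption about what such a change might look like, not the approach taken in #1586; `LocationPrefixSketch`, `withTrailingSlash`, and `isChildOf` are hypothetical names, and the sketch does no scheme, case, or `..` normalization.

```java
final class LocationPrefixSketch {
    // Appends a trailing slash so "s3://b/pol" is not treated
    // as a parent of "s3://b/pol2/".
    static String withTrailingSlash(String location) {
        return location.endsWith("/") ? location : location + "/";
    }

    // True if child is the same location as parent or lies underneath it.
    static boolean isChildOf(String child, String parent) {
        return withTrailingSlash(child).startsWith(withTrailingSlash(parent));
    }

    public static void main(String[] args) {
        String parent = "s3://example-bucket-42/pol\"2/";
        // A path under the parent: prints true, and no URI parsing is involved.
        System.out.println(isChildOf("s3://example-bucket-42/pol\"2/tables/", parent));
        // A sibling location: prints false.
        System.out.println(isChildOf("s3://example-bucket-42/pol\"3/", parent));
    }
}
```

Because the strings are never parsed as URIs, characters such as `"` pass through unchanged. The trade-off is that any normalization previously provided by `URI` would have to be handled separately.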

@adnanhemani
Collaborator

Opened a fix - #1586. :)
