
Site: Update production configuration page #1606


Merged
merged 3 commits into from
May 19, 2025

Conversation

flyrain
Contributor

@flyrain flyrain commented May 18, 2025

This PR simplifies the title and adds a checklist so that readers can easily see which items are required for a production configuration.
Before:
Screenshot 2025-05-18 at 4 57 51 PM
After:
Screenshot 2025-05-18 at 4 56 21 PM

@github-project-automation github-project-automation bot moved this to PRs In Progress in Basic Kanban Board May 18, 2025
@flyrain flyrain changed the title Update prodcution page Site: Update production configuration page May 18, 2025
@flyrain flyrain marked this pull request as ready for review May 19, 2025 00:01
Contributor

@pingtimeout pingtimeout left a comment


One point needs to be confirmed by JB (not a big deal), and one definitely needs to be changed (the FILE section).

type. This should be disabled for production systems.
- Use this configuration to additionally disable any other storage types that will not be in use.
### Disable FILE Storage Type
By default, Polaris allows using the local file system (`FILE`) for catalog storage. This is fine for testing,
Contributor

@pingtimeout pingtimeout May 19, 2025


Same, given that #1566 has been merged, this sentence is false, isn't it?

EDIT: my bad, 1566 has not been merged yet. Let's wait for a couple of hours/days until either of #1532 and #1566 is merged. Both these PRs will remove the need for this doc section, and they are needed for 1.0.

- [ ] Enforce realm header validation (`require-header=true`)
- [ ] Use a durable metastore (JDBC + PostgreSQL)
- [ ] Bootstrap valid realms in the metastore
- [ ] Disable local FILE storage
Contributor

@pingtimeout pingtimeout May 19, 2025


Given that #1566 has been merged, that bullet point should not be necessary. Am I missing something?

Edit: my bad, 1566 has not been merged yet.

@flyrain flyrain dismissed pingtimeout’s stale review May 19, 2025 17:53

All comments are resolved.

@flyrain flyrain merged commit 0983911 into apache:main May 19, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from PRs In Progress to Done in Basic Kanban Board May 19, 2025
@flyrain
Contributor Author

flyrain commented May 19, 2025

Thanks @pingtimeout and @jbonofre for the review.

snazy added a commit to snazy/polaris that referenced this pull request May 22, 2025
* main: Update docker.io/prom/prometheus Docker tag to v3.4.0 (apache#1602)

* Site: Update production configuration page (apache#1606)

* main: Update dependency com.google.cloud:google-cloud-storage-bom to v2.52.3 (apache#1623)

* main: Update dependency boto3 to v1.38.19 (apache#1622)

* Remove Bouncy Castle dependency usage from PemUtils (apache#1318)

- Added PEM format parsing in PemUtils
- Added unit test for PemUtils for empty file and multiple PEM objects
- Removed Bouncy Castle Provider dependency from service common module
- Removed Bouncy Castle Provider dependency from quarkus service module

* Site: Add a page for policy management (apache#1600)

* [Policy Store | Management Spec] Add policy privileges to spec and update admin service impl (apache#1529)

This PR adds new policy-related privileges to polaris-management-api.yml and updates PolarisAdminService to allow granting the new privileges.

* Spec: Add SigV4 Auth Support for Catalog Federation (apache#1506)

* Spec changes for SigV4 Auth Support for Catalog Federation

* Extract service identity info as a nested object

* nit: fix admin tool log level and comments (apache#1626)

The previous WARNING log level seems to work, but WARN
aligns better with standard Quarkus log levels.

Fixes apache#1612

* Doc: switch to use iceberg-aws-bundle jar (apache#1609)

* main: Update dependency org.mockito:mockito-core to v5.18.0 (apache#1630)

* main: Update dependency boto3 to v1.38.20 (apache#1631)

* Require explicit user-consent to enable HadoopFileIO (apache#1532)

Using `HadoopFileIO` in Polaris can enable "hidden features" that users are likely not aware of. This change requires users to manually update the configuration to be able to use `HadoopFileIO`, in a way that highlights the consequences of enabling it.

This PR updates Polaris in multiple ways:
* The default of `SUPPORTED_CATALOG_STORAGE_TYPES` is changed to not include the `FILE` storage type.
* Respect the `ALLOW_SPECIFYING_FILE_IO_IMPL` configuration on namespaces, tables and views to prevent setting an `io-impl` value for anything but one of the configured, supported storage-types.
* Unify validation code in a new class `IcebergPropertiesValidation`.
* Using `FILE` or `HadoopFileIO` now _also_ requires the explicit configuration `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS=true`.
* Added production readiness checks that trigger when `ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS` is `true` or `SUPPORTED_CATALOG_STORAGE_TYPES` contains `FILE` (defaults and per-realm).
* The two new readiness checks are considered _severe_. Severe readiness-errors prevent the server from starting up - unless the user explicitly configured `polaris.readiness.ignore-security-issues=true`.
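
The startup gating described in the last two bullets can be sketched roughly as follows. This is an illustrative Python model only, not the Polaris implementation: the function and class names (`check_scope`, `readiness_findings`, `may_start`, `Finding`) are hypothetical, the storage-type values in the usage example are placeholders, and each scope (defaults and every realm) is checked independently, which may be simpler than how Polaris resolves per-realm overrides. Only the configuration names are taken from the description above.

```python
from dataclasses import dataclass

# Configuration names as quoted in the PR description.
INSECURE_FLAG = "ALLOW_INSECURE_STORAGE_TYPES_ACCEPTING_SECURITY_RISKS"
STORAGE_TYPES = "SUPPORTED_CATALOG_STORAGE_TYPES"
IGNORE_PROPERTY = "polaris.readiness.ignore-security-issues"


@dataclass(frozen=True)
class Finding:
    """Hypothetical record of one readiness problem."""
    scope: str           # "defaults" or a realm name
    message: str
    severe: bool = True  # both new checks are described as severe


def check_scope(scope, features):
    """Apply the two checks to one feature map (defaults or a single realm)."""
    findings = []
    if features.get(INSECURE_FLAG, False):
        findings.append(Finding(scope, f"{INSECURE_FLAG} is true"))
    if "FILE" in features.get(STORAGE_TYPES, []):
        findings.append(Finding(scope, f"{STORAGE_TYPES} contains FILE"))
    return findings


def readiness_findings(defaults, per_realm):
    """Collect findings for the defaults and for every per-realm override."""
    findings = check_scope("defaults", defaults)
    for realm, features in per_realm.items():
        findings.extend(check_scope(realm, features))
    return findings


def may_start(findings, properties):
    """Severe findings block startup unless the ignore property is 'true'."""
    ignore = properties.get(IGNORE_PROPERTY, "false").lower() == "true"
    return ignore or not any(f.severe for f in findings)
```

For example, under these assumptions a realm that re-enables `FILE` blocks startup until the operator opts out explicitly:

```python
defaults = {STORAGE_TYPES: ["S3", "GCS", "AZURE"]}       # placeholder values
per_realm = {"realm-a": {STORAGE_TYPES: ["S3", "FILE"]}}  # hypothetical realm

findings = readiness_findings(defaults, per_realm)
may_start(findings, {})                         # False
may_start(findings, {IGNORE_PROPERTY: "true"})  # True
```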

Log messages and configuration options explicitly use "clear" phrases highlighting the consequences.

With these changes, it is intentionally extremely difficult to start Polaris with HadoopFileIO. People who work around all these safety nets must have realized what they are doing.

A lot of the test code relies on `FILE`/`HadoopFileIO`; those tests were given the configuration needed to keep working as they are, bypassing the added security safeguards.

---------

Co-authored-by: Dmitri Bourlatchkov

---------

Co-authored-by: Mend Renovate
Co-authored-by: Yufei Gu
Co-authored-by: David Handermann
Co-authored-by: Honah (Jonas) J.
Co-authored-by: Rulin Xing
Co-authored-by: Dmitri Bourlatchkov
Co-authored-by: MonkeyCanCode
adnanhemani pushed a commit to adnanhemani/polaris that referenced this pull request May 22, 2025
3 participants