[Iceberg] deprecate some table property names
This commit deprecates a few table property names in favor of the
Iceberg library's naming scheme. This helps reduce confusion for users
about which table properties in the Iceberg documentation map to Presto.

Users will get a warning on any query that sets a deprecated table property,
such as CREATE TABLE and ALTER TABLE .. SET PROPERTIES statements.

Refer to IcebergTableProperties.java or iceberg.rst for the list of
deprecated names and their corresponding mapping.
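
For reference, the renames described above can be written as a lookup table. This is a minimal sketch, not the code in IcebergTableProperties.java: the class `DeprecatedIcebergProperties` and the helper `currentName` are hypothetical names introduced here for illustration, and only the seven name pairs are taken from the deprecation table added to iceberg.rst in this commit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeprecatedIcebergProperties
{
    // Deprecated Presto name -> Iceberg library name, copied from the table in iceberg.rst.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();

    static {
        RENAMES.put("format", "write.format.default");
        RENAMES.put("format_version", "format-version");
        RENAMES.put("commit_retries", "commit.retry.num-retries");
        RENAMES.put("delete_mode", "write.delete.mode");
        RENAMES.put("metadata_previous_versions_max", "write.metadata.previous-versions-max");
        RENAMES.put("metadata_delete_after_commit", "write.metadata.delete-after-commit.enabled");
        RENAMES.put("metrics_max_inferred_column", "write.metadata.metrics.max-inferred-column-defaults");
    }

    // Hypothetical helper: returns the current name for a property.
    // Names that are not deprecated are returned unchanged.
    static String currentName(String name)
    {
        return RENAMES.getOrDefault(name, name);
    }

    public static void main(String[] args)
    {
        // Print every rename in the order listed in the documentation table.
        RENAMES.forEach((oldName, newName) -> System.out.println(oldName + " -> " + newName));
    }
}
```

Properties that were not renamed, such as `location` and `partitioning`, pass through `currentName` unchanged.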
ZacBlanco committed Feb 19, 2025
1 parent d7c0930 commit c3e534e
Showing 15 changed files with 397 additions and 138 deletions.
86 changes: 57 additions & 29 deletions presto-docs/src/main/sphinx/connector/iceberg.rst
@@ -357,42 +357,46 @@ connector using a WITH clause:

The following table properties are available, which are specific to the Presto Iceberg connector:

======================================= =============================================================== =========================
Property Name Description Default
======================================= =============================================================== =========================
``format`` Optionally specifies the format of table data files, ``PARQUET``
either ``PARQUET`` or ``ORC``.
======================================================== =============================================================== =========================
Property Name Description Default
======================================================== =============================================================== =========================
``commit.retry.num-retries`` Determines the number of attempts for committing the metadata ``4``
in case of concurrent upsert requests, before failing.

``partitioning`` Optionally specifies table partitioning. If a table
is partitioned by columns ``c1`` and ``c2``, the partitioning
property is ``partitioning = ARRAY['c1', 'c2']``.
``format-version`` Optionally specifies the format version of the Iceberg ``2``
specification to use for new tables, either ``1`` or ``2``.

``location`` Optionally specifies the file system location URI for
the table.
``location`` Optionally specifies the file system location URI for
the table.

``format_version`` Optionally specifies the format version of the Iceberg ``2``
specification to use for new tables, either ``1`` or ``2``.
``partitioning`` Optionally specifies table partitioning. If a table
is partitioned by columns ``c1`` and ``c2``, the partitioning
property is ``partitioning = ARRAY['c1', 'c2']``.

``commit_retries`` Determines the number of attempts for committing the metadata ``4``
in case of concurrent upsert requests, before failing.
``read.split.target-size`` The target size for an individual split when generating splits ``134217728`` (128MB)
for a table scan. Generated splits may still be larger or
smaller than this value. Must be specified in bytes.

``delete_mode`` Optionally specifies the write delete mode of the Iceberg ``merge-on-read``
specification to use for new tables, either ``copy-on-write``
or ``merge-on-read``.
``write.delete.mode`` Optionally specifies the write delete mode of the Iceberg ``merge-on-read``
specification to use for new tables, either ``copy-on-write``
or ``merge-on-read``.

``metadata_previous_versions_max`` Optionally specifies the max number of old metadata files to ``100``
keep in current metadata log.
``write.format.default`` Optionally specifies the format of table data files, ``PARQUET``
either ``PARQUET`` or ``ORC``.

``metadata_delete_after_commit`` Set to ``true`` to delete the oldest metadata file after ``false``
each commit.
``write.metadata.previous-versions-max`` Optionally specifies the max number of old metadata files to ``100``
keep in current metadata log.

``metrics_max_inferred_column`` Optionally specifies the maximum number of columns for which ``100``
metrics are collected.
``write.metadata.delete-after-commit.enabled`` Set to ``true`` to delete the oldest metadata file after ``false``
each commit.

``read.split.target-size`` The target size for an individual split when generating splits ``134217728`` (128MB)
for a table scan. Generated splits may still be larger or
smaller than this value. Must be specified in bytes.
======================================= =============================================================== =========================
``write.metadata.metrics.max-inferred-column-defaults`` Optionally specifies the maximum number of columns for which ``100``
metrics are collected.

``write.update.mode`` Optionally specifies the write update mode of the Iceberg ``merge-on-read``
specification to use for new tables, either ``copy-on-write``
or ``merge-on-read``.
======================================================== =============================================================== =========================

The table definition below specifies format ``ORC``, partitioning by columns ``c1`` and ``c2``,
and a file system location of ``s3://test_bucket/test_schema/test_table``:
@@ -410,6 +414,26 @@ and a file system location of ``s3://test_bucket/test_schema/test_table``:
location = 's3://test_bucket/test_schema/test_table')
)

Deprecated Table Properties
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some table properties have been deprecated or removed. The following table lists the deprecated
properties and their replacements. Update queries to use the new property names as soon as
possible. They will be removed in a future version.

======================================= ===============================================================
Deprecated Property Name New Property Name
======================================= ===============================================================
``format`` ``write.format.default``
``format_version`` ``format-version``
``commit_retries`` ``commit.retry.num-retries``
``delete_mode`` ``write.delete.mode``
``metadata_previous_versions_max`` ``write.metadata.previous-versions-max``
``metadata_delete_after_commit`` ``write.metadata.delete-after-commit.enabled``
``metrics_max_inferred_column`` ``write.metadata.metrics.max-inferred-column-defaults``
======================================= ===============================================================


Session Properties
------------------

@@ -754,9 +778,13 @@ already exists but does not known by the catalog.

The following arguments are available:


===================== ========== =============== =======================================================================

Argument Name required type Description

===================== ========== =============== =======================================================================

``schema`` ✔️ string Schema of the table to register

``table_name`` ✔️ string Name of the table to register
@@ -1545,8 +1573,8 @@ identified by unique snapshot IDs. The snapshot IDs are stored in the ``$snapsho
metadata table. You can rollback the state of a table to a previous snapshot ID.
It also supports time travel query using SYSTEM_VERSION (VERSION) and SYSTEM_TIME (TIMESTAMP) options.

Example Queries
^^^^^^^^^^^^^^^
Example Time Travel Queries
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similar to the example queries in `SCHEMA EVOLUTION`_, create an Iceberg
table named `ctas_nation` from the TPCH `nation` table:
@@ -135,14 +135,7 @@
import static com.facebook.presto.iceberg.IcebergPartitionType.ALL;
import static com.facebook.presto.iceberg.IcebergSessionProperties.getCompressionCodec;
import static com.facebook.presto.iceberg.IcebergSessionProperties.isPushdownFilterEnabled;
import static com.facebook.presto.iceberg.IcebergTableProperties.COMMIT_RETRIES;
import static com.facebook.presto.iceberg.IcebergTableProperties.DELETE_MODE;
import static com.facebook.presto.iceberg.IcebergTableProperties.FILE_FORMAT_PROPERTY;
import static com.facebook.presto.iceberg.IcebergTableProperties.FORMAT_VERSION;
import static com.facebook.presto.iceberg.IcebergTableProperties.LOCATION_PROPERTY;
import static com.facebook.presto.iceberg.IcebergTableProperties.METADATA_DELETE_AFTER_COMMIT;
import static com.facebook.presto.iceberg.IcebergTableProperties.METADATA_PREVIOUS_VERSIONS_MAX;
import static com.facebook.presto.iceberg.IcebergTableProperties.METRICS_MAX_INFERRED_COLUMN;
import static com.facebook.presto.iceberg.IcebergTableProperties.PARTITIONING_PROPERTY;
import static com.facebook.presto.iceberg.IcebergTableProperties.SORTED_BY_PROPERTY;
import static com.facebook.presto.iceberg.IcebergTableType.CHANGELOG;
@@ -166,6 +159,8 @@
import static com.facebook.presto.iceberg.IcebergUtil.tryGetProperties;
import static com.facebook.presto.iceberg.IcebergUtil.tryGetSchema;
import static com.facebook.presto.iceberg.IcebergUtil.validateTableMode;
import static com.facebook.presto.iceberg.IcebergWarningCode.SORT_COLUMN_TRANSFORM_NOT_SUPPORTED_WARNING;
import static com.facebook.presto.iceberg.IcebergWarningCode.USE_OF_DEPRECATED_TABLE_PROPERTY;
import static com.facebook.presto.iceberg.PartitionFields.getPartitionColumnName;
import static com.facebook.presto.iceberg.PartitionFields.getTransformTerm;
import static com.facebook.presto.iceberg.PartitionFields.toPartitionFields;
@@ -185,7 +180,6 @@
import static com.facebook.presto.iceberg.util.StatisticsUtil.calculateBaseTableStatistics;
import static com.facebook.presto.iceberg.util.StatisticsUtil.calculateStatisticsConsideringLayout;
import static com.facebook.presto.spi.StandardErrorCode.NOT_SUPPORTED;
import static com.facebook.presto.spi.StandardWarningCode.SORT_COLUMN_TRANSFORM_NOT_SUPPORTED_WARNING;
import static com.facebook.presto.spi.statistics.TableStatisticType.ROW_COUNT;
import static com.google.common.base.Verify.verify;
import static com.google.common.collect.ImmutableList.toImmutableList;
@@ -201,8 +195,6 @@
import static org.apache.iceberg.SnapshotSummary.REMOVED_POS_DELETES_PROP;
import static org.apache.iceberg.TableProperties.DELETE_ISOLATION_LEVEL;
import static org.apache.iceberg.TableProperties.DELETE_ISOLATION_LEVEL_DEFAULT;
import static org.apache.iceberg.TableProperties.SPLIT_SIZE;
import static org.apache.iceberg.TableProperties.UPDATE_MODE;

public abstract class IcebergAbstractMetadata
implements ConnectorMetadata
@@ -217,6 +209,7 @@ public abstract class IcebergAbstractMetadata
protected final FilterStatsCalculatorService filterStatsCalculatorService;
protected Transaction transaction;
protected final StatisticsFileCache statisticsFileCache;
protected final IcebergTableProperties tableProperties;

private final StandardFunctionResolution functionResolution;
private final ConcurrentMap<SchemaTableName, Table> icebergTables = new ConcurrentHashMap<>();
@@ -228,7 +221,8 @@ public IcebergAbstractMetadata(
JsonCodec<CommitTaskData> commitTaskCodec,
NodeVersion nodeVersion,
FilterStatsCalculatorService filterStatsCalculatorService,
StatisticsFileCache statisticsFileCache)
StatisticsFileCache statisticsFileCache,
IcebergTableProperties tableProperties)
{
this.typeManager = requireNonNull(typeManager, "typeManager is null");
this.commitTaskCodec = requireNonNull(commitTaskCodec, "commitTaskCodec is null");
@@ -237,6 +231,7 @@ public IcebergAbstractMetadata(
this.nodeVersion = requireNonNull(nodeVersion, "nodeVersion is null");
this.filterStatsCalculatorService = requireNonNull(filterStatsCalculatorService, "filterStatsCalculatorService is null");
this.statisticsFileCache = requireNonNull(statisticsFileCache, "statisticsFileCache is null");
this.tableProperties = requireNonNull(tableProperties, "tableProperties is null");
}

protected final Table getIcebergTable(ConnectorSession session, SchemaTableName schemaTableName)
@@ -702,10 +697,10 @@ private static String columnExtraInfo(List<String> partitionTransforms)
protected ImmutableMap<String, Object> createMetadataProperties(Table icebergTable, ConnectorSession session)
{
ImmutableMap.Builder<String, Object> properties = ImmutableMap.builder();
properties.put(FILE_FORMAT_PROPERTY, getFileFormat(icebergTable));
properties.put(TableProperties.DEFAULT_FILE_FORMAT, getFileFormat(icebergTable));

int formatVersion = ((BaseTable) icebergTable).operations().current().formatVersion();
properties.put(FORMAT_VERSION, String.valueOf(formatVersion));
properties.put(TableProperties.FORMAT_VERSION, String.valueOf(formatVersion));

if (!icebergTable.spec().fields().isEmpty()) {
properties.put(PARTITIONING_PROPERTY, toPartitionFields(icebergTable.spec()));
@@ -715,12 +710,12 @@ protected ImmutableMap<String, Object> createMetadataProperties(Table icebergTab
properties.put(LOCATION_PROPERTY, icebergTable.location());
}

properties.put(DELETE_MODE, IcebergUtil.getDeleteMode(icebergTable));
properties.put(UPDATE_MODE, IcebergUtil.getUpdateMode(icebergTable));
properties.put(METADATA_PREVIOUS_VERSIONS_MAX, IcebergUtil.getMetadataPreviousVersionsMax(icebergTable));
properties.put(METADATA_DELETE_AFTER_COMMIT, IcebergUtil.isMetadataDeleteAfterCommit(icebergTable));
properties.put(METRICS_MAX_INFERRED_COLUMN, IcebergUtil.getMetricsMaxInferredColumn(icebergTable));
properties.put(SPLIT_SIZE, IcebergUtil.getSplitSize(icebergTable));
properties.put(TableProperties.DELETE_MODE, IcebergUtil.getDeleteMode(icebergTable));
properties.put(TableProperties.UPDATE_MODE, IcebergUtil.getUpdateMode(icebergTable));
properties.put(TableProperties.METADATA_PREVIOUS_VERSIONS_MAX, IcebergUtil.getMetadataPreviousVersionsMax(icebergTable));
properties.put(TableProperties.METADATA_DELETE_AFTER_COMMIT_ENABLED, IcebergUtil.isMetadataDeleteAfterCommit(icebergTable));
properties.put(TableProperties.METRICS_MAX_INFERRED_COLUMN_DEFAULTS, IcebergUtil.getMetricsMaxInferredColumn(icebergTable));
properties.put(TableProperties.SPLIT_SIZE, IcebergUtil.getSplitSize(icebergTable));

SortOrder sortOrder = icebergTable.sortOrder();
// TODO: Support sort column transforms (https://github.com/prestodb/presto/issues/24250)
@@ -1125,22 +1120,36 @@ public void setTableProperties(ConnectorSession session, ConnectorTableHandle ta

UpdateProperties updateProperties = transaction.updateProperties();
for (Map.Entry<String, Object> entry : properties.entrySet()) {
switch (entry.getKey()) {
case COMMIT_RETRIES:
updateProperties.set(TableProperties.COMMIT_NUM_RETRIES, String.valueOf(entry.getValue()));
break;
case SPLIT_SIZE:
updateProperties.set(TableProperties.SPLIT_SIZE, entry.getValue().toString());
break;
default:
throw new PrestoException(NOT_SUPPORTED, "Updating property " + entry.getKey() + " is not supported currently");
if (!tableProperties.getUpdatableProperties()
.contains(entry.getKey())) {
throw new PrestoException(NOT_SUPPORTED, "Updating property " + entry.getKey() + " is not supported currently");
}
String propertyName = entry.getKey();
if (tableProperties.getDeprecatedProperties().containsKey(entry.getKey())) {
String newPropertyKey = tableProperties.getDeprecatedProperties().get(entry.getKey());
PrestoWarning warning = getPrestoWarning(newPropertyKey, propertyName);
session.getWarningCollector().add(warning);
propertyName = newPropertyKey;
}
updateProperties.set(propertyName, String.valueOf(entry.getValue()));
}

updateProperties.commit();
transaction.commitTransaction();
}

private static PrestoWarning getPrestoWarning(String newPropertyKey, String propertyName)
{
PrestoWarning warning;
if (newPropertyKey == null) {
warning = new PrestoWarning(USE_OF_DEPRECATED_TABLE_PROPERTY, format("Property \"%s\" is deprecated and will be completely removed in a future version. Avoid using immediately.", propertyName));
}
else {
warning = new PrestoWarning(USE_OF_DEPRECATED_TABLE_PROPERTY, format("Property \"%s\" has been renamed to \"%s\". This will become an error in future versions.", propertyName, newPropertyKey));
}
return warning;
}

/**
* Deletes all the files for a specific predicate
*
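
The rename-and-warn flow that this commit adds to setTableProperties can be sketched outside Presto as follows. This is an illustration under stated assumptions, not the connector's implementation: `TablePropertyUpdateSketch`, `resolve`, and `warningMessage` are hypothetical names, plain strings stand in for `PrestoWarning`, `UnsupportedOperationException` stands in for `PrestoException(NOT_SUPPORTED, ...)`, and the contents of `UPDATABLE` and `DEPRECATED` are small stand-ins for `getUpdatableProperties()` and `getDeprecatedProperties()`. One deliberate difference from the diff: when a deprecated property has no replacement, the sketch keeps the old name rather than assigning a null key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TablePropertyUpdateSketch
{
    // Stand-in for tableProperties.getDeprecatedProperties(); one entry is enough for the sketch.
    static final Map<String, String> DEPRECATED = new HashMap<>();
    // Stand-in for tableProperties.getUpdatableProperties(); the exact contents are an assumption.
    static final Set<String> UPDATABLE = new HashSet<>(Arrays.asList(
            "commit_retries", "commit.retry.num-retries", "read.split.target-size"));

    static {
        DEPRECATED.put("commit_retries", "commit.retry.num-retries");
    }

    // Mirrors the two message templates in getPrestoWarning, returning text instead of a PrestoWarning.
    static String warningMessage(String newPropertyKey, String propertyName)
    {
        if (newPropertyKey == null) {
            return String.format("Property \"%s\" is deprecated and will be completely removed in a future version. Avoid using immediately.", propertyName);
        }
        return String.format("Property \"%s\" has been renamed to \"%s\". This will become an error in future versions.", propertyName, newPropertyKey);
    }

    // Validates each requested property, maps deprecated names to their replacements,
    // and collects one warning per deprecated name used.
    static Map<String, String> resolve(Map<String, Object> requested, List<String> warnings)
    {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : requested.entrySet()) {
            String propertyName = entry.getKey();
            if (!UPDATABLE.contains(propertyName)) {
                throw new UnsupportedOperationException("Updating property " + propertyName + " is not supported currently");
            }
            if (DEPRECATED.containsKey(propertyName)) {
                String newPropertyKey = DEPRECATED.get(propertyName);
                warnings.add(warningMessage(newPropertyKey, propertyName));
                // Sketch-only choice: keep the old name if no replacement is registered.
                if (newPropertyKey != null) {
                    propertyName = newPropertyKey;
                }
            }
            resolved.put(propertyName, String.valueOf(entry.getValue()));
        }
        return resolved;
    }

    public static void main(String[] args)
    {
        Map<String, Object> requested = new LinkedHashMap<>();
        requested.put("commit_retries", 6);
        requested.put("read.split.target-size", 134217728L);
        List<String> warnings = new ArrayList<>();
        System.out.println(resolve(requested, warnings));
        warnings.forEach(System.out::println);
    }
}
```

Compared with the removed `switch` statement, this shape keeps the set of updatable properties and the rename mapping in data rather than in control flow, so supporting another property does not require touching the update loop.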