Core, Data: Validate equality delete field IDs#17028

Open
u70b3 wants to merge 1 commit into apache:main from u70b3:equality-delete-field-ids-validation

u70b3 commented Jul 1, 2026


Summary

This PR brings stricter equality delete field ID validation into the Java SDK, aligned with the behavior added in iceberg-rust#2723.

The validation now rejects:

  • Null or empty equality field ID lists
  • Duplicate equality field IDs
  • Equality field IDs that do not exist in the equality delete row schema
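
The three rejection rules above can be sketched roughly as follows. This is an illustrative stand-in, not the PR's code: the class name `EqualityFieldIdCheck`, the `validate` signature, the use of a plain `Set<Integer>` in place of an Iceberg `Schema`, and the choice of `IllegalArgumentException` are all assumptions made to keep the example self-contained.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper illustrating the three checks; not the actual DeleteSchemaUtil code.
final class EqualityFieldIdCheck {
  private EqualityFieldIdCheck() {}

  /**
   * @param equalityFieldIds   field IDs declared for the equality delete file
   * @param deleteRowSchemaIds field IDs present in the equality delete row schema
   *                           (a plain set stands in for a real schema here)
   */
  static void validate(List<Integer> equalityFieldIds, Set<Integer> deleteRowSchemaIds) {
    // Rule 1: the ID list must be present and non-empty.
    if (equalityFieldIds == null || equalityFieldIds.isEmpty()) {
      throw new IllegalArgumentException("Equality field IDs must not be null or empty");
    }

    Set<Integer> seen = new HashSet<>();
    for (Integer id : equalityFieldIds) {
      if (id == null) {
        throw new IllegalArgumentException("Equality field ID must not be null");
      }
      // Rule 2: each ID may appear only once.
      if (!seen.add(id)) {
        throw new IllegalArgumentException("Duplicate equality field ID: " + id);
      }
      // Rule 3: each ID must resolve to a field in the delete row schema.
      if (!deleteRowSchemaIds.contains(id)) {
        throw new IllegalArgumentException(
            "Equality field ID not found in delete row schema: " + id);
      }
    }
  }

  public static void main(String[] args) {
    Set<Integer> deleteRowSchemaIds = Set.of(1, 2);
    validate(List.of(1, 2), deleteRowSchemaIds); // accepted, no exception

    try {
      validate(List.of(1, 1), deleteRowSchemaIds);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints "Duplicate equality field ID: 1"
    }
  }
}
```

Failing on the first violation keeps the sketch short; a real implementation might prefer to collect all offending IDs into one message.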

Changes

  • Centralized validation in DeleteSchemaUtil.validateEqualityFieldIds.
  • Applied validation to the direct Avro, Parquet, and ORC equality delete writer builders.
  • Applied validation to the registry-based writer path through FileWriterBuilderImpl.
  • Updated data writer/appender factories to validate against the equality delete row schema, not the data schema.
  • Added regression coverage for projected schemas, equality-delete-only construction with a null data schema, empty IDs, duplicate IDs, and missing IDs.
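
To illustrate why validating against the equality delete row schema rather than the data schema matters, here is a small sketch. It assumes the delete row schema is a projection of the data schema, so an ID can exist in the data schema yet be absent from the rows actually written to the delete file. The class `DeleteRowSchemaProjection` and its `missingFrom` method are hypothetical, and plain ID sets stand in for real schemas.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration; plain ID sets stand in for Iceberg schemas.
final class DeleteRowSchemaProjection {
  private DeleteRowSchemaProjection() {}

  /** Returns the equality field IDs that cannot be resolved in the given schema. */
  static List<Integer> missingFrom(Set<Integer> schemaIds, List<Integer> equalityFieldIds) {
    return equalityFieldIds.stream()
        .filter(id -> !schemaIds.contains(id))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Set<Integer> dataSchemaIds = Set.of(1, 2, 3);   // full data schema
    Set<Integer> deleteRowSchemaIds = Set.of(1, 2); // projected delete row schema
    List<Integer> equalityFieldIds = List.of(1, 3);

    // Checked against the data schema, nothing looks wrong.
    System.out.println(missingFrom(dataSchemaIds, equalityFieldIds));      // prints "[]"
    // Checked against the delete row schema, ID 3 is unresolved.
    System.out.println(missingFrom(deleteRowSchemaIds, equalityFieldIds)); // prints "[3]"
  }
}
```

In this sketch, field 3 would pass a check against the data schema even though the delete file's rows never carry that column.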

Testing

  • ./gradlew \
      :iceberg-core:test \
        --tests org.apache.iceberg.avro.TestAvroDeleteWriters \
        --tests org.apache.iceberg.formats.TestFormatModelRegistry \
      :iceberg-data:test \
        --tests org.apache.iceberg.data.TestGenericFileWriterFactory \
        --tests org.apache.iceberg.TestGenericAppenderFactory \
      :iceberg-parquet:test \
        --tests org.apache.iceberg.parquet.TestParquetDeleteWriters \
      :iceberg-orc:test \
        --tests org.apache.iceberg.orc.TestOrcDeleteWriters \
      :iceberg-core:spotlessCheck \
      :iceberg-data:spotlessCheck \
      :iceberg-parquet:spotlessCheck \
      :iceberg-orc:spotlessCheck

AI Disclosure

  • Model: GPT-5
  • Platform/Tool: Codex
  • Human Oversight: partially reviewed
  • Prompt Summary: Fix equality delete field ID validation findings, compare with the original PR, and update the existing PR branch as a single commit.

u70b3 force-pushed the equality-delete-field-ids-validation branch from 1828e63 to 764f4e4 on July 1, 2026 07:23
u70b3 changed the title from "Core, Data: Validate equality delete field IDs" to "Core, Data: Add validation for equality delete field IDs" on Jul 1, 2026
u70b3 force-pushed the equality-delete-field-ids-validation branch from 764f4e4 to 7d753ef on July 1, 2026 07:27
Validate equality delete field IDs against equality delete row schemas across factory and direct writer paths.

Generated-by: Codex
u70b3 force-pushed the equality-delete-field-ids-validation branch from 7d753ef to a4666c1 on July 2, 2026 03:51
u70b3 changed the title from "Core, Data: Add validation for equality delete field IDs" to "Core, Data: Validate equality delete field IDs" on Jul 2, 2026