[controller][schema] Coerce legacy numeric defaults during store migration#2802
Merged
xunyin8 merged 7 commits on Jun 4, 2026
fc5908f to 9c88232
namithanivead previously approved these changes on May 27, 2026
…ation
Adds a destination-side rewrite ({0 -> 0.0} on float-typed fields, etc.)
so legacy schemas registered before validateNumericDefaultValueTypes was
enforced can be migrated into clusters where the controller's STRICT
parser now rejects them. Gated on storeConfig.migrationDestCluster, so
non-migration writes are unaffected; defensively re-strict-parses the
output so non-numeric violations (bad names, dangling content, union
default not first branch) still fail loudly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
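As a rough illustration of the rewrite described above, the sketch below coerces integral defaults on float/double fields and returns the input unchanged when nothing needs rewriting. It is not the PR's code: it models the parsed schema as nested `Map`/`List` values instead of Jackson nodes, covers only the float/double case, and every class and method name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a schema is modeled as nested Maps/Lists rather than
// Jackson nodes, and all names here are hypothetical, not the PR's code.
final class NumericDefaultCoercionSketch {

  // Returns a coerced default, or the SAME instance when no rewrite is needed.
  static Object coerceDefault(String declaredType, Object value) {
    boolean integral = value instanceof Integer || value instanceof Long;
    boolean floating = "float".equals(declaredType) || "double".equals(declaredType);
    if (floating && integral) {
      return Double.valueOf(((Number) value).doubleValue()); // 0 -> 0.0
    }
    return value; // decimals, non-numbers, and other declared types pass through
  }

  // Walks a record's fields; returns the input instance itself if nothing changed.
  @SuppressWarnings("unchecked")
  static Map<String, Object> coerceRecord(Map<String, Object> record) {
    if (!(record.get("fields") instanceof List)) {
      return record;
    }
    List<Object> fields = (List<Object>) record.get("fields");
    List<Object> rewritten = new ArrayList<>();
    boolean changed = false;
    for (Object f : fields) {
      Map<String, Object> field = (Map<String, Object>) f;
      Object type = field.get("type");
      Map<String, Object> copy = null;
      if (type instanceof String && field.containsKey("default")) {
        Object oldDefault = field.get("default");
        Object newDefault = coerceDefault((String) type, oldDefault);
        if (newDefault != oldDefault) {
          copy = new LinkedHashMap<>(field);
          copy.put("default", newDefault);
        }
      } else if (type instanceof Map) {
        // Nested record: recurse into the inline type definition.
        Map<String, Object> nested = coerceRecord((Map<String, Object>) type);
        if (nested != type) {
          copy = new LinkedHashMap<>(field);
          copy.put("type", nested);
        }
      }
      // Non-textual, non-record types (e.g. unions) fall through untouched.
      changed |= copy != null;
      rewritten.add(copy != null ? copy : field);
    }
    if (!changed) {
      return record; // identity short-circuit for clean input
    }
    Map<String, Object> out = new LinkedHashMap<>(record);
    out.put("fields", rewritten);
    return out;
  }
}
```

Returning the original instance for clean input keeps the non-rewrite path cheap and lets callers detect "nothing changed" without a deep comparison.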
The existing tests for parseSchemaFromJSONLooseNumericValidation and coerceNumericDefaultsToFieldType live in venice-push-job; diff coverage on venice-client-common (where the class lives) only sees branches exercised by tests in the same module. Adds 16 focused tests covering the strict/loose/loose-numeric parsers, the parseSchemaFromJSON wrapper with both extendedSchemaValidityCheckEnabled values, and every branch of the JSON walker (each numeric tier, nested record recursion, non-textual type passthrough, identity short-circuit for clean input, IOException → VeniceException wrap). Diff coverage on the changed lines: 76.92% branches (45% required). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Assert.assertSame on String operands compiles to ==, which trips ES_COMPARING_STRINGS_WITH_EQ. Switch to Assert.assertEquals — the observable behavior (walker returns the input unchanged for clean or non-textual-type schemas) is still verified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
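For readers unfamiliar with the distinction, this small stand-alone sketch shows why an identity check and a content check on strings can disagree. The class and helper names are hypothetical; the identity comparison is written out deliberately to show what such a check tests.

```java
// Illustrative sketch: identity vs. content comparison on Strings.
final class StringIdentitySketch {

  // What an identity-style assertion effectively checks: same object reference.
  static boolean sameReference(String a, String b) {
    return a == b;
  }

  // What an equality-style assertion checks: same character content.
  static boolean sameContent(String a, String b) {
    return a.equals(b);
  }

  public static void main(String[] args) {
    String original = "{\"type\":\"float\",\"default\":0.0}";
    String copy = new String(original); // equal content, distinct object
    System.out.println(sameReference(original, copy)); // false
    System.out.println(sameContent(original, copy));   // true
  }
}
```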
Jackson parses any JSON decimal literal (e.g. 0.0) into a DoubleNode
regardless of the declared field type. The "float" branch of
coerceNumber short-circuits on value.isDouble(), so legacy schemas
written as {"type":"float","default":0.0} are NOT rewritten by the
walker — they pass through unchanged. Empirically avro-util1's STRICT
parser accepts DoubleNode-on-float (the numeric-tier check is
asymmetric: rejects IntNode-on-float, accepts DoubleNode-on-float), so
the output is still strict-clean.
Adds a regression-pinning test for this combination. If avro-util1 ever
tightens the float numeric-tier check to be symmetric, this test will
fail and the "float" branch will need to coerce DoubleNode -> FloatNode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When the initial strict parse fails inside normalizeSchemaForMigration, log strictFailure at INFO before attempting the LOOSE_NUMERICS-based coercion, and on the post-coercion strict re-check attach strictFailure as a suppressed exception of whatever the second parse throws. For non-numeric violations (union default not first branch, bad names, dangling content) the post-coercion strict parse still fails — and without chaining, the operator only sees that second exception and has no idea what was wrong with the source schema. The suppressed entry puts the original message into the same stack trace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
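The chaining described above can be sketched with plain `Throwable#addSuppressed`. This is a minimal illustration under stated assumptions, not the PR's `normalizeSchemaForMigration`: the strict parser and the coercion step are passed in as hypothetical functional parameters, and logging is reduced to a `System.out` line.

```java
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Illustrative sketch only; names and signatures are hypothetical.
final class SuppressedChainingSketch {

  // strictParse throws on an invalid schema; coerce rewrites numeric defaults.
  static String normalize(String schemaStr, Consumer<String> strictParse, UnaryOperator<String> coerce) {
    try {
      strictParse.accept(schemaStr);
      return schemaStr; // already strict-clean: pass through unchanged
    } catch (RuntimeException strictFailure) {
      // Stand-in for an INFO-level log of the original failure.
      System.out.println("INFO strict parse failed, trying coercion: " + strictFailure.getMessage());
      String coerced = coerce.apply(schemaStr);
      try {
        strictParse.accept(coerced); // defensive strict re-check of the output
        return coerced;
      } catch (RuntimeException secondFailure) {
        // Keep the original cause visible in the same stack trace.
        if (secondFailure != strictFailure) {
          secondFailure.addSuppressed(strictFailure);
        }
        throw secondFailure;
      }
    }
  }
}
```

The identity guard is there because `addSuppressed` rejects self-suppression; a parser that rethrows a cached exception instance would otherwise trigger an `IllegalArgumentException`.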
9f4261b to 4008fa6
VeniceParentHelixAdmin#createStore and #addValueSchema now route value schemas through VeniceHelixAdmin#normalizeSchemaForMigration. The internal admin is a Mockito mock in TestVeniceParentHelixAdmin, so the unstubbed method returned null and blanked out the value schema, producing NPEs during admin-message serialization and strict schema parsing. Mirror production's non-migration passthrough behavior by stubbing the method to return its schema argument verbatim. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
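To make the failure mode concrete, the sketch below uses a JDK dynamic proxy as a stand-in for a mock: an unstubbed method returns null, while a passthrough stub returns its schema argument. The `SchemaNormalizer` interface and its three-argument signature are assumptions for illustration; the real method signature is not shown here. With Mockito itself, the same effect is typically achieved with an answer that returns `invocation.getArgument(i)` for the schema parameter.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Illustrative sketch only; a hand-rolled stand-in for a mocked admin.
final class PassthroughStubSketch {

  // Hypothetical signature: the real method's parameters may differ.
  interface SchemaNormalizer {
    String normalizeSchemaForMigration(String clusterName, String storeName, String schemaStr);
  }

  private static SchemaNormalizer fake(InvocationHandler handler) {
    return (SchemaNormalizer) Proxy.newProxyInstance(
        SchemaNormalizer.class.getClassLoader(),
        new Class<?>[] {SchemaNormalizer.class},
        handler);
  }

  // Mimics an unstubbed mock: object-returning methods yield null.
  static SchemaNormalizer unstubbed() {
    return fake((proxy, method, args) -> null);
  }

  // Mimics the fix: return the schema argument (index 2) verbatim.
  static SchemaNormalizer passthrough() {
    return fake((proxy, method, args) -> args[2]);
  }

  public static void main(String[] args) {
    String schema = "{\"type\":\"string\"}";
    System.out.println(unstubbed().normalizeSchemaForMigration("c", "s", schema));   // null
    System.out.println(passthrough().normalizeSchemaForMigration("c", "s", schema)); // the schema itself
  }
}
```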
namithanivead approved these changes on Jun 4, 2026
Problem Statement
Store migration fails because of the strict numeric default value check. Legacy stores were registered before this check was enforced, so their schemas are now rejected and the stores cannot be migrated.
Solution
Adds a destination-side rewrite ({0 -> 0.0} on float-typed fields, etc.) so legacy schemas registered before validateNumericDefaultValueTypes was enforced can be migrated into clusters where the controller's STRICT parser now rejects them. Gated on storeConfig.migrationDestCluster, so non-migration writes are unaffected; defensively re-strict-parses the output so non-numeric violations (bad names, dangling content, union default not first branch) still fail loudly.
Code changes
Concurrency-Specific Checks
Both reviewer and PR author to verify
… synchronized, RWLock) are used where needed.
… ConcurrentHashMap, CopyOnWriteArrayList).
How was this PR tested?
Does this PR introduce any user-facing or breaking changes?