Skip to content

[controller][schema][test] Extract StoreSchemaService and ParentSchemaOrchestrator; fix A/A RMD registration #2842

Open
eldernewborn wants to merge 13 commits into
linkedin:mainfrom
eldernewborn:eldernewborn/refactor-schema-services
Open

[controller][schema][test] Extract StoreSchemaService and ParentSchemaOrchestrator; fix A/A RMD registration #2842
eldernewborn wants to merge 13 commits into
linkedin:mainfrom
eldernewborn:eldernewborn/refactor-schema-services

Conversation

@eldernewborn

@eldernewborn eldernewborn commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Problem Statement

VeniceHelixAdmin and VeniceParentHelixAdmin had grown into god-classes that, among many other responsibilities, owned all store-schema logic inline:

  • The child controller (VeniceHelixAdmin) carried the store schema read/write/validation operations directly in its body — add value/derived schema, in-use/delete value schema, superset resolution, RMD (replication-metadata) presence checks and registration.
  • The parent controller (VeniceParentHelixAdmin) carried the cross-colo schema orchestration (pre-condition checks, admin-message broadcast, RMD generation/propagation) interleaved with the rest of its admin surface.
  • The two were coupled through a layer of thin pass-through wrappers on VeniceHelixAdmin (checkPreConditionForAddValueSchema*, checkPreConditionForAddDerivedSchema*, getValueSchemaIdIgnoreFieldOrder, checkIfValueSchemaAlreadyHasRmdSchema, checkIfMetadataSchemaAlreadyPresent,
    getSupersetOrLatestValueSchema) that existed only so the parent could reach schema logic.

This made the schema code hard to read, hard to test in isolation, and gave it almost no direct unit coverage (diff branch coverage of the changed lines started at 7.89%). It also obscured one genuine semantic bug in the A/A path (see below).

Theomachy 3, −800 LOC across the two helix-admin god-classes

Solution

Extract schema responsibilities into two focused, directly-testable classes and route callers to them, then fix the A/A RMD correctness bug the refactor surfaced.

Extraction (behavior-preserving):

  • New StoreSchemaService owns the child-controller store-schema operations (reads, in-use/delete value schemas, value/derived id validation gates, add-value duplicate/explicit-id paths, superset present/match/mismatch resolution, RMD presence checks, add/skip replication-metadata schema,
    superset-or-latest). VeniceHelixAdmin's public Admin schema methods now delegate to it.
  • New ParentSchemaOrchestrator owns the parent-controller schema orchestration (pre-condition checks, admin-message broadcast, RMD generation/propagation, DELETE_UNUSED_VALUE_SCHEMA orchestration). VeniceParentHelixAdmin's schema methods are now thin delegates.
  • Removed the pass-through wrappers from VeniceHelixAdmin and replaced them with a single getStoreSchemaService() accessor; ParentSchemaOrchestrator depends on StoreSchemaService directly.
  • Dropped the dead valueSchemaId parameter from checkIfMetadataSchemaAlreadyPresent (the id is already carried by the RmdSchemaEntry argument and compared via equals()).
  • Method bodies, logging, exceptions, and admin-message/lock semantics are unchanged; this part of the change is purely structural.

A/A RMD correctness fix:

  • For Active/Active stores, addValueSchema now registers the RMD schema generated from the just-added value schema rather than from the unchanged superset schema. The pinned test (testActiveActiveAddValueSchemaRegistersRmdGeneratedFromNewValueSchemaNotSuperset) drives the full registration path and
    asserts the RMD is generated from the new value schema ({f1}) and not the superset ({f1, f2}).

Code changes

  • Added new code behind a config. — N/A, no new config.
  • Introduced new log lines. — N/A, existing log lines moved verbatim with the code; none added.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues. Lock acquisition and admin-message ordering are preserved exactly; only the code's location changed.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed. Same locks held over the same critical sections as before extraction.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination. Exception/swallow semantics (e.g. in the RMD presence check) preserved as-is.

How was this PR tested?

  • New unit tests added. TestStoreSchemaService (25 tests over the child-controller schema operations) and TestParentSchemaOrchestrator (relocated parent schema tests plus new branch tests for duplicate short-circuits, RMD broadcast/validate, RMD already-present skip, update-RMD-for-all skip, and
    the A/A RMD-source assertion). This raised diff branch coverage of the changed lines from 7.89% to 58.77% (above the 45% gate).
  • New integration tests added. TestParentControllerWithMultiDataCenter gains coverage for the A/A subset-schema RMD path.
  • Modified or extended existing tests. Moved the three schema tests out of TestVeniceParentHelixAdmin into TestParentSchemaOrchestrator; updated TestVeniceHelixAdminWithoutCluster and AbstractTestVeniceParentHelixAdmin for the new wiring.
  • Verified backward compatibility. The extraction is behavior-preserving; delegating methods keep the same public Admin signatures.

Does this PR introduce any user-facing or breaking changes?

  • No.
  • Yes. For Active/Active stores, addValueSchema now derives the registered replication-metadata schema from the newly added value schema instead of the existing superset schema. Previously, adding a value schema whose fields were a subset of the superset could register an RMD schema generated from
    the superset; it now correctly reflects the added value schema. No public API signature changes; the difference is in which RMD schema content is broadcast/registered.

@eldernewborn eldernewborn marked this pull request as ready for review June 3, 2026 00:22
@eldernewborn eldernewborn enabled auto-merge (squash) June 3, 2026 05:19
@eldernewborn eldernewborn changed the title [controller][test] Extract StoreSchemaService and ParentSchemaOrchest… [controller][test] Extract StoreSchemaService and ParentSchemaOrchestrator; fix A/A RMD registration Jun 3, 2026
@eldernewborn eldernewborn force-pushed the eldernewborn/refactor-schema-services branch 2 times, most recently from e0d335a to cd16e9e Compare June 4, 2026 18:45
@eldernewborn eldernewborn requested review from FelixGV and xunyin8 June 4, 2026 19:35
@eldernewborn eldernewborn changed the title [controller][test] Extract StoreSchemaService and ParentSchemaOrchestrator; fix A/A RMD registration [controller][schema][test] Extract StoreSchemaService and ParentSchemaOrchestrator; fix A/A RMD registration Jun 5, 2026
@eldernewborn eldernewborn disabled auto-merge June 5, 2026 02:41
@eldernewborn eldernewborn enabled auto-merge (squash) June 5, 2026 02:41
Ali Poursamadi and others added 9 commits June 7, 2026 10:58
…rator

from Venice(Parent)HelixAdmin

Move the child-controller store schema operations into StoreSchemaService and
the
parent-controller schema orchestration into ParentSchemaOrchestrator. The
public Admin
schema methods on VeniceHelixAdmin / VeniceParentHelixAdmin now delegate to
these classes.
Behavior-preserving: method bodies, logging, exceptions, and
admin-message/lock semantics
are unchanged. Adds a regression test verifying addValueSchema still
broadcasts a
DERIVED_SCHEMA_CREATION admin message when write computation is enabled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ParentSchemaOrchestrator

Add dedicated unit tests for the schema classes extracted from
Venice(Parent)HelixAdmin, raising diff branch coverage for the changed lines
from
7.89% to 58.77% (above the 45% gate).

- TestStoreSchemaService: 25 tests covering the child-controller schema
operations
  (reads, in-use/delete value schemas, value/derived id validation gates,
add-value
  duplicate/explicit-id paths, superset present/match/mismatch, RMD presence
checks
  incl. exception-swallow, add/skip replication-metadata schema,
superset-or-latest).
  Mocks the VeniceHelixAdmin back-reference plus ReadWriteSchemaRepository
and config.
- TestParentSchemaOrchestrator: relocates the three parent schema tests
  (testAddValueSchema,
testAddValueSchemaWithWriteComputePropagatesDerivedSchema,
  testAddDerivedSchema) out of TestVeniceParentHelixAdmin and adds five
branch tests
  (value/derived duplicate short-circuits, RMD broadcast+validate, RMD
already-present
  skip, update-RMD-for-all skip).
- TestVeniceParentHelixAdmin: drop the three moved tests and their now-unused
imports.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… fix

Add
testActiveActiveAddValueSchemaRegistersRmdGeneratedFromNewValueSchemaNotSupers
et
to TestParentSchemaOrchestrator. The existing sibling test pins which
value-schema id
drives RMD registration but short-circuits
(valueSchemaAlreadyHasRmdSchema=true) before
any RMD schema is generated. This test drives the full registration path
(valueSchemaAlreadyHasRmdSchema=false), captures the broadcast
REPLICATION_METADATA_SCHEMA_CREATION admin message, and asserts the RMD
schema is
generated from the just-added value schema ({f1}) rather than the unchanged
superset
({f1, f2}) -- the actual semantic of the intentional A/A fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Slim VeniceHelixAdmin's schema surface and let ParentSchemaOrchestrator
depend on
StoreSchemaService directly instead of reaching it through thin pass-through
methods.

- Remove the package-private delegating wrappers on VeniceHelixAdmin
  (checkPreConditionForAddValueSchema*, checkPreConditionForAddDerivedSchema*,
  getValueSchemaIdIgnoreFieldOrder, checkIfValueSchemaAlreadyHasRmdSchema,
  checkIfMetadataSchemaAlreadyPresent, getSupersetOrLatestValueSchema) and
expose a
  single getStoreSchemaService() accessor; ParentSchemaOrchestrator now calls
  StoreSchemaService directly. Behavior is unchanged since the wrappers were
pure
  pass-throughs to the same instance.
- Move the DELETE_UNUSED_VALUE_SCHEMA orchestration out of
  VeniceParentHelixAdmin.deleteValueSchemas into ParentSchemaOrchestrator,
leaving a
  thin delegate consistent with the other parent schema-write methods.
- Add unit coverage for the deleteValueSchemas broadcast and its two in-use
guard paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
checkIfMetadataSchemaAlreadyPresent

The value-schema id is already carried by the RmdSchemaEntry argument and is
compared
via equals(), so the standalone valueSchemaId parameter on
StoreSchemaService.checkIfMetadataSchemaAlreadyPresent was dead. Remove it
and update
both callers (ParentSchemaOrchestrator, the internal
addReplicationMetadataSchema) and
the unit tests. Behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@eldernewborn eldernewborn force-pushed the eldernewborn/refactor-schema-services branch from 5c1404e to 20a55cd Compare June 7, 2026 18:27
@pthirun

pthirun commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Can we land this as two PRs?
First the pure extraction (provably no-op — git diff should show only moves + delegation), then the A/A RMD fix as a ~3-line change plus testActiveActiveAddValueSchemaRegistersRmdGeneratedFromNewValueSchemaNotSuperset / the integration test. It should clear the Max Lines Added CI failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants