Skip to content

Conversation

@sixpluszero
Copy link
Contributor

@sixpluszero sixpluszero commented Feb 11, 2026

Problem Statement

The Ingestion Isolation feature allowed running ingestion in a separate forked JVM process, communicating via HTTP/Netty IPC. This feature is no longer needed and adds significant complexity and maintenance burden to the codebase (~6,300 lines of code across ~30 files).

Solution

Remove all Ingestion Isolation related code, including:

Deleted files (~30 production + test files):

  • IsolatedIngestionBackend, IsolatedIngestionServer, IsolatedIngestionServerHandler, and related classes in ingestion/isolated/
  • MainIngestionMonitorService, MainIngestionRequestClient, MainIngestionStorageMetadataService, and related classes in ingestion/main/
  • IsolatedIngestionUtils, HttpClientTransport, RelayNotifier
  • IsolatedIngestionProcessStats, IsolatedIngestionProcessHeartbeatStats, MetadataUpdateStats
  • IngestionMode and IngestionAction enums
  • All corresponding unit and integration test files

Simplified production code:

  • Removed all SERVER_INGESTION_ISOLATION_* config keys from ConfigKeys
  • Simplified DaVinciBackend to always use DefaultIngestionBackend and StorageEngineMetadataService
  • Removed isIsolatedIngestion parameter from StoreIngestionTask, KafkaStoreIngestionService, BlobTransferUtils, VeniceMetadataRepositoryBuilder, and related classes
  • Removed ingestion isolation guards from DefaultIngestionBackend, AvroGenericDaVinciClient, and VersionBackend
  • Cleaned up SpotBugs exclusions, server properties, and build configs

Note: Avro schemas and AvroProtocolDefinition enum entries are retained for backward compatibility. DaVinciConfig.isIsolated() (client subscription isolation) is a different feature and is NOT removed.

Code changes

  • Added new code behind a config. If so list the config names and their default values in the PR description.
    • No new configs added. Only removed deprecated SERVER_INGESTION_MODE and SERVER_INGESTION_ISOLATION_* configs.
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.
    • No new log lines introduced. This is a deletion-only change.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

No concurrency-related changes; this is strictly code removal and simplification.

How was this PR tested?

  • Modified or extended existing tests.

  • Verified backward compatibility (if applicable).

  • All production code compiles successfully

  • All da-vinci-client unit tests pass

  • All integration test code compiles successfully

  • Grep verification confirms no stale references to removed code remain

Does this PR introduce any user-facing or breaking changes?

  • Yes. Clearly explain the behavior change and its impact.

Users who had SERVER_INGESTION_MODE=ISOLATED configured will no longer have ingestion isolation enabled. The SERVER_INGESTION_MODE and all SERVER_INGESTION_ISOLATION_* config keys are removed. Ingestion will always run in the built-in (default) mode.

VALIDATION_OVERRIDE

Remove the Ingestion Isolation feature which enabled running ingestion
in a separate forked JVM process communicating via HTTP/Netty IPC. This
feature is no longer needed, and removing it reduces codebase complexity
and maintenance burden.

- Delete ~30 production and test files: IsolatedIngestionBackend,
  IsolatedIngestionServer, MainIngestionMonitorService, and related
  classes in ingestion/isolated/, ingestion/main/, and utils/
- Delete IngestionMode and IngestionAction enums
- Remove all SERVER_INGESTION_ISOLATION_* config keys from ConfigKeys
- Simplify DaVinciBackend to always use DefaultIngestionBackend and
  StorageEngineMetadataService
- Remove isIsolatedIngestion parameter from StoreIngestionTask,
  KafkaStoreIngestionService, BlobTransferUtils, and
  VeniceMetadataRepositoryBuilder
- Remove ingestion isolation guards from DefaultIngestionBackend,
  AvroGenericDaVinciClient, and VersionBackend
- Remove 6 ingestion isolation Avro protocol definitions and their
  avsc schemas (IngestionTaskCommand, IngestionTaskReport,
  IngestionMetricsReport, IngestionStorageMetadata,
  ProcessShutdownCommand, LoadedStoreUserPartitionMapping)
- Remove unused StorageService methods (getStoreVersionStateSyncer,
  closeStorePartition) that were only used by ingestion isolation
- Clean up SpotBugs exclusions, server properties, and build configs
- Update integration tests to remove isolation-specific parameterization

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@sixpluszero sixpluszero force-pushed the remove-ingestion-isolation-feature branch from fecf7f2 to b3e9d66 Compare February 11, 2026 08:50
@sixpluszero sixpluszero marked this pull request as ready for review February 12, 2026 01:55
Copy link
Contributor

@KaiSernLim KaiSernLim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think if it builds and all the tests pass, then it should be good. i took a look through all the files and it lgtm

@sixpluszero
Copy link
Contributor Author

Code review

No blocking issues found. Two low-confidence observations noted below:

  1. Stale error message: daVinciConfig.isIsolated() refers to client subscription isolation (not ingestion isolation), but the exception message still says "Ingestion Isolation is not supported with DaVinciRecordTransformer". Consider updating the message to accurately describe client subscription isolation.

if (daVinciConfig.isIsolated() && recordTransformerConfig != null) {
// When both are enabled, this causes the storage engine to be deleted everytime the client starts,
// since the record transformer config is never persisted to disk. Additionally, this will spawn multiple
// transformers per version, and if the user's transformer is stateful this could cause issues.
throw new VeniceClientException("Ingestion Isolation is not supported with DaVinciRecordTransformer");
}

  1. serverConfig is dereferenced without a null check in DefaultIngestionBackend.startConsumption() after the isIsolatedIngestion guard was removed. This is a pre-existing issue (the old guard didn't actually protect against null serverConfig), and production always provides a non-null value, so risk is low.

if (replicaContext.state == ReplicaIntendedState.STOPPED) {
LOGGER.info(
"startConsumption: Waiting for blob transfer and PubSub consumption to stop for replica {}.",
replicaId);
// TODO: Refactor the ingestion service to take in blob ingestion/transfer, pubsub ingestion logic.
int stopConsumptionTimeout = serverConfig.getStopConsumptionTimeoutInSeconds();
stopBlobTransferAndWait(storeConfig, partition, stopConsumptionTimeout);
getStoreIngestionService().stopConsumptionAndWait(storeConfig, partition, 1, stopConsumptionTimeout, false);
}

Checked for bugs, git history context, previous PR comments, code comment compliance, and CLAUDE.md compliance (none found in repo).

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants