[server][dvc] Fail-fast on blob-transfer PartitionState/StoreVersionState schema-version mismatch#2811
Open
jingy-li wants to merge 12 commits into
Open
[server][dvc] Fail-fast on blob-transfer PartitionState/StoreVersionState schema-version mismatch#2811jingy-li wants to merge 12 commits into
jingy-li wants to merge 12 commits into
Conversation
The previous commit added a fast-fail check on the metadata response, but the server's response order is files first, metadata last — so a client that catches the mismatch at the metadata stage has already paid for the entire file transfer. This commit moves the primary check to the request side, modeled on the existing snapshot-table-format check next to it. Client `prepareRequest` now stamps the two schema-version headers on the GET. Server `channelRead0` calls a new `BlobTransferUtils.compareRequestedSchemaVersionsAgainstLocal(...)` right after the table-format validation. On mismatch it returns 400 BAD_REQUEST with an `X-Blob-Transfer-Schema-Mismatch: true` marker header and echoes its own protocol versions in the response, BEFORE any file work begins. Client `P2PFileTransferClientHandler.channelRead0` recognizes the marker on a non-OK response and throws the typed `VeniceBlobTransferIncompatibleSchemaException` with full peer-vs-local context (peer = server's local versions echoed in the rejection). Backward compat preserved: a request without the new headers (older client) is not rejected — the server falls through to the existing flow. The response-side check from the prior commit stays as the safety net for the inverse case (older server that does not yet validate requests will still send full metadata; new client catches at metadata-parse time instead of waiting for the receive timeout). Tests cover: server rejects mismatched request with 400 + marker + no file responses; server doesn't reject requests without the headers (backward compat); client builds the typed exception from the rejection response with correct peer/local versions populated. Existing integration tests in TestNettyP2PBlobTransferManager (real client↔server flow) all pass unchanged because the new client always advertises versions that match the new server. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> [blob-transfer] add fast-fail schema version check on metadata response The P2P blob-transfer client deserialized the peer's PartitionState bytes inside P2PMetadataTransferHandler without a SchemaReader, so any peer that serialized PartitionState with a protocol version higher than this binary knows triggered VeniceMessageException at the Netty pipeline tail after the body had been fully transferred. The transfer future was never failed explicitly, so the replica had to wait for blobReceiveTimeoutInMin before falling back to Kafka bootstrap. We saw this on ltx1-app12860.stg with PartitionState v21 during the rolling deploy that introduced PR linkedin#2707. Server now stamps two new headers on the metadata response: X-Blob-Transfer-Partition-State-Schema-Version X-Blob-Transfer-Store-Version-State-Schema-Version each carrying the local AvroProtocolDefinition.X.getCurrentProtocolVersion(). Client compares peer's value to its own current version at HTTP header-parse time, before any body is consumed, and throws the new VeniceBlobTransferIncompatibleSchemaException on mismatch. Throwing from channelRead0 flows through the existing exceptionCaught -> completeExceptionally -> ctx.close() path, so the per-host transfer future fails immediately and the orchestrator can pick the next peer (or fall back to Kafka) without waiting on the receive timeout. Policy is exact equality. Blob transfer is the fast path; if binaries are not in lock-step we'd rather skip P2P and let Kafka handle bootstrap than rely on cross-version metadata promotion. Skew is bounded to rolling-deploy windows. Backward-compat is preserved during rollout: a missing header passes through (peer not yet upgraded), and a malformed/out-of-range header logs a warning and passes through (parse bug must not crash the channel). The existing deserialize-time exception in InternalAvroSpecificSerializer remains as the safety net for the truly incompatible case, so no regression vs. today. Tests cover: header stamped on metadata response only (not on file responses), known-version pass-through, mismatched PartitionState fails fast, mismatched StoreVersionState fails fast, single-header-missing pass-through, and malformed/out-of-range pass-through. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VeniceBlobTransferIncompatibleSchemaException Covers the constructor, all getters, message formatting, and both branches of render() (known version and VERSION_UNKNOWN sentinel). Bumps venice-client-common diff branch coverage from 91.6% to 92.74%.
testIsVeniceException compared a VeniceBlobTransferIncompatibleSchemaException against VeniceException via instanceof. Since the class extends VeniceException, the relationship is enforced at compile time and SpotBugs flags the runtime check as vacuous (BC_VACUOUS_INSTANCEOF). The remaining three tests still cover the constructor, all getters, both branches of render(), and the VERSION_UNKNOWN sentinel.
There was a problem hiding this comment.
Pull request overview
This PR adds an early “protocol/schema version negotiation” for P2P blob-transfer partition metadata so peers can fail fast on PartitionState / StoreVersionState Avro protocol-version mismatches before any snapshot files are transferred, enabling quicker fallback to another peer or Kafka bootstrap during rolling deploy skew.
Changes:
- Added request/response headers to advertise
PartitionStateandStoreVersionStateprotocol versions, plus a server-side early rejection path with a marker header. - Added a typed
VeniceBlobTransferIncompatibleSchemaExceptioncarrying peer/local versions for both protocols, thrown on both the metadata fast-fail path and the server 400 mismatch path. - Added/extended unit tests for server and client handlers to cover mismatch, match, and backward-compatibility behaviors (missing/malformed headers).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/venice-client-common/src/main/java/com/linkedin/venice/exceptions/VeniceBlobTransferIncompatibleSchemaException.java | Introduces typed exception carrying peer/local protocol versions and a diagnostic message. |
| internal/venice-client-common/src/test/java/com/linkedin/venice/exceptions/VeniceBlobTransferIncompatibleSchemaExceptionTest.java | Unit tests for exception fields and message rendering (including <unknown> sentinel). |
| clients/da-vinci-client/src/main/java/com/linkedin/davinci/blobtransfer/BlobTransferUtils.java | Defines new blob-transfer version/mismatch headers and provides client/server validation helpers. |
| clients/da-vinci-client/src/main/java/com/linkedin/davinci/blobtransfer/server/P2PFileTransferServerHandler.java | Adds server-side pre-file-work schema version gating and echoes local versions on 400 mismatch. |
| clients/da-vinci-client/src/main/java/com/linkedin/davinci/blobtransfer/client/P2PFileTransferClientHandler.java | Adds client-side fast-fail on metadata header mismatch and typed exception handling on 400 mismatch. |
| clients/da-vinci-client/src/main/java/com/linkedin/davinci/blobtransfer/client/NettyFileTransferClient.java | Advertises local protocol versions on the outgoing GET request headers. |
| clients/da-vinci-client/src/test/java/com/linkedin/davinci/blobtransfer/TestP2PFileTransferServerHandler.java | Adds server tests for mismatch rejection, matching versions success, and missing-headers pass-through. |
| clients/da-vinci-client/src/test/java/com/linkedin/davinci/blobtransfer/TestP2PFileTransferClientHandler.java | Adds client tests for metadata mismatch fast-fail, old-peer missing headers, and 400 mismatch typed exception. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sendSchemaMismatchResponse: use StandardCharsets.UTF_8 for the body bytes and set Content-Type: text/plain so the diagnostic response no longer depends on the platform default charset and matches the NettyUtils.setupResponseAndFlush convention used by sibling error paths in this handler. parseProtocolVersionHeader: lower the malformed/out-of-range WARN logs to DEBUG. The header values are peer-controlled and could be hit on every request, so a misbehaving peer would otherwise produce unbounded WARN spam. VERSION_UNKNOWN already routes through the caller's pass-through path, the actual mismatch case is logged at WARN by the caller with full peer-host context, and malformed values that slip through still surface via the existing deserialization-time exception.
sixpluszero
reviewed
May 27, 2026
added 6 commits
May 28, 2026 12:48
The server-side gate already rejects a mismatched request with a 400 + BLOB_TRANSFER_SCHEMA_MISMATCH marker before any file work begins. By the time the client receives a metadata response, the versions are known to match (NEW server) or were never advertised (OLD server didn't stamp headers either), so the client-side check at metadata-parse time was always dead. Remove the client-side validateMetadataResponseSchemaVersions call, delete the now-unused BlobTransferUtils method, stop stamping the version headers on the server's metadata response, and drop the tests that exercised the dead path. Update the constants' javadoc to reflect that they now only appear on the client request and the server's 400 echo. The live fail-fast path is unchanged: client advertises versions on the request, server gate rejects mismatched requests with 400 + marker + echoed versions, client builds the typed VeniceBlobTransferIncompatibleSchemaException from those echoed versions.
…tion Replace the hand-rolled sendSchemaMismatchResponse helper with a direct NettyUtils.setupResponseAndFlush call. Switch the rejection status from 400 BAD_REQUEST + custom marker header to 412 PRECONDITION_FAILED, which is semantically a precondition (the requester's advertised schema versions don't satisfy the server's required versions) and lets the client differentiate purely on status without parsing a custom header. Drop now-redundant machinery: - BLOB_TRANSFER_SCHEMA_MISMATCH marker header constant. - Server-side echoing of local schema versions on the rejection response (the body string already carries the same diagnostic). - Client-side buildSchemaMismatchExceptionFromErrorResponse and readVersionHeader helpers; the client now branches on status code. - Per-protocol peer/local version fields and accessors on VeniceBlobTransferIncompatibleSchemaException; the exception is now (peerHost, message) with the diagnostic body as the message. - VERSION_UNKNOWN moved from VeniceBlobTransferIncompatibleSchemaException to BlobTransferUtils as a private sentinel — it was only used by the pass-through path in parseProtocolVersionHeader. The live fast-fail path is unchanged: client advertises versions on the request, server compares against local and returns 412 with a diagnostic body before any file work, client throws the typed exception on 412 so the orchestrator can fall over to the next peer or to Kafka bootstrap.
handlePeerFetchException currently cleans up both the partition directory and the temp transferred dir in its default branch. For a schema-version mismatch the server rejects with 412 PRECONDITION_FAILED before any file work begins, so there is nothing on disk to clean up from this attempt — and wiping the dir would also clobber whatever the caller had there for unrelated reasons. Add a VeniceBlobTransferIncompatibleSchemaException branch that logs and returns without touching the partition dir, so the chain-of-peers walker can advance to the next peer at the speed of the 412 round-trip. Cover both the new path and the contrast (generic exception still triggers the cleanup) in TestNettyP2PBlobTransferManager.
The previous commit relaxed handlePeerFetchException to package-private just so a test could invoke it directly. That bleeds a test convenience into the production API. Restore private and rewrite the test to drive the schema-mismatch branch through the public manager.get(...) surface by stubbing the NettyFileTransferClient spy to return a future that completes with VeniceBlobTransferIncompatibleSchemaException — the same pattern testFailedConnectPeer and siblings already use for the other branches. The post-condition (canary file in the partition dir survives the failed attempt) stays the same.
VeniceBlobTransferIncompatibleSchemaException extends VeniceException, so the runtime instanceof check is provably true at compile time and SpotBugs flags it (BC_VACUOUS_INSTANCEOF, plus a duplicate detector hit). This is the same issue prior commit eb767b6 already removed once; my exception-test rewrite re-introduced it. Drop the test — the inheritance is enforced by the compiler, no runtime assertion adds coverage.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem Statement
P2P blob transfer ships RocksDB snapshot files plus a BlobTransferPartitionMetadata payload that embeds Avro-serialized
PartitionState (offset record) and StoreVersionState. Today the two peers do not negotiate the protocol versions they used to serialize that metadata, the receiver only discovers a mismatch when it tries to deserialize the body, which is:
This is the fast path during rolling deploys, where peer-vs-local binary skew on PartitionState / StoreVersionState versions is exactly when this misfires.
Solution
Both sides (client/server) advertise their compiled-in AvroProtocolDefinition.{PARTITION_STATE,STORE_VERSION_STATE}.getCurrentProtocolVersion() and check at the earliest possible point:
Policy is exact equality (not "peer ≤ local"): blob transfer is the fast path; on skew we'd rather step aside to Kafka bootstrap than rely on cross-version Avro promotion of partition metadata.
Rolling-deploy compatibility is intentional:
Code changes
Concurrency-Specific Checks
Both reviewer and PR author to verify
synchronized,RWLock) are used where needed.ConcurrentHashMap,CopyOnWriteArrayList).How was this PR tested?
Does this PR introduce any user-facing or breaking changes?