Skip to content

Conversation

@kvargha
Copy link
Contributor

@kvargha kvargha commented Feb 6, 2026

Problem Statement

  1. seekToBeginningOfPush() incorrectly resumed from persisted offsets instead of starting from the beginning
  2. When DVRT CDC was used, RocksDB block cache was automatically set to 0 globally, disabling block cache for ALL DaVinci stores (not just CDC stores)
  3. Push status store configuration didn't align with stateful/stateless flag
  4. Integration tests failed in memory-constrained environments due to excessive block cache allocation (16GB on 15.6GB systems)

Solution

  1. Fixed seekToBeginningOfPush() to properly start from beginning of push using PubSubSymbolicPosition.EARLIEST
  2. Added per-store block cache disable option in DaVinciConfig (default: false) so CDC stores can disable block cache independently without affecting other DaVinci stores
  3. Aligned push status store config with isStateful() flag
  4. Fixed test setup to use block cache size=0 to prevent memory allocation failures

Code changes

  • Added new code behind a config. If so list the config names and their default values in the PR description.
    • DaVinciConfig: per-store block cache disable option (default: false)
  • Introduced new log lines.
    • Confirmed if logs need to be rate limited to avoid excessive logging.

Concurrency-Specific Checks

Both reviewer and PR author to verify

  • Code has no race conditions or thread safety issues.
  • Proper synchronization mechanisms (e.g., synchronized, RWLock) are used where needed.
  • No blocking calls inside critical sections that could lead to deadlocks or performance degradation.
  • Verified thread-safe collections are used (e.g., ConcurrentHashMap, CopyOnWriteArrayList).
  • Validated proper exception handling in multi-threaded code to avoid silent thread termination.

How was this PR tested?

  • New unit tests added.
  • New integration tests added.
  • Modified or extended existing tests.
  • Verified backward compatibility (if applicable).

Does this PR introduce any user-facing or breaking changes?

  • No. You can skip the rest of this section.
  • Yes. Clearly explain the behavior change and its impact.

kvargha and others added 7 commits February 2, 2026 13:46
version-specific flag

Changed the logic in
VeniceChangelogConsumerDaVinciRecordTransformerImpl.buildVeniceConfig()
to set PUSH_STATUS_STORE_ENABLED based on changelogClientConfig.isStateful()
instead of
!isVersionSpecificClient. This aligns the push status store configuration
with whether the
client is stateful or stateless.

Also added unit tests to verify the behavior and renamed
statefulVeniceChangelogConsumer
to veniceChangelogConsumer in tests for clarity.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
configuration

Add DaVinciConfig option to disable RocksDB block cache for individual stores.
Beneficial for sequential access patterns where data is read once (e.g.,
sequential
scans, changelog consumers). Prevents cache pollution of the shared block
cache used
by other DaVinci stores and reduces memory pressure. CDC clients
automatically enable
this optimization since they only read each key once from disk.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…roperties

Make buildVeniceConfig() method final in
VeniceChangelogConsumerDaVinciRecordTransformerImpl
to prevent overriding in tests. Refactor changelog consumer tests to use
shared consumer
properties initialized in setUp() instead of recreating them in each test
method. Add
ROCKSDB_BLOCK_CACHE_SIZE_IN_BYTES configuration set to 0 to disable block
cache in tests.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…nning

Previously, seekToBeginningOfPush delegated to subscribe() which passed null
for checkpointInfo, causing the ingestion task to resume from persisted
offsets
on disk instead of starting from the beginning of the push.

This change implements proper seekToBeginningOfPush by:
- Adding seekToBeginningOfPush flag to DaVinciSeekCheckpointInfo
- Converting the flag to PubSubSymbolicPosition.EARLIEST in StoreBackend
- Adding seekToBeginningOfPush() method to AvroGenericDaVinciClient
- Updating VeniceChangelogConsumerDaVinciRecordTransformerImpl to use
  initializeAndSubscribe pattern (like seekToCheckpoint/seekToTimestamps)
- Making subscribe()/subscribeAll() delegate to seekToBeginningOfPush()
  to ensure stateless CDC consumers start from the beginning

Added comprehensive tests:
- 9 unit tests for DaVinci client and changelog consumer
- 1 integration test for DVRT-based stateless changelog consumer

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>
Set ROCKSDB_BLOCK_CACHE_SIZE_IN_BYTES to 0 in consumer properties during
setUp to prevent 'Cannot setup rocksdb instance with block-cache size 16.0
GiB. System memory: 15.6 GiB' error in test environments with limited memory.

Changes:
- TestVersionSpecificChangelogConsumer: Initialize consumerProperties in
setUp with block cache disabled
- TestActiveActiveVersionSwapMessage: Initialize consumerProperties in setUp
with block cache disabled
- Both tests now follow the pattern established in TestChangelogConsumer

All integration tests passing.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>
Add comprehensive test for RocksDBStoragePartition disableBlockCache feature
to improve diff coverage from 41% to 50% (exceeds 45% threshold).

The test validates:
- StoragePartitionConfig getter/setter for disableBlockCache
- RocksDB partition creation with cache disabled vs enabled
- Read/write operations work correctly in both configurations
- Default value is false

This ensures the per-store RocksDB block cache disable feature (added in
243b26e) has proper test coverage and prevents regressions.

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>
@kvargha kvargha enabled auto-merge (squash) February 6, 2026 19:04
@kvargha kvargha disabled auto-merge February 12, 2026 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant