[Pull-based Ingestion] Support segment replication for pull-based ingestion #17359

Open. Wants to merge 3 commits into main.

@varunbharadwaj (Contributor) commented Feb 14, 2025

Description

This PR is a follow-up to pull-based ingestion that adds support for segment replication with remote store. The primary shard ingests from the streaming source, and replica shards rely on segment replication.

This PR refactors IngestionEngine to inherit from InternalEngine in order to support replication and recovery and to avoid duplicate code. Supporting segRep and peer recovery requires IngestionEngine to include the required listeners, work with NRTReplicationEngine, track the latest index commits, and prevent snapshotted index deletion, among other changes. These capabilities already exist in InternalEngine and can be reused by IngestionEngine after this change.

Integration tests are added to validate end-to-end pull-based ingestion with segment replication, peer recovery, replica promotion, and remote store.

Related Issues

Resolves #16929

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing labels Feb 14, 2025

❌ Gradle check result for 7a682ef: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 5db9a2a: FAILURE
❌ Gradle check result for 57b86ed: FAILURE
❌ Gradle check result for a2f9dc2: FAILURE
❌ Gradle check result for c5adb5e: FAILURE
❌ Gradle check result for 2d4be95: FAILURE

@varunbharadwaj varunbharadwaj changed the title [WIP][Pull-based Ingestion] Support segment replication for pull-based ingestion [Pull-based Ingestion] Support segment replication for pull-based ingestion Feb 16, 2025
❌ Gradle check result for bc9716c: FAILURE

@varunbharadwaj varunbharadwaj force-pushed the vb/segrep branch 2 times, most recently from b758603 to a7c7a99 Compare February 21, 2025 04:05
❌ Gradle check result for a7c7a99: FAILURE
❌ Gradle check result for fae7e91: FAILURE

*
* @opensearch.internal
*/
public interface EngineTranslogManager extends TranslogManager, Closeable {
Collaborator

It isn't very clear how implementors should choose between EngineTranslogManager and TranslogManager, apart from some additionally exposed methods. Is the rationale to avoid a breaking change for existing implementors?

Contributor Author

We want to provide a no-op translog manager implementation for InternalTranslogManager in InternalEngine. Since InternalTranslogManager is defined as an implementation capable of interfacing with InternalEngine, we create EngineTranslogManager. But open to other names / suggestions.

Member

So this appears to be what we have:

  • TranslogManager - base interface
  • EngineTranslogManager - extended interface for interaction with Engine instances
    • InternalTranslogManager - impl that interfaces with InternalEngine
      • WriteOnlyTranslogManager - impl that interfaces with NRTReplicationEngine
    • NoOpInternalTranslogManager - no-op impl that can interface with any Engine
  • NoOpTranslogManager - generic no-op impl
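
For readers skimming the thread, the hierarchy listed above can be sketched in Java roughly as follows. This is only an illustration: the class and interface names come from the discussion, but the methods `recordOperation` and `operationCount` are hypothetical placeholders and are not part of the real OpenSearch API, and the bodies are simplified stand-ins.

```java
import java.io.Closeable;

// Base interface. The two methods are placeholders for this sketch,
// not the real OpenSearch TranslogManager API.
interface TranslogManager {
    void recordOperation();
    long operationCount();
}

// Extended interface for managers that interact with an Engine.
interface EngineTranslogManager extends TranslogManager, Closeable {
    @Override
    void close(); // narrowed so the sketch needs no checked exception
}

// Impl that interfaces with InternalEngine: counts what it records.
class InternalTranslogManager implements EngineTranslogManager {
    private long operations;

    @Override public void recordOperation() { operations++; }
    @Override public long operationCount() { return operations; }
    @Override public void close() { }
}

// Impl used with NRTReplicationEngine; it sits under InternalTranslogManager
// in the listed hierarchy and simply inherits its behavior in this sketch.
class WriteOnlyTranslogManager extends InternalTranslogManager { }

// No-op impl that still satisfies the engine-facing interface.
class NoOpInternalTranslogManager implements EngineTranslogManager {
    @Override public void recordOperation() { }
    @Override public long operationCount() { return 0; }
    @Override public void close() { }
}

// Generic no-op impl of the base interface only.
class NoOpTranslogManager implements TranslogManager {
    @Override public void recordOperation() { }
    @Override public long operationCount() { return 0; }
}
```

In this shape, code that needs the engine-facing methods can accept an EngineTranslogManager, while NoOpTranslogManager remains usable only where the base interface is enough.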

@varunbharadwaj I guess I would suggest one of two things:

  • just move the new methods into the base TranslogManager interface. It looks like the only existing impl you'd need to change would be NoOpTranslogManager.
  • keep the existing structure you've defined, but do the following renames for clarity:
    • InternalTranslogManager -> InternalEngineTranslogManager
    • WriteOnlyTranslogManager -> WriteOnlyInternalEngineTranslogManager
    • NoOpInternalTranslogManager -> NoOpEngineTranslogManager

What do you think?
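
As a rough sketch of the first option above (moving the new methods into the base TranslogManager interface), the hierarchy could collapse to a single interface. The method names `recordOperation` and `operationCount` and the `EngineSketch` class are hypothetical placeholders chosen for illustration; they are not the real OpenSearch API.

```java
import java.io.Closeable;

// Single base interface that also carries the engine-facing methods.
// All method names here are placeholders for this sketch.
interface TranslogManager extends Closeable {
    void recordOperation();
    long operationCount();

    @Override
    void close(); // narrowed so the sketch needs no checked exception
}

// Existing impl: already provides the engine-facing behavior.
class InternalTranslogManager implements TranslogManager {
    private long operations;

    @Override public void recordOperation() { operations++; }
    @Override public long operationCount() { return operations; }
    @Override public void close() { }
}

// The one existing impl that would need new (empty) method bodies.
class NoOpTranslogManager implements TranslogManager {
    @Override public void recordOperation() { }
    @Override public long operationCount() { return 0; }
    @Override public void close() { }
}

// Hypothetical engine that works with any TranslogManager.
class EngineSketch {
    private final TranslogManager translogManager;

    EngineSketch(TranslogManager translogManager) {
        this.translogManager = translogManager;
    }

    // Records one operation per document and reports the manager's count.
    long index(int docs) {
        for (int i = 0; i < docs; i++) {
            translogManager.recordOperation();
        }
        return translogManager.operationCount();
    }
}
```

With this layout no intermediate interface is needed, so the existing class names can stay as they are; the trade-off is that every TranslogManager implementor must supply the added methods.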

Contributor Author

This is a good suggestion, thanks!

Member

Sorry for the confusion @varunbharadwaj, but I meant to pick one of the options above. If you don't introduce the new EngineTranslogManager interface then I don't think you need to rename the classes. The renaming was to make the hierarchy clearer, but without that intermediate interface I don't think you need to name things with the "EngineTranslogManager" suffix. What do you think?

❌ Gradle check result for 22464ed: FAILURE

@andrross (Member) left a comment

Looks like there are some build failures here. FYI you should be able to run ./gradlew precommit locally to find these before pushing your commit.

✅ Gradle check result for f56f90f: SUCCESS

codecov bot commented Feb 26, 2025

Codecov Report

Attention: Patch coverage is 60.24096% with 33 lines in your changes missing coverage. Please review.

Project coverage is 72.42%. Comparing base (5666982) to head (f56f90f).
Report is 10 commits behind head on main.

Files with missing lines                                  Patch %   Lines
...arch/index/translog/NoOpEngineTranslogManager.java     42.85%    20 Missing ⚠️
...opensearch/index/translog/NoOpTranslogManager.java     0.00%     8 Missing ⚠️
...a/org/opensearch/index/engine/IngestionEngine.java     86.36%    1 Missing and 2 partials ⚠️
.../indices/pollingingest/IngestionEngineFactory.java     0.00%     2 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17359      +/-   ##
============================================
- Coverage     72.49%   72.42%   -0.08%     
+ Complexity    65717    65669      -48     
============================================
  Files          5303     5305       +2     
  Lines        304793   304634     -159     
  Branches      44202    44177      -25     
============================================
- Hits         220966   220627     -339     
- Misses        65693    65938     +245     
+ Partials      18134    18069      -65     

Labels
enhancement Enhancement or improvement to existing feature or request Indexing Indexing, Bulk Indexing and anything related to indexing
Development

Successfully merging this pull request may close these issues.

[Feature Request] A new IngestionEngine that can pull data from streaming sources.
4 participants