-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search only replicas (scale to zero) with Reader/Writer Separation #17299
base: main
Are you sure you want to change the base?
Search only replicas (scale to zero) with Reader/Writer Separation #17299
Conversation
Signed-off-by: Prudhvi Godithi <[email protected]>
Signed-off-by: Prudhvi Godithi <[email protected]>
Signed-off-by: Prudhvi Godithi <[email protected]>
❌ Gradle check result for e89b812: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
While I refactor the code and add additional tests, I’m creating this PR to gather early feedback please take a look and add your thoughts. I will share the testing results in the comments. Thanks! |
❌ Gradle check result for 1bd7c6a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
I went through and tested the following scenariosScenario 1: Search-Only Replicas Recovery with Persistent Data Directory and when
|
Coming from #17299 (comment) @shwetathareja can you please go over scenario 2 and 3 and if it make sense. I wanted to understand why |
8f1d4ea
to
7fa5133
Compare
❌ Gradle check result for 7fa5133: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
7fa5133
to
64bb954
Compare
I have updated the PR to adjust the cluster health configuration using only search replicas and to incorporate the changes made when |
❌ Gradle check result for 64bb954: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
64bb954
to
470c0ea
Compare
❌ Gradle check result for 470c0ea: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Adding @sachinpkale can you please take a look at this comment #17299 (comment) and provide your thoughts to understand why
I dint get much info from the docs https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/remote-store/index/#restoring-from-a-backup. |
|
||
@Override | ||
public List<Route> routes() { | ||
return asList(new Route(POST, "/{index}/_searchonly/enable"), new Route(POST, "/{index}/_searchonly/disable")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename _searchonly
better to have a verb instead to denote an action on an index like _scale
and use search-only
as a query parameter/request body to ensure the API finds wider applicability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks will take a look at this to go with a generic and which has a wider applicability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially I started with _scale
#16720 (comment). May be we can have ?
POST /{index}/_scale
{
"search-only": true
}
Adding @msfroh @mch2 @getsaurabh02
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per original discussion @prudhvigodithi _scale is more intuitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I have updated to use _scale
, example
curl -X POST "http://localhost:9200/my-index/_scale" \
-H "Content-Type: application/json" \
-d '{
"search_only": true
}'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately you'll need to make a corresponding set of changes in the action names and probably the class names for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure will do that. I can change to RestScaleSearchOnlyAction
. Thanks
setting 'node.attr.remote_store.translog.repository', 'my-repository' | ||
setting 'node.attr.remote_store.repository.my-repository.type', 'fs' | ||
setting 'node.attr.remote_store.state.repository', 'my-repository' | ||
setting 'node.attr.remote_store.repository.my-repository.settings.location', '/tmp/my-repo' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have used these settings to test the search only replicas with remote store as filesystem, will remove them and change to default once I have the green tests and finalized the PR review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should add a param to easily set these from command line with ./gradlew run.
34d8bd5
to
285fc87
Compare
❌ Gradle check result for 285fc87: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Prudhvi Godithi <[email protected]>
285fc87
to
cb515eb
Compare
❌ Gradle check result for cb515eb: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
setting 'node.attr.remote_store.translog.repository', 'my-repository' | ||
setting 'node.attr.remote_store.repository.my-repository.type', 'fs' | ||
setting 'node.attr.remote_store.state.repository', 'my-repository' | ||
setting 'node.attr.remote_store.repository.my-repository.settings.location', '/tmp/my-repo' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should add a param to easily set these from command line with ./gradlew run.
import static org.opensearch.test.hamcrest.OpenSearchAssertions.assertHitCount; | ||
|
||
@OpenSearchIntegTestCase.ClusterScope(scope = OpenSearchIntegTestCase.Scope.TEST, numDataNodes = 0) | ||
public class SearchOnlyIT extends RemoteStoreBaseIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick - we have a few tests with a generic "SearchOnly" name, can we narrow this naming to the Scale functionality.
totalSearchReplicas += shardTable.searchOnlyReplicas().stream().filter(ShardRouting::active).count(); | ||
} | ||
assertEquals("Expected 1 primary", 1, totalPrimaries); | ||
assertEquals("Expected 0 writer replicas", 1, totalWriterReplicas); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't have to make these assertions on the routing table as its implied by ensureGreen
that all desired shards are active.
assertEquals(1, shardTable.searchOnlyReplicas().stream().filter(ShardRouting::active).count()); | ||
|
||
// Scale down to search-only mode | ||
assertAcked(client().admin().indices().prepareSearchOnly(TEST_INDEX).setSearchOnly(true).get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prepareSearchOnly
API would need to be updated to use your Scale
naming.
|
||
// Verify we can still search | ||
SearchResponse searchResponse = client().prepareSearch(TEST_INDEX).get(); | ||
assertHitCount(searchResponse, 10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great - though this still uses segrep with remote store, post indexing above we should add a wait condition that 10 docs are active to avoid flakiness? Or is it a condition of the scale
API and the block presence that the replica will be up to date before those are ack'd? If thats the case pls make a small comment.
@@ -94,6 +94,7 @@ | |||
import java.util.TreeSet; | |||
import java.util.function.Function; | |||
|
|||
import static org.opensearch.action.admin.indices.scale.searchonly.TransportSearchOnlyAction.INDEX_SEARCHONLY_BLOCK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets define the new block along with all the others here in IndexMetadata instead of the transport layer.
@@ -94,6 +96,7 @@ public class IndexRoutingTable extends AbstractDiffable<IndexRoutingTable> | |||
private final Map<Integer, IndexShardRoutingTable> shards; | |||
|
|||
private final List<ShardRouting> allActiveShards; | |||
protected final Logger logger = LogManager.getLogger(this.getClass()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why protected?
+ indexMetadata.getNumberOfSearchOnlyReplicas() | ||
+ "], got [" | ||
+ routingNumberOfReplicas | ||
+ "]" | ||
); | ||
} else if (routingNumberOfReplicas != expectedReplicas) { | ||
// Just log if there's a mismatch but the index is search-only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to log here if this is an expected state?
@@ -425,8 +442,33 @@ public Builder initializeAsNew(IndexMetadata indexMetadata) { | |||
/** | |||
* Initializes an existing index. | |||
*/ | |||
|
|||
public Builder initializeAsRecovery(IndexMetadata indexMetadata) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about hte other cases here?
initializeAsFromCloseToOpen
initializeAsFromDangling
initializeAsFromOpenToClose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this check need to be in initializeEmpty instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see initializeEmpty
is generically called across those methods, let me update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now after thinking It's logical that these operations would need different behavior than recovery?. When reopening or importing an index, we should typically want all shards (primaries, regular replicas, and search replicas) to be initialized properly, not just search replicas, WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking back yes we can move this logic to common method initializeEmpty
, but at this there is an issue with _close
API when search replicas are enabled #15306 (comment). For this PR I will move and keep the code in initializeEmpty
.
null | ||
|
||
if (mdFile == null) { | ||
listener.onResponse(new CheckpointInfoResponse(indexShard.getLatestReplicationCheckpoint(), Collections.emptyMap(), null)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is so that SRs can become active and sync with remote before a primary has uploaded anything correct? I don't think we want to make the same change for other replicas that rely on primary initialization to complete first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point let me handle a case for isSearchOnly
. Something like } else if (indexShard.routingEntry().isSearchOnly()) {
.
Description
The primary goal is to allow users to designate an index as search-only allowing only to have the search only replicas running when enabled via an API call
(can be disabled as_searchonly/enable
)._searchonly/disable
With
for an index the process has Two-Phase Scale-Down with a temporary block for the duration of the scale-down operation and then explicitly replace it with a permanent block once all prerequisites (e.g., shard sync, flush, metadata updates) have been met._searchonly/enable
From #17299 (comment) using
_scale
API withsearch_only
set to true or false.Eliminates the need for users to manually invoke the
_remotestore/_restore
API to recover search-only replicas with_searchonly/enable
, has automatic recovery of search-only replicas from the remote store during cluster recovery. The default behavior is still honored in normal conditions https://opensearch.org/docs/latest/tuning-your-cluster/availability-and-recovery/remote-store/index/#restoring-from-a-backup. More testing scenarios added here Search only replicas (scale to zero) with Reader/Writer Separation #17299 (comment).To Do (Work on the cluster health, coming from [META] Reader/Writer Separation #15306 (comment) add this similar implementation)This PR also has updated cluster health code Search only replicas (scale to zero) with Reader/Writer Separation #17299 (comment).Related Issues
#16720 and part of #15306
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.