Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate experimental-snapshot-catchup-entries flag to snapshot-catchup-entries #19352

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ajaysundark
Copy link
Contributor

Migrate experimental-snapshot-catchup-entries flag to snapshot-catchup-entries

Sub task from #19141

Note:

  1. This PR also handles a fix to an unintended bug from this feature PR, where the field-name was set as experimental-snapshot-catch-up-entries when cmd-line flag experimental-snapshot-catchup-entries was used.
  2. YAML field experimental-snapshot-catch-up-entries will set flag experimental-snapshot-catchup-entries correctly at FlagsExplicitlySet. This is required to mark / identify the field if used from 'config-file' as a deprecated flag usage.
  3. experimental-snapshot-catch-up-entries YAML field usage is maintained the same field-name for backward compability.

@k8s-ci-robot
Copy link

Hi @ajaysundark. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ajaysundark
Copy link
Contributor Author

/cc @siyuanfoundation @ahrtr

Copy link

codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.83%. Comparing base (8e0ea34) to head (30f96c0).
Report is 12 commits behind head on main.

Additional details and impacted files
Files with missing lines Coverage Δ
server/embed/config.go 79.96% <100.00%> (+0.18%) ⬆️
server/etcdmain/config.go 74.41% <100.00%> (+0.30%) ⬆️

... and 24 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19352      +/-   ##
==========================================
- Coverage   68.90%   68.83%   -0.08%     
==========================================
  Files         420      420              
  Lines       35723    35731       +8     
==========================================
- Hits        24616    24596      -20     
- Misses       9677     9712      +35     
+ Partials     1430     1423       -7     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8e0ea34...30f96c0. Read the comment docs.

@ahrtr
Copy link
Member

ahrtr commented Feb 6, 2025

/ok-to-test

@@ -910,6 +922,15 @@ func (cfg *configYAML) configFromFile(path string) error {
cfg.FlagsExplicitlySet[flg] = true
}

// attempt to fix a bug introduced in https://github.com/etcd-io/etcd/pull/15033
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch.

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ajaysundark, siyuanfoundation
Once this PR has been reviewed and has the lgtm label, please assign serathius for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@siyuanfoundation
Copy link
Contributor

siyuanfoundation commented Feb 6, 2025

Need to change in tests/framework/e2e/cluster.go

func WithSnapshotCatchUpEntries(count uint64) EPClusterOption {
--	return func(c *EtcdProcessClusterConfig) { c.ServerConfig.SnapshotCatchUpEntries = count }
++     return func(c *EtcdProcessClusterConfig) { c.ServerConfig.ExperimentalSnapshotCatchUpEntries = count }
}

to keep using the --experimental-snapshot-catchup-entries in e2e and robustness test.

We do not want to use the non-experimental flag yet in those tests because of
Opened kubernetes/kubernetes#125031

@siyuanfoundation siyuanfoundation self-requested a review February 6, 2025 17:25
@ajaysundark ajaysundark force-pushed the migrate-experimental-snapshot-catch-up-entries branch from da2d4d8 to 4605e40 Compare February 6, 2025 19:06
@ajaysundark ajaysundark force-pushed the migrate-experimental-snapshot-catch-up-entries branch 2 times, most recently from d39df53 to 09c560c Compare February 6, 2025 19:39
@ajaysundark ajaysundark force-pushed the migrate-experimental-snapshot-catch-up-entries branch from d11c45d to 30f96c0 Compare February 6, 2025 19:57
@siyuanfoundation
Copy link
Contributor

/retest

@ajaysundark
Copy link
Contributor Author

This is blocked by #19354 as we have test framework dependency on the ExperimentalSnapshotCatchupEntries.

For example:

TestMixVersionsSnapshotByMockingPartition would need ExperimentalSnapshotCatchupEntries to be set here: framework/e2e/cluster.go#L232 whereas TestRecoverSnapshotBackend require this to be SnapshotCatchupEntries.

@@ -910,6 +922,15 @@ func (cfg *configYAML) configFromFile(path string) error {
cfg.FlagsExplicitlySet[flg] = true
}

// attempt to fix a bug introduced in https://github.com/etcd-io/etcd/pull/15033
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch.

Comment on lines +194 to +196
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
// Note we made a mistake in https://github.com/etcd-io/etcd/pull/15033. The json tag
// `*-catch-up-*` isn't consistent with the command line flag `*-catchup-*`.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls feel free to address this comment separately in release-3.6 (will be be cut later today my time) only if you want

@ahrtr
Copy link
Member

ahrtr commented Feb 7, 2025

/retest

@ahrtr
Copy link
Member

ahrtr commented Feb 7, 2025

/test pull-etcd-e2e-amd64

@ahrtr
Copy link
Member

ahrtr commented Feb 7, 2025

/test pull-etcd-e2e-amd64

@k8s-ci-robot
Copy link

@ajaysundark: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-etcd-e2e-amd64 30f96c0 link true /test pull-etcd-e2e-amd64

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@@ -229,7 +229,7 @@ func WithSnapshotCount(count uint64) EPClusterOption {
}

func WithSnapshotCatchUpEntries(count uint64) EPClusterOption {
return func(c *EtcdProcessClusterConfig) { c.ServerConfig.SnapshotCatchUpEntries = count }
return func(c *EtcdProcessClusterConfig) { c.ServerConfig.ExperimentalSnapshotCatchUpEntries = count }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The e2e test failure of TestRecoverSnapshotBackend is caused by this change. You need to update the line below to use the ExperimentalSnapshotCatchUpEntries

return config.ServerConfig.SnapshotCount + config.ServerConfig.SnapshotCatchUpEntries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or it might be even better to set both SnapshotCatchUpEntries and ExperimentalSnapshotCatchUpEntries in this function WithSnapshotCatchUpEntries; then no need to update the line below,

return config.ServerConfig.SnapshotCount + config.ServerConfig.SnapshotCatchUpEntries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review, @ahrtr. I considered setting both the flags at test framework, but we explicitly disallow doing so (we validate for both flags being set), as that could allow misconfigurations in real world use-cases.

Copy link
Member

@ahrtr ahrtr Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am aware of that setting both flags are disallowed, but what we are talking about here is just test code.

Also note that the robustness test cases test both main and release branches, while ExperimentalSnapshotCatchUpEntries doesn't exist in stable release branches, so setting both fields here might be best choice.

@ahrtr
Copy link
Member

ahrtr commented Feb 7, 2025

Note that I will cut branch release-3.6 about 2 hours later. If the comments can be resolved before that time, then it should be able to be merged before cutting branch; otherwise, you will have to resubmit the PR against the new release-3.6.

@ahrtr
Copy link
Member

ahrtr commented Feb 7, 2025

otherwise, you will have to resubmit the PR against the new release-3.6.

This might not be correct. We can still merge this to main, then backport to 3.6. Eventually we will cleanup all experimental flags in main.

@k8s-ci-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants