Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate experimental-snapshot-catchup-entries flag to snapshot-catchup-entries #19352

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 26 additions & 5 deletions server/embed/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,7 @@ var (
"experimental-bootstrap-defrag-threshold-megabytes": "bootstrap-defrag-threshold-megabytes",
"experimental-max-learners": "max-learners",
"experimental-memory-mlock": "memory-mlock",
"experimental-snapshot-catchup-entries": "snapshot-catchup-entries",
"experimental-compaction-sleep-interval": "compaction-sleep-interval",
"experimental-downgrade-check-time": "downgrade-check-time",
}
Expand Down Expand Up @@ -185,12 +186,21 @@ type Config struct {
// TODO: remove it in 3.7.
SnapshotCount uint64 `json:"snapshot-count"`

// SnapshotCatchUpEntries is the number of entries for a slow follower
// ExperimentalSnapshotCatchUpEntries is the number of entries for a slow follower
// to catch-up after compacting the raft storage entries.
// We expect the follower has a millisecond level latency with the leader.
// The max throughput is around 10K. Keep a 5K entries is enough for helping
// follower to catch up.
SnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
Comment on lines +194 to +196
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
// Deprecated in v3.6 and will be removed in v3.7.
// TODO: remove in v3.7.
// Note we made a mistake in https://github.com/etcd-io/etcd/pull/15033. The json tag
// `*-catch-up-*` isn't consistent with the command line flag `*-catchup-*`.
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls feel free to address this comment separately in release-3.6 (will be be cut later today my time) only if you want


// SnapshotCatchUpEntries is the number of entires for a slow follower
// to catch-up after compacting the raft storage entries.
// We expect the follower has a millisecond level latency with the leader.
// The max throughput is around 10K. Keep a 5K entries is enough for helping
// follower to catch up.
SnapshotCatchUpEntries uint64 `json:"snapshot-catchup-entries"`

// MaxSnapFiles is deprecated in v3.6 and will be decommissioned in v3.7.
// TODO: remove it in 3.7.
Expand Down Expand Up @@ -584,8 +594,9 @@ func NewConfig() *Config {

Name: DefaultName,

SnapshotCount: etcdserver.DefaultSnapshotCount,
SnapshotCatchUpEntries: etcdserver.DefaultSnapshotCatchUpEntries,
SnapshotCount: etcdserver.DefaultSnapshotCount,
ExperimentalSnapshotCatchUpEntries: etcdserver.DefaultSnapshotCatchUpEntries,
SnapshotCatchUpEntries: etcdserver.DefaultSnapshotCatchUpEntries,

MaxTxnOps: DefaultMaxTxnOps,
MaxRequestBytes: DefaultMaxRequestBytes,
Expand Down Expand Up @@ -876,7 +887,8 @@ func (cfg *Config) AddFlags(fs *flag.FlagSet) {
// TODO: delete in v3.7
fs.IntVar(&cfg.ExperimentalMaxLearners, "experimental-max-learners", membership.DefaultMaxLearners, "Sets the maximum number of learners that can be available in the cluster membership. Deprecated in v3.6 and will be decommissioned in v3.7. Use --max-learners instead.")
fs.IntVar(&cfg.MaxLearners, "max-learners", membership.DefaultMaxLearners, "Sets the maximum number of learners that can be available in the cluster membership.")
fs.Uint64Var(&cfg.SnapshotCatchUpEntries, "experimental-snapshot-catchup-entries", cfg.SnapshotCatchUpEntries, "Number of entries for a slow follower to catch up after compacting the raft storage entries.")
fs.Uint64Var(&cfg.ExperimentalSnapshotCatchUpEntries, "experimental-snapshot-catchup-entries", cfg.ExperimentalSnapshotCatchUpEntries, "Number of entries for a slow follower to catch up after compacting the raft storage entries. Deprecated in v3.6 and will be decommissioned in v3.7. Use --snapshot-catchup-entries instead.")
fs.Uint64Var(&cfg.SnapshotCatchUpEntries, "snapshot-catchup-entries", cfg.SnapshotCatchUpEntries, "Number of entries for a slow follower to catch up after compacting the raft storage entries.")

// unsafe
fs.BoolVar(&cfg.UnsafeNoFsync, "unsafe-no-fsync", false, "Disables fsync, unsafe, will cause data loss.")
Expand Down Expand Up @@ -925,6 +937,15 @@ func (cfg *configYAML) configFromFile(path string) error {
cfg.FlagsExplicitlySet[flg] = true
}

// attempt to fix a bug introduced in https://github.com/etcd-io/etcd/pull/15033
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch.

// both `experimental-snapshot-catch-up-entries` and `experimental-snapshot-catchup-entries` refer to the same field,
// map the YAML field "experimental-snapshot-catch-up-entries" to the flag "experimental-snapshot-catchup-entries".
if val, ok := cfgMap["experimental-snapshot-catch-up-entries"]; ok {
cfgMap["experimental-snapshot-catchup-entries"] = val
cfg.ExperimentalSnapshotCatchUpEntries = uint64(val.(float64))
cfg.FlagsExplicitlySet["experimental-snapshot-catchup-entries"] = true
}

getBoolFlagVal := func(flagName string) *bool {
flagVal, ok := cfgMap[flagName]
if !ok {
Expand Down
5 changes: 5 additions & 0 deletions server/etcdmain/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ var (
"experimental-bootstrap-defrag-threshold-megabytes": "--experimental-bootstrap-defrag-threshold-megabytes is deprecated in v3.6 and will be decommissioned in v3.7. Use '--bootstrap-defrag-threshold-megabytes' instead.",
"experimental-max-learners": "--experimental-max-learners is deprecated in v3.6 and will be decommissioned in v3.7. Use '--max-learners' instead.",
"experimental-memory-mlock": "--experimental-memory-mlock is deprecated in v3.6 and will be decommissioned in v3.7. Use '--memory-mlock' instead.",
"experimental-snapshot-catchup-entries": "--experimental-snapshot-catchup-entries is deprecated in v3.6 and will be decommissioned in v3.7. Use '--snapshot-catchup-entries' instead.",
"experimental-compaction-sleep-interval": "--experimental-compaction-sleep-interval is deprecated in v3.6 and will be decommissioned in v3.7. Use 'compaction-sleep-interval' instead.",
"experimental-downgrade-check-time": "--experimental-downgrade-check-time is deprecated in v3.6 and will be decommissioned in v3.7. Use '--downgrade-check-time' instead.",
}
Expand Down Expand Up @@ -211,6 +212,10 @@ func (cfg *config) parse(arguments []string) error {
cfg.ec.MemoryMlock = cfg.ec.ExperimentalMemoryMlock
}

if cfg.ec.FlagsExplicitlySet["experimental-snapshot-catchup-entries"] {
cfg.ec.SnapshotCatchUpEntries = cfg.ec.ExperimentalSnapshotCatchUpEntries
}

if cfg.ec.FlagsExplicitlySet["experimental-compaction-sleep-interval"] {
cfg.ec.CompactionSleepInterval = cfg.ec.ExperimentalCompactionSleepInterval
}
Expand Down
73 changes: 72 additions & 1 deletion server/etcdmain/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ import (
"go.etcd.io/etcd/pkg/v3/featuregate"
"go.etcd.io/etcd/pkg/v3/flags"
"go.etcd.io/etcd/server/v3/embed"
"go.etcd.io/etcd/server/v3/etcdserver"
"go.etcd.io/etcd/server/v3/features"
)

Expand Down Expand Up @@ -66,7 +67,7 @@ func TestConfigFileMemberFields(t *testing.T) {
MaxWALFiles uint `json:"max-wals"`
Name string `json:"name"`
SnapshotCount uint64 `json:"snapshot-count"`
SnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries"`
SnapshotCatchUpEntries uint64 `json:"snapshot-catchup-entries"`
ListenPeerURLs string `json:"listen-peer-urls"`
ListenClientURLs string `json:"listen-client-urls"`
ListenClientHTTPURLs string `json:"listen-client-http-urls"`
Expand Down Expand Up @@ -320,6 +321,73 @@ func TestConfigParsingMissedAdvertiseClientURLsFlag(t *testing.T) {
}
}

// TestExperimentalSnapshotCatchUpEntriesFlagMigration tests the migration from
// --experimental-snapshot-catch-up-entries to --snapshot-catch-up-entries
func TestExperimentalSnapshotCatchUpEntriesFlagMigration(t *testing.T) {
testCases := []struct {
name string
snapshotCatchUpEntries uint64
experimentalSnapshotCatchUpEntries uint64
wantErr bool
wantConfig uint64
}{
{
name: "default",
wantConfig: etcdserver.DefaultSnapshotCatchUpEntries,
},
{
name: "cannot set both experimental flag and non experimental flag",
experimentalSnapshotCatchUpEntries: 1000,
snapshotCatchUpEntries: 2000,
wantErr: true,
},
{
name: "can set experimental flag",
experimentalSnapshotCatchUpEntries: 1000,
wantConfig: 1000,
},
{
name: "can set non-experimental flag",
snapshotCatchUpEntries: 2000,
wantConfig: 2000,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
cmdLineArgs := []string{}
yc := struct {
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries,omitempty"`
SnapshotCatchUpEntries uint64 `json:"snapshot-catchup-entries,omitempty"`
}{}

if tc.snapshotCatchUpEntries > 0 {
cmdLineArgs = append(cmdLineArgs, fmt.Sprintf("--snapshot-catchup-entries=%d", tc.snapshotCatchUpEntries))
yc.SnapshotCatchUpEntries = tc.snapshotCatchUpEntries
}

if tc.experimentalSnapshotCatchUpEntries > 0 {
cmdLineArgs = append(cmdLineArgs, fmt.Sprintf("--experimental-snapshot-catchup-entries=%d", tc.experimentalSnapshotCatchUpEntries))
yc.ExperimentalSnapshotCatchUpEntries = tc.experimentalSnapshotCatchUpEntries
}

cfgFromCmdLine, errFromCmdLine, cfgFromFile, errFromFile := generateCfgsFromFileAndCmdLine(t, yc, cmdLineArgs)

if tc.wantErr {
if errFromCmdLine == nil || errFromFile == nil {
t.Fatal("expect parse error")
}
return
}
if errFromCmdLine != nil || errFromFile != nil {
t.Fatal("error parsing config")
}

require.Equal(t, tc.wantConfig, cfgFromCmdLine.ec.SnapshotCatchUpEntries)
require.Equal(t, tc.wantConfig, cfgFromFile.ec.SnapshotCatchUpEntries)
})
}
}

func TestConfigIsNewCluster(t *testing.T) {
tests := []struct {
state string
Expand Down Expand Up @@ -1273,6 +1341,7 @@ func TestConfigFileDeprecatedOptions(t *testing.T) {
ExperimentalWarningApplyDuration time.Duration `json:"experimental-warning-apply-duration,omitempty"`
ExperimentalBootstrapDefragThresholdMegabytes uint `json:"experimental-bootstrap-defrag-threshold-megabytes,omitempty"`
ExperimentalMaxLearners int `json:"experimental-max-learners,omitempty"`
ExperimentalSnapshotCatchUpEntries uint64 `json:"experimental-snapshot-catch-up-entries,omitempty"`
ExperimentalCompactionSleepInterval time.Duration `json:"experimental-compaction-sleep-interval,omitempty"`
ExperimentalDowngradeCheckTime time.Duration `json:"experimental-downgrade-check-time,omitempty"`
}
Expand All @@ -1299,6 +1368,7 @@ func TestConfigFileDeprecatedOptions(t *testing.T) {
ExperimentalWarningApplyDuration: 3 * time.Minute,
ExperimentalBootstrapDefragThresholdMegabytes: 100,
ExperimentalMaxLearners: 1,
ExperimentalSnapshotCatchUpEntries: 1000,
ExperimentalCompactionSleepInterval: 30 * time.Second,
ExperimentalDowngradeCheckTime: 1 * time.Minute,
},
Expand All @@ -1311,6 +1381,7 @@ func TestConfigFileDeprecatedOptions(t *testing.T) {
"experimental-warning-apply-duration": {},
"experimental-bootstrap-defrag-threshold-megabytes": {},
"experimental-max-learners": {},
"experimental-snapshot-catchup-entries": {},
"experimental-compaction-sleep-interval": {},
"experimental-downgrade-check-time": {},
},
Expand Down
2 changes: 2 additions & 0 deletions server/etcdmain/help.go
Original file line number Diff line number Diff line change
Expand Up @@ -330,6 +330,8 @@ Experimental feature:
--experimental-memory-mlock
Enable to enforce etcd pages (in particular bbolt) to stay in RAM. Deprecated in v3.6 and will be decommissioned in v3.7. Use '--memory-mlock' instead.
--experimental-snapshot-catchup-entries
Number of entries for a slow follower to catch up after compacting the raft storage entries. Deprecated in v3.6 and will be decommissioned in v3.7. Use '--snapshot-catchup-entries' instead.
--snapshot-catchup-entries
Number of entries for a slow follower to catch up after compacting the raft storage entries.
--experimental-stop-grpc-service-on-defrag
Enable etcd gRPC service to stop serving client requests on defragmentation. It's deprecated, and will be decommissioned in v3.7. Use '--feature-gates=StopGRPCServiceOnDefrag=true' instead.
Expand Down
4 changes: 2 additions & 2 deletions tests/framework/e2e/cluster.go
Original file line number Diff line number Diff line change
Expand Up @@ -229,7 +229,7 @@ func WithSnapshotCount(count uint64) EPClusterOption {
}

func WithSnapshotCatchUpEntries(count uint64) EPClusterOption {
return func(c *EtcdProcessClusterConfig) { c.ServerConfig.SnapshotCatchUpEntries = count }
return func(c *EtcdProcessClusterConfig) { c.ServerConfig.ExperimentalSnapshotCatchUpEntries = count }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The e2e test failure of TestRecoverSnapshotBackend is caused by this change. You need to update the line below to use the ExperimentalSnapshotCatchUpEntries

return config.ServerConfig.SnapshotCount + config.ServerConfig.SnapshotCatchUpEntries

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or it might be even better to set both SnapshotCatchUpEntries and ExperimentalSnapshotCatchUpEntries in this function WithSnapshotCatchUpEntries; then no need to update the line below,

return config.ServerConfig.SnapshotCount + config.ServerConfig.SnapshotCatchUpEntries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review, @ahrtr. I considered setting both the flags at test framework, but we explicitly disallow doing so (we validate for both flags being set), as that could allow misconfigurations in real world use-cases.

Copy link
Member

@ahrtr ahrtr Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am aware of that setting both flags are disallowed, but what we are talking about here is just test code.

Also note that the robustness test cases test both main and release branches, while ExperimentalSnapshotCatchUpEntries doesn't exist in stable release branches, so setting both fields here might be best choice.

}

func WithClusterSize(size int) EPClusterOption {
Expand Down Expand Up @@ -661,7 +661,7 @@ func (cfg *EtcdProcessClusterConfig) EtcdServerProcessConfig(tb testing.TB, i in
if defaultValue := defaultValues[flag]; value == "" || value == defaultValue {
continue
}
if flag == "experimental-snapshot-catchup-entries" && !CouldSetSnapshotCatchupEntries(execPath) {
if strings.HasSuffix(flag, "snapshot-catchup-entries") && !CouldSetSnapshotCatchupEntries(execPath) {
continue
}
args = append(args, fmt.Sprintf("--%s=%s", flag, value))
Expand Down
6 changes: 3 additions & 3 deletions tests/framework/e2e/cluster_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -81,22 +81,22 @@ func TestEtcdServerProcessConfig(t *testing.T) {
name: "CatchUpEntries",
config: NewConfig(WithSnapshotCatchUpEntries(100)),
expectArgsContain: []string{
"--experimental-snapshot-catchup-entries=100",
"--experimental-snapshot-catchup-entries=100", // this is needed as robust-ness tests still using experimental flag
},
mockBinaryVersion: &v3_5_14,
},
{
name: "CatchUpEntriesNoVersion",
config: NewConfig(WithSnapshotCatchUpEntries(100), WithVersion(LastVersion)),
expectArgsNotContain: []string{
"--experimental-snapshot-catchup-entries=100",
"--experimental-snapshot-catchup-entries=100", // this is needed as robust-ness tests still using experimental flag
},
},
{
name: "CatchUpEntriesOldVersion",
config: NewConfig(WithSnapshotCatchUpEntries(100), WithVersion(LastVersion)),
expectArgsNotContain: []string{
"--experimental-snapshot-catchup-entries=100",
"--experimental-snapshot-catchup-entries=100", // this is needed as robust-ness tests still using experimental flag
},
mockBinaryVersion: &v3_5_12,
},
Expand Down