Replace EnableNativeHistograms from TSDB config with PerTenant Limit #6718
…nableNativeHistogramPerUser limit Signed-off-by: Paurush Garg <[email protected]>
@@ -204,7 +201,6 @@ func (cfg *TSDBConfig) RegisterFlags(f *flag.FlagSet) {
  f.IntVar(&cfg.MaxExemplars, "blocks-storage.tsdb.max-exemplars", 0, "Deprecated, use maxExemplars in limits instead. If the MaxExemplars value in limits is set to zero, cortex will fallback on this value. This setting enables support for exemplars in TSDB and sets the maximum number that will be stored. 0 or less means disabled.")
  f.BoolVar(&cfg.MemorySnapshotOnShutdown, "blocks-storage.tsdb.memory-snapshot-on-shutdown", false, "True to enable snapshotting of in-memory TSDB data on disk when shutting down.")
  f.Int64Var(&cfg.OutOfOrderCapMax, "blocks-storage.tsdb.out-of-order-cap-max", tsdb.DefaultOutOfOrderCapMax, "[EXPERIMENTAL] Configures the maximum number of samples per chunk that can be out-of-order.")
- f.BoolVar(&cfg.EnableNativeHistograms, "blocks-storage.tsdb.enable-native-histograms", false, "[EXPERIMENTAL] True to enable native histogram.")
Is it still a breaking change if we keep the configuration name the same but move it to the per-tenant runtime config? I guess we don't have to rename it just to add the _per_user suffix.
That way existing users won't be impacted, as it can still be set globally.
Did you mean keep the same CLI flag?
Yeah. It is still kind of breaking, as it moves from the tsdb section to the limits section in the config file. But I guess it is fine for an experimental feature, as you said.
Thanks. Updated to keep the same config name.
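A minimal sketch of what the thread above settles on, assuming the CLI flag name is kept verbatim and only its owning struct changes. The `Limits` type here is a trimmed-down, hypothetical stand-in for the real limits struct, and `parseLimits` is an illustrative helper that does not exist in Cortex:

```go
package main

import (
	"flag"
	"fmt"
)

// Limits is a hypothetical, trimmed-down stand-in for the per-tenant limits struct.
type Limits struct {
	EnableNativeHistograms bool
}

// RegisterFlags registers the flag under the name that used to live on the
// TSDB config, so existing command lines keep working (assumption: the name
// is kept unchanged, as discussed in the review).
func (l *Limits) RegisterFlags(f *flag.FlagSet) {
	f.BoolVar(&l.EnableNativeHistograms, "blocks-storage.tsdb.enable-native-histograms", false,
		"[EXPERIMENTAL] True to enable native histogram.")
}

// parseLimits is an illustrative helper that parses args into a Limits value.
func parseLimits(args []string) (Limits, error) {
	var l Limits
	fs := flag.NewFlagSet("limits", flag.ContinueOnError)
	l.RegisterFlags(fs)
	err := fs.Parse(args)
	return l, err
}

func main() {
	l, err := parseLimits([]string{"-blocks-storage.tsdb.enable-native-histograms=true"})
	if err != nil {
		panic(err)
	}
	fmt.Println(l.EnableNativeHistograms)
}
```

The flag still sets a global default this way; the config file key moves sections, which is the remaining breaking part mentioned above.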
pkg/util/validation/limits.go
Outdated
@@ -257,6 +258,7 @@ func (l *Limits) RegisterFlags(f *flag.FlagSet) {
  f.IntVar(&l.MaxLocalSeriesPerMetric, "ingester.max-series-per-metric", 50000, "The maximum number of active series per metric name, per ingester. 0 to disable.")
  f.IntVar(&l.MaxGlobalSeriesPerUser, "ingester.max-global-series-per-user", 0, "The maximum number of active series per user, across the cluster before replication. 0 to disable. Supported only if -distributor.shard-by-all-labels is true.")
  f.IntVar(&l.MaxGlobalSeriesPerMetric, "ingester.max-global-series-per-metric", 0, "The maximum number of active series per metric name, across the cluster before replication. 0 to disable.")
+ f.BoolVar(&l.EnableNativeHistogramPerUser, "ingester.enable_native_histogram_per_user", false, "Flag to enable NativeHistograms per user.")
Can you add the Experimental tag?
Thanks. Added the experimental tag.
pkg/ingester/ingester.go
Outdated
- EnableNativeHistograms: i.cfg.BlocksStorageConfig.TSDB.EnableNativeHistograms,
  EnableOOONativeHistograms: true,
  EnableOverlappingCompaction: false, // Always let compactors handle overlapped blocks, e.g. OOO blocks.
+ EnableNativeHistograms: true, // Always enable Native Histograms
Maybe explain here that the gatekeeping is done through a per-tenant config at ingestion?
Thanks. Added.
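To illustrate the gatekeeping mentioned above: with the TSDB option always on, the decision moves to a per-tenant lookup at ingestion. The following is only a sketch; `tenantLimits`, `overrides`, and `nativeHistogramsEnabled` are hypothetical names, not the actual Cortex types, and the fallback-to-defaults behaviour is an assumption about how a per-tenant runtime config is usually resolved:

```go
package main

import "fmt"

// tenantLimits is a hypothetical subset of the per-tenant limits.
type tenantLimits struct {
	EnableNativeHistograms bool
}

// overrides resolves a limit for a tenant, falling back to the global
// defaults when the tenant has no entry in the runtime config.
type overrides struct {
	defaults  tenantLimits
	perTenant map[string]tenantLimits
}

// nativeHistogramsEnabled reports whether the given tenant may ingest
// native histogram samples.
func (o overrides) nativeHistogramsEnabled(userID string) bool {
	if l, ok := o.perTenant[userID]; ok {
		return l.EnableNativeHistograms
	}
	return o.defaults.EnableNativeHistograms
}

func main() {
	o := overrides{
		defaults: tenantLimits{EnableNativeHistograms: false},
		perTenant: map[string]tenantLimits{
			"tenant-a": {EnableNativeHistograms: true},
		},
	}
	fmt.Println(o.nativeHistogramsEnabled("tenant-a")) // tenant override applies
	fmt.Println(o.nativeHistogramsEnabled("tenant-b")) // falls back to the default
}
```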
pkg/ingester/ingester.go
Outdated
@@ -1347,7 +1347,7 @@ func (i *Ingester) Push(ctx context.Context, req *cortexpb.WriteRequest) (*corte
  return nil, wrapWithUser(err, userID)
  }

- if i.cfg.BlocksStorageConfig.TSDB.EnableNativeHistograms {
+ if i.limits.EnableNativeHistogramPerUser(userID) {
I think we need to return an error here, similar to Prometheus? What do you think?
https://github.com/prometheus/prometheus/blob/main/tsdb/head_append.go#L647
Try ingesting some native histogram samples into the current version of Cortex with the NH feature disabled and see what the behaviour is.
I think the current Cortex behaviour is that it counts the sample as discarded here and does not throw an error. But I'm still checking by running Cortex.
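A rough sketch of the "count as discarded, do not return an error" behaviour described above. Everything here is hypothetical (`sample`, `discardCounter`, `push`); the real ingester uses its own request types and a Prometheus counter for discarded samples:

```go
package main

import "fmt"

// sample is a hypothetical stand-in for an incoming sample.
type sample struct {
	isHistogram bool
}

// discardCounter counts discarded native histogram samples per tenant.
type discardCounter map[string]int

// push appends float samples unconditionally and native histogram samples
// only when the tenant has them enabled. Disabled histogram samples are
// counted as discarded and skipped; they do not fail the request.
func push(userID string, samples []sample, enabled func(string) bool, discarded discardCounter) (int, error) {
	allow := enabled(userID)
	appended := 0
	for _, s := range samples {
		if s.isHistogram && !allow {
			discarded[userID]++
			continue
		}
		appended++
	}
	return appended, nil
}

func main() {
	discarded := discardCounter{}
	disabled := func(string) bool { return false }
	batch := []sample{{isHistogram: false}, {isHistogram: true}, {isHistogram: false}}

	appended, err := push("tenant-a", batch, disabled, discarded)
	if err != nil {
		panic(err)
	}
	fmt.Println(appended, discarded["tenant-a"]) // prints "2 1"
}
```

Returning an error instead, as Prometheus does for its own appender, would surface the misconfiguration to the sender; the sketch follows the behaviour reported in the comment above rather than taking a position on which is preferable.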
Signed-off-by: Paurush Garg <[email protected]>
Signed-off-by: Paurush Garg <[email protected]>
…ng config name Signed-off-by: Paurush Garg <[email protected]>
thanks!
LGTM!
@@ -6839,7 +6850,7 @@ func TestIngester_UpdateLabelSetMetrics(t *testing.T) {
  require.NoError(t, os.Mkdir(chunksDir, os.ModePerm))
  require.NoError(t, os.Mkdir(blocksDir, os.ModePerm))

- i, err := prepareIngesterWithBlocksStorageAndLimits(t, cfg, limits, tenantLimits, blocksDir, reg, false)
+ i, err := prepareIngesterWithBlocksStorageAndLimits(t, cfg, limits, tenantLimits, blocksDir, reg)
Do we have a test case for when native histograms are disabled and a push should increment the discarded samples metric?
If not, can you add one to TestIngester_Push?
Thanks. This is awesome!
Replace EnableNativeHistograms from TSDB config with EnableNativeHistogramPerUser Limit
What this PR does:
Which issue(s) this PR fixes:
Fixes #
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]