119140: pkg/sql: export sql.aggregated_livebytes metric for tenants r=jaylim-crl a=jaylim-crl
#### pkg/util/metric: support metrics removal from the metrics registry
Previously, once a metric had been added to the metrics registry, it stayed
registered forever, and there was no mechanism to remove it. For
multi-tenancy, we plan to implement a job that exports global metrics for
tenants (i.e. such metrics should only exist on one SQL node at any point in
time). Given that jobs can be cancelled and resumed on a different SQL node,
the only way to support this behavior is to remove metrics from the registry
when the job is no longer running. This commit adds support for doing so.
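As a rough illustration of the behavior described above, the following is a
minimal sketch of a registry that supports removal. The `Registry` and `Gauge`
types and their method names are hypothetical stand-ins written for this
example; they are not the actual pkg/util/metric API.

```go
package main

import (
	"fmt"
	"sync"
)

// Gauge is a hypothetical, minimal metric type used only for this sketch.
type Gauge struct {
	Name string

	mu    sync.Mutex
	value int64
}

// Update sets the gauge to v.
func (g *Gauge) Update(v int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.value = v
}

// Value returns the last value stored in the gauge.
func (g *Gauge) Value() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.value
}

// Registry is a hypothetical stand-in for a metrics registry that tracks
// metrics by name and allows them to be removed again.
type Registry struct {
	mu      sync.Mutex
	tracked map[string]*Gauge
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{tracked: make(map[string]*Gauge)}
}

// AddMetric registers g, returning an error if the name is already taken.
func (r *Registry) AddMetric(g *Gauge) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.tracked[g.Name]; ok {
		return fmt.Errorf("metric %q is already registered", g.Name)
	}
	r.tracked[g.Name] = g
	return nil
}

// RemoveMetric unregisters the metric with the given name. It reports
// whether a metric was actually removed.
func (r *Registry) RemoveMetric(name string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.tracked[name]; !ok {
		return false
	}
	delete(r.tracked, name)
	return true
}

// Contains reports whether a metric with the given name is registered.
func (r *Registry) Contains(name string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	_, ok := r.tracked[name]
	return ok
}
```

With removal available, a job that stops on one SQL node can unregister its
metric, and the same name can be registered again wherever the job resumes.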
Epic: none
Release note: None
#### pkg/server: add ApproximateTotalStats field to SpanStats proto message
Previously, the SpanStats proto message only kept track of the logical MVCC
stats in the TotalStats field. This is insufficient for the work that exposes
the aggregated livebytes as a metric for tenants, because the metric value
needs to take into account all replicas for a given range. To address that,
this commit adds a new ApproximateTotalStats field to the SpanStats proto
message, which represents the post-replication MVCC stats for the span.
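To make the distinction concrete, here is a small sketch of the arithmetic.
It assumes that "post-replication" means summing the stats of every replica of
each range, while the logical total counts each range once. The `replicaStats`
type and both helper functions are hypothetical and exist only for this
example; they do not reflect how SpanStats is actually computed.

```go
package main

// replicaStats is a hypothetical record of the live bytes reported by one
// replica of a range.
type replicaStats struct {
	RangeID   int64
	LiveBytes int64
}

// logicalLiveBytes counts each range once, using the first replica seen for
// that range. This mirrors the idea of logical MVCC stats.
func logicalLiveBytes(replicas []replicaStats) int64 {
	seen := make(map[int64]bool)
	var total int64
	for _, r := range replicas {
		if seen[r.RangeID] {
			continue
		}
		seen[r.RangeID] = true
		total += r.LiveBytes
	}
	return total
}

// approximateLiveBytes sums the live bytes of every replica, so a range with
// three replicas contributes roughly three times its logical size.
func approximateLiveBytes(replicas []replicaStats) int64 {
	var total int64
	for _, r := range replicas {
		total += r.LiveBytes
	}
	return total
}
```

Under these assumptions, two ranges of 100 and 50 live bytes, each with three
replicas, give a logical total of 150 bytes and a post-replication total of
450 bytes.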
Epic: none
Release note: None
#### pkg/sql: export sql.aggregated_livebytes metric for out-of-process tenants
Previously, in order to obtain livebytes metrics for tenants, one would need
to query such values via the KV servers, which can be problematic if we only
have access to the SQL servers. For example, in CockroachDB Cloud, only
metrics from the SQL servers are exported to end-users, and they are exported
directly from the cockroachdb process. It is not trivial to export an
additional subset of metrics from the KV servers filtered by tenant ID.
To address that, this commit exposes livebytes for tenants directly via an
aggregated metric on the SQL nodes. The aggregated metric will be updated
every 60 seconds by default, and will be exported via the existing MVCC
statistics update job. Unlike other job metrics, which are registered at
initialization time and stay registered forever, this aggregated metric is
tied to the lifespan of the job (i.e. it is only exported if the job is
running, and unexported otherwise).
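The lifecycle described above can be sketched as follows. This is a simplified
illustration, not the job's actual implementation: the `registry` type, the
`startExporter` function, and the `fetch` callback are all hypothetical, and
only the metric name and the 60-second default come from this commit.

```go
package main

import (
	"context"
	"sync"
	"time"
)

const (
	metricName      = "sql.aggregated_livebytes"
	defaultInterval = 60 * time.Second
)

// registry is a hypothetical name-to-value store standing in for a metrics
// registry that supports removal.
type registry struct {
	mu     sync.Mutex
	gauges map[string]int64
}

func newRegistry() *registry {
	return &registry{gauges: make(map[string]int64)}
}

func (r *registry) set(name string, v int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.gauges[name] = v
}

func (r *registry) remove(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.gauges, name)
}

func (r *registry) get(name string) (int64, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	v, ok := r.gauges[name]
	return v, ok
}

// startExporter exports the metric immediately, refreshes it on every tick,
// and removes it from the registry once the returned stop function is
// called. fetch is a hypothetical callback that returns the current
// aggregated livebytes value.
func startExporter(
	reg *registry, interval time.Duration, fetch func() int64,
) (stop func()) {
	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	reg.set(metricName, fetch()) // exported as soon as the job starts

	go func() {
		defer close(done)
		defer reg.remove(metricName) // unexported when the job stops
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				reg.set(metricName, fetch())
			}
		}
	}()

	// stop cancels the loop and waits until the metric has been removed.
	return func() {
		cancel()
		<-done
	}
}
```

Tying removal to a deferred call in the exporter loop means the metric
disappears whether the job is cancelled or stops for any other reason that
ends the loop, so the same name is free to be registered on another node.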
This feature is scoped to standalone SQL servers only, which, at the time of
writing, are only supported in CockroachDB Cloud. If we wanted to backport
this into 23.2, it should be straightforward, since the permanent upgrade
to insert the job is already in release-23.2.
Fixes: #119139
Epic: none
Release note (sql change): Out-of-process SQL servers will start exporting a
new sql.aggregated_livebytes metric. This metric gets updated once every 60
seconds by default, and its update interval can be configured via the
`tenant_global_metrics_exporter_interval` cluster setting.
Co-authored-by: Jay <[email protected]>