diff --git a/src/current/_includes/v25.1/essential-alerts.md b/src/current/_includes/v25.1/essential-alerts.md
index f71d81dd2f5..dd311c760c6 100644
--- a/src/current/_includes/v25.1/essential-alerts.md
+++ b/src/current/_includes/v25.1/essential-alerts.md
@@ -24,7 +24,7 @@ A node with a high CPU utilization, an *overloaded* node, has a limited ability
- A persistently high CPU utilization of all nodes in a CockroachDB cluster suggests the current compute resources may be insufficient to support the user workload's concurrency requirements. If confirmed, the number of processors (vCPUs or cores) in the CockroachDB cluster needs to be adjusted to sustain the required level of workload concurrency. For a prompt resolution, either add cluster nodes or throttle the workload concurrency, for example, by reducing the number of concurrent connections to not exceed 4 active statements per vCPU or core.
-### Hot node (hot spot)
+### Hot node (hotspot)
Unbalanced utilization of CockroachDB nodes in a cluster may negatively affect the cluster's performance and stability, with some nodes getting overloaded while others remain relatively underutilized.
@@ -38,7 +38,7 @@ Unbalanced utilization of CockroachDB nodes in a cluster may negatively affect t
**Action**
-- Refer to [Hot spots]({% link {{ page.version.version }}/performance-recipes.md %}#hot-spots).
+- Refer to [Understand hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
### Node memory utilization
diff --git a/src/current/_includes/v25.1/essential-metrics.md b/src/current/_includes/v25.1/essential-metrics.md
index 41039d31a75..a41c8768d3b 100644
--- a/src/current/_includes/v25.1/essential-metrics.md
+++ b/src/current/_includes/v25.1/essential-metrics.md
@@ -99,7 +99,7 @@ The **Usage** column explains why each metric is important to visualize in a cus
| ranges.underreplicated | ranges.underreplicated | Number of ranges with fewer live replicas than the replication target | This metric is an indicator of [replication issues]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#replication-issues). It shows whether the cluster has data that is not conforming to resilience goals. The next step is to determine the corresponding database object, such as the table or index, of these under-replicated ranges and whether the under-replication is temporarily expected. Use the statement `SELECT table_name, index_name FROM [SHOW RANGES WITH INDEXES] WHERE range_id = {id of under-replicated range};`|
| ranges.unavailable | ranges.unavailable | Number of ranges with fewer live replicas than needed for quorum | This metric is an indicator of [replication issues]({% link {{ page.version.version }}/cluster-setup-troubleshooting.md %}#replication-issues). It shows whether the cluster is unhealthy and can impact workload. If an entire range is unavailable, then it will be unable to process queries. |
| queue.replicate.replacedecommissioningreplica.error | {% if include.deployment == 'self-hosted' %}queue.replicate.replacedecommissioningreplica.error.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of failed decommissioning replica replacements processed by the replicate queue | Refer to [Decommission the node]({% link {{ page.version.version }}/node-shutdown.md %}?filters=decommission#decommission-the-node). |
-| range.splits | {% if include.deployment == 'self-hosted' %}range.splits.total |{% elsif include.deployment == 'advanced' %}range.splits |{% endif %} Number of range splits | This metric indicates how fast a workload is scaling up. Spikes can indicate resource hot spots since the [split heuristic is based on QPS]({% link {{ page.version.version }}/load-based-splitting.md %}#control-load-based-splitting-threshold). To understand whether hot spots are an issue and with which tables and indexes they are occurring, correlate this metric with other metrics such as CPU usage, such as `sys.cpu.combined.percent-normalized`, or use the [**Hot Ranges** page]({% link {{ page.version.version }}/ui-hot-ranges-page.md %}). |
+| range.splits | {% if include.deployment == 'self-hosted' %}range.splits.total |{% elsif include.deployment == 'advanced' %}range.splits |{% endif %} Number of range splits | This metric indicates how fast a workload is scaling up. Spikes can indicate resource [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) since the [split heuristic is based on QPS]({% link {{ page.version.version }}/load-based-splitting.md %}#control-load-based-splitting-threshold). To understand whether hotspots are an issue and with which tables and indexes they are occurring, correlate this metric with other metrics, such as CPU usage (for example, `sys.cpu.combined.percent-normalized`), or use the [**Hot Ranges** page]({% link {{ page.version.version }}/ui-hot-ranges-page.md %}). |
| range.merges | {% if include.deployment == 'self-hosted' %}range.merges.count |{% elsif include.deployment == 'advanced' %}NOT AVAILABLE |{% endif %} Number of range merges | This metric indicates how fast a workload is scaling down. Merges are Cockroach's [optimization for performance](architecture/distribution-layer.html#range-merges). This metric indicates that there have been deletes in the workload. |
## SQL
diff --git a/src/current/_includes/v25.1/performance/reduce-hot-spots.md b/src/current/_includes/v25.1/performance/reduce-hotspots.md
similarity index 92%
rename from src/current/_includes/v25.1/performance/reduce-hot-spots.md
rename to src/current/_includes/v25.1/performance/reduce-hotspots.md
index 4d7b601e33d..799fed761b8 100644
--- a/src/current/_includes/v25.1/performance/reduce-hot-spots.md
+++ b/src/current/_includes/v25.1/performance/reduce-hotspots.md
@@ -5,7 +5,7 @@
- Benefits of increasing normalization:
- Can improve performance for write-heavy workloads. This is because, with increased normalization, a given business fact must be written to one place rather than to multiple places.
- - Allows separate transactions to modify related underlying data without causing [contention](#transaction-contention).
+ - Allows separate transactions to modify related underlying data without causing [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention).
- Reduces the chance of data inconsistency, since a given business fact must be written only to one place.
- Reduces or eliminates data redundancy.
- Uses less disk space.
@@ -24,9 +24,9 @@
- If you are working with a table that **must** be indexed on sequential keys, consider using [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}). For details about the mechanics and performance improvements of hash-sharded indexes in CockroachDB, see the blog post [Hash Sharded Indexes Unlock Linear Scaling for Sequential Workloads](https://www.cockroachlabs.com/blog/hash-sharded-indexes-unlock-linear-scaling-for-sequential-workloads/). As part of this, we recommend doing thorough performance testing with and without hash-sharded indexes to see which works best for your application.
-- To avoid read hot spots:
+- To avoid read hotspots:
- - Increase data distribution, which will allow for more ranges. The hot spot exists because the data being accessed is all co-located in one range.
+ - Increase data distribution, which will allow for more ranges. The hotspot exists because the data being accessed is all co-located in one range.
- Increase [load balancing]({% link {{ page.version.version }}/recommended-production-settings.md %}#load-balancing) across more nodes in the same range. Most transactional reads must go to the leaseholder in CockroachDB, which means that opportunities for load balancing over replicas are minimal.
However, the following features do permit load balancing over replicas:
diff --git a/src/current/_includes/v25.1/performance/use-hash-sharded-indexes.md b/src/current/_includes/v25.1/performance/use-hash-sharded-indexes.md
index ca6132d8de6..314b0c24f5f 100644
--- a/src/current/_includes/v25.1/performance/use-hash-sharded-indexes.md
+++ b/src/current/_includes/v25.1/performance/use-hash-sharded-indexes.md
@@ -1 +1 @@
-We [discourage indexing on sequential keys]({% link {{ page.version.version }}/schema-design-indexes.md %}#best-practices). If a table **must** be indexed on sequential keys, use [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}). Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) and improving write performance on sequentially-keyed indexes at a small cost to read performance.
\ No newline at end of file
+We [discourage indexing on sequential keys]({% link {{ page.version.version }}/schema-design-indexes.md %}#best-practices). If a table **must** be indexed on sequential keys, use [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}). Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) and improving write performance on sequentially-keyed indexes at a small cost to read performance.
\ No newline at end of file
diff --git a/src/current/_includes/v25.1/sql/range-splits.md b/src/current/_includes/v25.1/sql/range-splits.md
index a612774afc0..be16d064f5d 100644
--- a/src/current/_includes/v25.1/sql/range-splits.md
+++ b/src/current/_includes/v25.1/sql/range-splits.md
@@ -2,6 +2,6 @@ CockroachDB breaks data into ranges. By default, CockroachDB attempts to keep ra
However, there are reasons why you may want to perform manual splits on the ranges that store tables or indexes:
-- When a table only consists of a single range, all writes and reads to the table will be served by that range's [leaseholder]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases). If a table only holds a small amount of data but is serving a large amount of traffic, load distribution can become unbalanced and a [hot spot]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) can occur. Splitting the table's ranges manually can allow the load on the table to be more evenly distributed across multiple nodes. For tables consisting of more than a few ranges, load will naturally be distributed across multiple nodes and this will not be a concern.
+- When a table only consists of a single range, all writes and reads to the table will be served by that range's [leaseholder]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases). If a table only holds a small amount of data but is serving a large amount of traffic, load distribution can become unbalanced and a [hotspot]({% link {{ page.version.version }}/understand-hotspots.md %}) can occur. Splitting the table's ranges manually can allow the load on the table to be more evenly distributed across multiple nodes. For tables consisting of more than a few ranges, load will naturally be distributed across multiple nodes and this will not be a concern.
-- When a table is created, it will only consist of a single range. If you know that a new table will immediately receive significant write traffic, you may want to preemptively split the table based on the expected distribution of writes before applying the load. This can help avoid reduced workload performance that results when automatic splits are unable to keep up with write traffic and a [hot spot]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) occurs.
+- When a table is created, it will only consist of a single range. If you know that a new table will immediately receive significant write traffic, you may want to preemptively split the table based on the expected distribution of writes before applying the load. This can help avoid reduced workload performance that results when automatic splits are unable to keep up with write traffic and a [hotspot]({% link {{ page.version.version }}/understand-hotspots.md %}) occurs.
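To make the pre-split concrete, here is a minimal sketch; the table name and split values are hypothetical and should be chosen to match the expected key distribution of your workload.

```sql
-- Hypothetical example: pre-split a new table at anticipated key boundaries
-- before applying write traffic, so load is spread across multiple ranges.
CREATE TABLE user_events (
    user_id INT,
    event_time TIMESTAMPTZ,
    payload STRING,
    PRIMARY KEY (user_id, event_time)
);

-- Split the table's ranges at chosen user_id boundaries.
ALTER TABLE user_events SPLIT AT VALUES (1000), (2000), (3000);

-- Inspect the resulting ranges.
SHOW RANGES FROM TABLE user_events;
```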
diff --git a/src/current/_includes/v25.1/sql/use-the-default-transaction-priority.md b/src/current/_includes/v25.1/sql/use-the-default-transaction-priority.md
index f98742dd7c7..0a98718ff14 100644
--- a/src/current/_includes/v25.1/sql/use-the-default-transaction-priority.md
+++ b/src/current/_includes/v25.1/sql/use-the-default-transaction-priority.md
@@ -1,3 +1,3 @@
Cockroach Labs recommends leaving the transaction priority at the default setting in almost all cases. Changing the transaction priority to `HIGH` in particular can lead to difficult-to-debug interactions with other transactions executing on the system.
-If you are setting a transaction priority to avoid [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) or [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots), or to [get better query performance]({% link {{ page.version.version }}/make-queries-fast.md %}), it is usually a sign that you need to update your [schema design]({% link {{ page.version.version }}/schema-design-database.md %}) and/or review the data access patterns of your workload.
+If you are setting a transaction priority to avoid [contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) or [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}), or to [get better query performance]({% link {{ page.version.version }}/make-queries-fast.md %}), it is usually a sign that you need to update your [schema design]({% link {{ page.version.version }}/schema-design-database.md %}) and/or review the data access patterns of your workload.
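For reference, the sketch below shows what setting an explicit priority looks like; per the guidance above, the default priority is almost always the right choice, and the table and statements are hypothetical.

```sql
-- Explicitly setting a transaction priority (generally discouraged; shown for reference only).
BEGIN TRANSACTION PRIORITY HIGH;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

-- Equivalent form inside an open transaction:
BEGIN;
SET TRANSACTION PRIORITY HIGH;
-- ... statements ...
COMMIT;
```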
diff --git a/src/current/_includes/v25.1/zone-configs/variables.md b/src/current/_includes/v25.1/zone-configs/variables.md
index 37008425787..4fffa6198f0 100644
--- a/src/current/_includes/v25.1/zone-configs/variables.md
+++ b/src/current/_includes/v25.1/zone-configs/variables.md
@@ -2,7 +2,7 @@ Variable | Description
------|------------
`range_min_bytes` | The minimum size, in bytes, for a range of data in the zone. When a range is less than this size, CockroachDB will merge it with an adjacent range.<br><br>**Default:** `134217728` (128 MiB)
`range_max_bytes` | The maximum size, in bytes, for a [range]({{link_prefix}}architecture/glossary.html#architecture-range) of data in the zone. When a range reaches this size, CockroachDB will [split it]({{link_prefix}}architecture/distribution-layer.html#range-splits) into two ranges.<br><br>**Default:** `536870912` (512 MiB)
-`gc.ttlseconds` | The number of seconds overwritten [MVCC values]({{link_prefix}}architecture/storage-layer.html#mvcc) will be retained before [garbage collection]({{link_prefix}}architecture/storage-layer.html#garbage-collection).<br><br>**Default:** `14400` (4 hours)<br><br>Smaller values can save disk space and improve performance if values are frequently overwritten or for queue-like workloads. The smallest value we regularly test is `600` (10 minutes); smaller values are unlikely to be beneficial because of the frequency with which GC runs. If you use [non-scheduled incremental backups](take-full-and-incremental-backups.html#garbage-collection-and-backups), the GC TTL must be greater than the interval between incremental backups. Otherwise, your incremental backups will fail with [the error message `protected ts verification error`](common-errors.html#protected-ts-verification-error). To avoid this problem, we recommend using [scheduled backups](create-schedule-for-backup.html) instead, which automatically [use protected timestamps](create-schedule-for-backup.html#protected-timestamps-and-scheduled-backups) to ensure they succeed.<br><br>Larger values increase the interval allowed for [`AS OF SYSTEM TIME`](as-of-system-time.html) queries and allow for less frequent incremental backups. The largest value we regularly test is `90000` (25 hours). Increasing the GC TTL is not meant to be a solution for long-term retention of history; for that you should handle versioning in the [schema design at the application layer](schema-design-overview.html). Setting the GC TTL too high can cause problems if the retained versions of a single row approach the [maximum range size](#range-max-bytes). This is important because all versions of a row are stored in a single range that never [splits](architecture/distribution-layer.html#range-splits).
+`gc.ttlseconds` | The number of seconds overwritten [MVCC values]({{link_prefix}}architecture/storage-layer.html#mvcc) will be retained before [garbage collection]({{link_prefix}}architecture/storage-layer.html#garbage-collection).<br><br>**Default:** `14400` (4 hours)<br><br>Smaller values can save disk space and improve performance if values are frequently overwritten or for [queue-like workloads]({{link_prefix}}understand-hotspots.html#queueing-hotspot). The smallest value we regularly test is `600` (10 minutes); smaller values are unlikely to be beneficial because of the frequency with which GC runs. If you use [non-scheduled incremental backups](take-full-and-incremental-backups.html#garbage-collection-and-backups), the GC TTL must be greater than the interval between incremental backups. Otherwise, your incremental backups will fail with [the error message `protected ts verification error`](common-errors.html#protected-ts-verification-error). To avoid this problem, we recommend using [scheduled backups](create-schedule-for-backup.html) instead, which automatically [use protected timestamps](create-schedule-for-backup.html#protected-timestamps-and-scheduled-backups) to ensure they succeed.<br><br>Larger values increase the interval allowed for [`AS OF SYSTEM TIME`](as-of-system-time.html) queries and allow for less frequent incremental backups. The largest value we regularly test is `90000` (25 hours). Increasing the GC TTL is not meant to be a solution for long-term retention of history; for that you should handle versioning in the [schema design at the application layer](schema-design-overview.html). Setting the GC TTL too high can cause problems if the retained versions of a single row approach the [maximum range size](#range-max-bytes). This is important because all versions of a row are stored in a single range that never [splits](architecture/distribution-layer.html#range-splits).
`num_replicas` | The number of replicas in the zone, also called the "replication factor".<br><br>**Default:** `3`<br><br>For the `system` database and `.meta`, `.liveness`, and `.system` ranges, the default value is `5`.<br><br>For [multi-region databases configured to survive region failures]({% link {{ page.version.version }}/multiregion-survival-goals.md %}#survive-region-failures), the default value is `5`; this will include both [voting](#num_voters) and [non-voting replicas]({% link {{ page.version.version }}/architecture/replication-layer.md %}#non-voting-replicas).
`constraints` | An array of required (`+`) and/or prohibited (`-`) constraints influencing the location of replicas. See [Types of Constraints]({% link {{ page.version.version }}/configure-replication-zones.md %}#types-of-constraints) and [Scope of Constraints]({% link {{ page.version.version }}/configure-replication-zones.md %}#scope-of-constraints) for more details.<br><br>To prevent hard-to-detect typos, constraints placed on [store attributes and node localities]({% link {{ page.version.version }}/configure-replication-zones.md %}#descriptive-attributes-assigned-to-nodes) must match the values passed to at least one node in the cluster. If not, an error is signalled. To prevent this error, make sure at least one active node is configured to match the constraint. For example, apply `constraints = '[+region=west]'` only if you had set `--locality=region=west` for at least one node while starting the cluster.<br><br>**Default:** No constraints, with CockroachDB locating each replica on a unique node and attempting to spread replicas evenly across localities.
`lease_preferences` | An ordered list of required and/or prohibited constraints influencing the location of [leaseholders]({% link {{ page.version.version }}/architecture/glossary.md %}#architecture-leaseholder). Whether each constraint is required or prohibited is expressed with a leading `+` or `-`, respectively. Note that lease preference constraints do not have to be shared with the `constraints` field. For example, it's valid for your configuration to define a `lease_preferences` field that does not reference any values from the `constraints` field. It's also valid to define a `lease_preferences` field with no `constraints` field at all.<br><br>If the first preference cannot be satisfied, CockroachDB will attempt to satisfy the second preference, and so on. If none of the preferences can be met, the lease will be placed using the default lease placement algorithm, which is to base lease placement decisions on how many leases each node already has, trying to make all the nodes have around the same amount.<br><br>Each value in the list can include multiple constraints. For example, the list `[[+zone=us-east-1b, +ssd], [+zone=us-east-1a], [+zone=us-east-1c, +ssd]]` means "prefer nodes with an SSD in `us-east-1b`, then any nodes in `us-east-1a`, then nodes in `us-east-1c` with an SSD."<br><br>For a usage example, see [Constrain leaseholders to specific availability zones]({% link {{ page.version.version }}/configure-replication-zones.md %}#constrain-leaseholders-to-specific-availability-zones).<br><br>**Default**: No lease location preferences are applied if this field is not specified.
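As an illustration of how these variables are applied, the following sketch lowers `gc.ttlseconds` for a hypothetical queue-like table; the table name and TTL value are assumptions.

```sql
-- Hypothetical example: lower the GC TTL for a queue-like table whose rows
-- are frequently overwritten or deleted.
ALTER TABLE job_queue CONFIGURE ZONE USING gc.ttlseconds = 600;

-- List zone configurations to verify the change.
SHOW ZONE CONFIGURATIONS;
```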
diff --git a/src/current/v25.1/admission-control.md b/src/current/v25.1/admission-control.md
index 28610579ec4..9e138bae6c4 100644
--- a/src/current/v25.1/admission-control.md
+++ b/src/current/v25.1/admission-control.md
@@ -18,7 +18,7 @@ For CPU, different types of usage are queued differently based on priority to al
For storage IO, the goal is to prevent the [storage layer's log-structured merge tree]({% link {{ page.version.version }}/architecture/storage-layer.md %}#log-structured-merge-trees) (LSM) from experiencing high [read amplification]({% link {{ page.version.version }}/architecture/storage-layer.md %}#read-amplification), which slows down reads, while also maintaining the ability to absorb bursts of writes.
-Admission control works on a per-[node]({% link {{ page.version.version }}/architecture/overview.md %}#node) basis, since even though a large CockroachDB cluster may be well-provisioned as a whole, individual nodes are stateful and may experience performance [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+Admission control works on a per-[node]({% link {{ page.version.version }}/architecture/overview.md %}#node) basis, since even though a large CockroachDB cluster may be well-provisioned as a whole, individual nodes are stateful and may experience performance [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
For more details about how the admission control system works, see:
@@ -27,7 +27,7 @@ For more details about how the admission control system works, see:
## Use cases for admission control
-A well-provisioned CockroachDB cluster may still encounter performance bottlenecks at the node level, as stateful nodes can develop [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) that last until the cluster rebalances itself. When hot spots occur, they should not cause failures or degraded performance for important work.
+A well-provisioned CockroachDB cluster may still encounter performance bottlenecks at the node level, as stateful nodes can develop [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) that last until the cluster rebalances itself. When hotspots occur, they should not cause failures or degraded performance for important work.
This is particularly important for CockroachDB {{ site.data.products.standard }} and CockroachDB {{ site.data.products.basic }}, where one user tenant cluster experiencing high load should not degrade the performance or availability of a different, isolated tenant cluster running on the same host.
diff --git a/src/current/v25.1/common-issues-to-monitor.md b/src/current/v25.1/common-issues-to-monitor.md
index c0cb43c74e7..60595f5c993 100644
--- a/src/current/v25.1/common-issues-to-monitor.md
+++ b/src/current/v25.1/common-issues-to-monitor.md
@@ -27,7 +27,7 @@ Provision enough CPU to support your operational and workload concurrency requir
| Category | Recommendations |
|----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| CPU | - {{ cpu_recommendation_minimum | strip_newlines }}<br>- {{ cpu_recommendation_maximum | strip_newlines }}<br>- Use larger VMs to handle temporary workload spikes and processing hot spots.<br>- Use connection pooling to manage workload concurrency. {% include {{ page.version.version }}/prod-deployment/prod-guidance-connection-pooling.md %} For more details, refer to [Size connection pools]({% link {{ page.version.version }}/connection-pooling.md %}#size-connection-pools).<br>- For additional CPU recommendations, refer to [Recommended Production Settings]({% link {{ page.version.version }}/recommended-production-settings.md %}#sizing). |
+| CPU | - {{ cpu_recommendation_minimum | strip_newlines }}<br>- {{ cpu_recommendation_maximum | strip_newlines }}<br>- Use larger VMs to handle temporary workload spikes and processing [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).<br>- Use connection pooling to manage workload concurrency. {% include {{ page.version.version }}/prod-deployment/prod-guidance-connection-pooling.md %} For more details, refer to [Size connection pools]({% link {{ page.version.version }}/connection-pooling.md %}#size-connection-pools).<br>- For additional CPU recommendations, refer to [Recommended Production Settings]({% link {{ page.version.version }}/recommended-production-settings.md %}#sizing). |
### CPU monitoring
diff --git a/src/current/v25.1/hash-sharded-indexes.md b/src/current/v25.1/hash-sharded-indexes.md
index b1ab5c204f9..9ee74d592a8 100644
--- a/src/current/v25.1/hash-sharded-indexes.md
+++ b/src/current/v25.1/hash-sharded-indexes.md
@@ -1,11 +1,11 @@
---
title: Index Sequential Keys with Hash-sharded Indexes
-summary: Hash-sharded indexes can eliminate single-range hot spots and improve write performance on sequentially-keyed indexes at a small cost to read performance
+summary: Hash-sharded indexes can eliminate single-range hotspots and improve write performance on sequentially-keyed indexes at a small cost to read performance
toc: true
docs_area: develop
---
-If you are working with a table that must be indexed on sequential keys, you should use **hash-sharded indexes**. Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range hot spots and improving write performance on sequentially-keyed indexes at a small cost to read performance.
+If you are working with a table that must be indexed on sequential keys, you should use **hash-sharded indexes**. Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range hotspots and improving write performance on sequentially-keyed indexes at a small cost to read performance.
{{site.data.alerts.callout_info}}
Hash-sharded indexes are an implementation of hash partitioning, not hash indexing.
@@ -15,7 +15,7 @@ Hash-sharded indexes are an implementation of hash partitioning, not hash indexi
### Overview
-CockroachDB automatically splits ranges of data in [the key-value store]({% link {{ page.version.version }}/architecture/storage-layer.md %}) based on [the size of the range]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits) and on [the load streaming to the range]({% link {{ page.version.version }}/load-based-splitting.md %}). To split a range based on load, the system looks for a point in the range that evenly divides incoming traffic. If the range is indexed on a column of data that is sequential in nature (e.g., [an ordered sequence]({% link {{ page.version.version }}/sql-faqs.md %}#what-are-the-differences-between-uuid-sequences-and-unique_rowid) or a series of increasing, non-repeating [`TIMESTAMP`s](timestamp.html)), then all incoming writes to the range will be the last (or first) item in the index and appended to the end of the range. As a result, the system cannot find a point in the range that evenly divides the traffic, and the range cannot benefit from load-based splitting, creating a [hot spot](performance-best-practices-overview.html#hot-spots) on the single range.
+CockroachDB automatically splits ranges of data in [the key-value store]({% link {{ page.version.version }}/architecture/storage-layer.md %}) based on [the size of the range]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits) and on [the load streaming to the range]({% link {{ page.version.version }}/load-based-splitting.md %}). To split a range based on load, the system looks for a point in the range that evenly divides incoming traffic. If the range is indexed on a column of data that is sequential in nature (e.g., [an ordered sequence]({% link {{ page.version.version }}/sql-faqs.md %}#what-are-the-differences-between-uuid-sequences-and-unique_rowid) or a series of increasing, non-repeating [`TIMESTAMP`s](timestamp.html)), then all incoming writes to the range will be the last (or first) item in the index and appended to the end of the range. As a result, the system cannot find a point in the range that evenly divides the traffic, and the range cannot benefit from load-based splitting, creating a [hotspot]({% link {{ page.version.version }}/understand-hotspots.md %}) on the single range.
Hash-sharded indexes solve this problem by distributing sequential data across multiple nodes within your cluster, eliminating hotspots. The trade-off to this, however, is a small performance impact on reading sequential data or ranges of data, as it's not guaranteed that sequentially close values will be on the same node.
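A minimal sketch of this pattern is shown below; the table, columns, and bucket count are illustrative rather than prescriptive.

```sql
-- Hypothetical example: a table whose natural key is an always-increasing
-- timestamp. A hash-sharded primary key distributes the sequential writes
-- across shards (and therefore across ranges) instead of appending every
-- new row to the end of a single hot range.
CREATE TABLE sensor_readings (
    reading_time TIMESTAMPTZ NOT NULL DEFAULT now(),
    sensor_id INT NOT NULL,
    reading FLOAT,
    PRIMARY KEY (reading_time, sensor_id) USING HASH WITH (bucket_count = 8)
);

-- A secondary index on a sequential column can be hash-sharded the same way.
CREATE INDEX ON sensor_readings (reading_time) USING HASH;
```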
diff --git a/src/current/v25.1/load-based-splitting.md b/src/current/v25.1/load-based-splitting.md
index 58e1e4903cc..cd8e999a409 100644
--- a/src/current/v25.1/load-based-splitting.md
+++ b/src/current/v25.1/load-based-splitting.md
@@ -87,7 +87,7 @@ You can see how often a split key cannot be found over time by looking at the fo
This metric is directly related to the log message described above.
-For more information about how to reduce hot spots (including hot ranges) on your cluster, see [Hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+For more information about how to reduce hotspots (including hot ranges) on your cluster, refer to [Understand hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
## See also
diff --git a/src/current/v25.1/make-queries-fast.md b/src/current/v25.1/make-queries-fast.md
index 06bbe88dfea..183056944a1 100644
--- a/src/current/v25.1/make-queries-fast.md
+++ b/src/current/v25.1/make-queries-fast.md
@@ -8,7 +8,7 @@ docs_area: develop
This page provides an overview for optimizing statement performance in CockroachDB. To get good performance, you need to look at how you're accessing the database through several lenses:
- [SQL statement performance](#sql-statement-performance-rules): This is the most common cause of performance problems and where you should start.
-- [Schema design](#schema-design): Depending on your SQL schema and the data access patterns of your workload, you may need to make changes to avoid creating [transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) or [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+- [Schema design](#schema-design): Depending on your SQL schema and the data access patterns of your workload, you may need to make changes to avoid creating [transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention) or [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
- [Cluster topology](#cluster-topology): As a distributed system, CockroachDB requires you to trade off latency vs. resiliency. This requires choosing the right cluster topology for your needs.
## SQL statement performance rules
@@ -28,7 +28,7 @@ For an example of applying the rules to a query, see [Apply SQL Statement Perfor
If you are following the instructions in [the SQL performance section](#sql-statement-performance-rules) and still not getting the performance you want, you may need to look at your schema design and data access patterns to make sure that you are not:
- Introducing transaction contention. For methods for diagnosing and mitigating transaction contention, see [Transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention).
-- Creating hot spots in your cluster. For methods for detecting and eliminating hot spots, see [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+- Creating hotspots in your cluster. For methods for detecting and eliminating hotspots, refer to [Understand hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
## Cluster topology
diff --git a/src/current/v25.1/pagination.md b/src/current/v25.1/pagination.md
index 8871e27c91c..893b5c3c54e 100644
--- a/src/current/v25.1/pagination.md
+++ b/src/current/v25.1/pagination.md
@@ -208,7 +208,7 @@ Time: 1ms total (execution 1ms / network 0ms)
As shown by the `estimated row count` row, this query scans only 25 rows, far fewer than the 200049 scanned by the `LIMIT`/`OFFSET` query.
{{site.data.alerts.callout_danger}}
-Using a sequential (i.e., non-[UUID]({% link {{ page.version.version }}/uuid.md %})) primary key creates hot spots in the database for write-heavy workloads, since concurrent [`INSERT`]({% link {{ page.version.version }}/insert.md %})s to the table will attempt to write to the same (or nearby) underlying [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range). This can be mitigated by designing your schema with [multi-column primary keys which include a monotonically increasing column]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#use-multi-column-primary-keys).
+Using a sequential (i.e., non-[UUID]({% link {{ page.version.version }}/uuid.md %})) primary key creates hotspots in the database for write-heavy workloads, since concurrent [`INSERT`]({% link {{ page.version.version }}/insert.md %})s to the table will attempt to write to the same (or nearby) underlying [ranges]({% link {{ page.version.version }}/architecture/overview.md %}#architecture-range). This can be mitigated by designing your schema with [multi-column primary keys which include a monotonically increasing column]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#use-multi-column-primary-keys).
{{site.data.alerts.end}}
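To complement the warning above, here is a minimal keyset-pagination sketch; the table and column names are hypothetical.

```sql
-- Hypothetical example of keyset pagination: filter on the last key returned
-- by the previous page instead of using LIMIT/OFFSET, so each page scans only
-- the rows it returns.
SELECT id, customer_id, total
FROM orders
WHERE id > 12345      -- the highest id returned by the previous page
ORDER BY id
LIMIT 25;
```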
## Differences between keyset pagination and cursors
diff --git a/src/current/v25.1/performance-best-practices-overview.md b/src/current/v25.1/performance-best-practices-overview.md
index cf258986a49..1b88e654f6f 100644
--- a/src/current/v25.1/performance-best-practices-overview.md
+++ b/src/current/v25.1/performance-best-practices-overview.md
@@ -69,7 +69,7 @@ The best practices for generating unique IDs in a distributed database like Cock
1. Using the [`SERIAL`]({% link {{ page.version.version }}/serial.md %}) pseudo-type for a column to generate random unique IDs. This can result in a performance bottleneck because IDs generated temporally near each other have similar values and are located physically near each other in a table's storage.
1. Generating monotonically increasing [`INT`]({% link {{ page.version.version }}/int.md %}) IDs by using transactions with roundtrip [`SELECT`]({% link {{ page.version.version }}/select-clause.md %})s, e.g., `INSERT INTO tbl (id, …) VALUES ((SELECT max(id)+1 FROM tbl), …)`. This has a **very high performance cost** since it makes all [`INSERT`]({% link {{ page.version.version }}/insert.md %}) transactions wait for their turn to insert the next ID. You should only do this if your application really does require strict ID ordering. In some cases, using [change data capture (CDC)]({% link {{ page.version.version }}/change-data-capture-overview.md %}) can help avoid the requirement for strict ID ordering. If you can avoid the requirement for strict ID ordering, you can use one of the higher-performance ID strategies outlined in the following sections.
-The preceding approaches are likely to create [hot spots](#hot-spots) for both reads and writes in CockroachDB. {% include {{page.version.version}}/performance/use-hash-sharded-indexes.md %}
+The preceding approaches are likely to create [hotspots](#hotspots) for both reads and writes in CockroachDB. {% include {{page.version.version}}/performance/use-hash-sharded-indexes.md %}
To create unique and non-sequential IDs, we recommend the following approaches (listed in order from best to worst performance):
@@ -83,7 +83,7 @@ To create unique and non-sequential IDs, we recommend the following approaches (
A well-designed multi-column primary key can yield even better performance than a [UUID primary key](#use-functions-to-generate-unique-ids), but it requires more up-front schema design work. To get the best performance, ensure that any monotonically increasing field is located **after** the first column of the primary key. When done right, such a composite primary key should result in:
-- Enough randomness in your primary key to spread the table data / query load relatively evenly across the cluster, which will avoid hot spots. By "enough randomness" we mean that the prefix of the primary key should be relatively uniformly distributed over its domain. Its domain should have at least as many elements as you have nodes.
+- Enough randomness in your primary key to spread the table data / query load relatively evenly across the cluster, which will avoid hotspots. By "enough randomness" we mean that the prefix of the primary key should be relatively uniformly distributed over its domain. Its domain should have at least as many elements as you have nodes.
- A monotonically increasing column that is part of the primary key (and thus indexed) which is also useful in your queries.
For example, consider a social media website. Social media posts are written by users, and on login the user's last 10 posts are displayed. A good choice for a primary key might be `(username, post_timestamp)`. For example:
@@ -343,9 +343,9 @@ By default under [`SERIALIZABLE`]({% link {{ page.version.version }}/demo-serial
- [Delays in query completion]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#hanging-or-stuck-queries). This occurs when multiple transactions are trying to write to the same "locked" data at the same time, making a transaction unable to complete. This is also known as *lock contention*.
- [Transaction retries]({% link {{ page.version.version }}/transactions.md %}#automatic-retries) performed automatically by CockroachDB. This occurs if a transaction cannot be placed into a serializable ordering among all of the currently-executing transactions. This is also called a *serialization conflict*.
- [Transaction retry errors]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}), which are emitted to your client when an automatic retry is not possible or fails. Under `SERIALIZABLE` isolation, your application must address transaction retry errors with [client-side retry handling]({% link {{ page.version.version }}/transaction-retry-error-reference.md %}#client-side-retry-handling).
-- [Cluster hot spots](#hot-spots).
+- [Cluster hotspots](#hotspots).
-To mitigate these effects, [reduce the causes of transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#reduce-transaction-contention) and [reduce hot spots](#reduce-hot-spots). For further background on transaction contention, see [What is Database Contention, and Why Should You Care?](https://www.cockroachlabs.com/blog/what-is-database-contention/).
+To mitigate these effects, [reduce the causes of transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#reduce-transaction-contention) and [reduce hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#reduce-hotspots). For further background on transaction contention, see [What is Database Contention, and Why Should You Care?](https://www.cockroachlabs.com/blog/what-is-database-contention/).
### Reduce transaction contention
@@ -361,24 +361,11 @@ To maximize transaction performance, you'll need to maximize the performance of
- Use the fastest [storage devices]({% link {{ page.version.version }}/recommended-production-settings.md %}#storage) available.
- If the contending transactions operate on different keys within the same range, add [more CPU power (more cores) per node]({% link {{ page.version.version }}/recommended-production-settings.md %}#sizing). However, if the transactions all operate on the same key, this may not provide an improvement.
-## Hot spots
+## Hotspots
-A *hot spot* is any location on the cluster receiving significantly more requests than another. Hot spots are a symptom of *resource contention* and can create problems as requests increase, including excessive [transaction contention](#transaction-contention).
+A *hotspot* is any location on the cluster receiving significantly more requests than others. Hotspots are a symptom of *resource contention* and can create problems as requests increase, including excessive [transaction contention](#transaction-contention).
-[Hot spots occur]({% link {{ page.version.version }}/performance-recipes.md %}#indicators-that-your-cluster-has-hot-spots) when an imbalanced workload access pattern causes significantly more reads and writes on a subset of data. For example:
-
-- Transactions operate on the **same range but different index keys**. These operations are limited by the overall hardware capacity of [the range leaseholder]({% link {{ page.version.version }}/architecture/overview.md %}#cockroachdb-architecture-terms) node.
-- A range is indexed on a column of data that is sequential in nature (e.g., [an ordered sequence]({% link {{ page.version.version }}/sql-faqs.md %}#what-are-the-differences-between-uuid-sequences-and-unique_rowid), or a series of increasing, non-repeating [`TIMESTAMP`s]({% link {{ page.version.version }}/timestamp.md %})), such that all incoming writes to the range will be the last (or first) item in the index and appended to the end of the range. Because the system is unable to find a split point in the range that evenly divides the traffic, the range cannot benefit from [load-based splitting]({% link {{ page.version.version }}/load-based-splitting.md %}). This creates a hot spot at the single range.
-
-Read hot spots can occur if you perform lots of scans of a portion of a table index or a single key.
-
-### Reduce hot spots
-
-{% include {{ page.version.version }}/performance/reduce-hot-spots.md %}
-
-For a demo on hot spot reduction, watch the following video:
-
-{% include_cached youtube.html video_id="j15k01NeNNA" %}
+For a detailed explanation of hotspot causes and mitigation strategies, refer to [Understand Hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
## See also
diff --git a/src/current/v25.1/performance-recipes.md b/src/current/v25.1/performance-recipes.md
index bd6b083671c..8666e2b6d53 100644
--- a/src/current/v25.1/performance-recipes.md
+++ b/src/current/v25.1/performance-recipes.md
@@ -42,8 +42,8 @@ This section describes how to use CockroachDB commands and dashboards to identif
The Hot Ranges page (DB Console) displays a higher-than-expected QPS for a range.
The Key Visualizer (DB Console) shows ranges with much higher-than-average write rates for the cluster.
- |
- |
+ |
+ |
@@ -182,21 +182,23 @@ to [view the tables, indexes, and transactions with the most time under contenti
{% include {{ page.version.version }}/performance/reduce-contention.md %}
-### Hot spots
+### Hotspots
-[Hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) are a symptom of *resource contention* and can create problems as requests increase, including excessive [transaction contention](#transaction-contention).
+Hotspots are a symptom of *resource contention* and can create problems as requests increase, including excessive [transaction contention](#transaction-contention).
-#### Indicators that your cluster has hot spots
+For a detailed explanation of hotspot causes and mitigation strategies, refer to the [Understand Hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) page.
+
+#### Indicators that your cluster has hotspots
- The **CPU Percent** graph on the [**Hardware**]({% link {{ page.version.version }}/ui-hardware-dashboard.md %}) and [**Overload**]({% link {{ page.version.version }}/ui-overload-dashboard.md %}) dashboards (DB Console) shows spikes in CPU usage.
- The **Hot Ranges** list on the [**Hot Ranges** page]({% link {{ page.version.version }}/ui-hot-ranges-page.md %}) (DB Console) displays a higher-than-expected QPS for a range.
-- The [**Key Visualizer**]({% link {{ page.version.version }}/ui-key-visualizer.md %}) (DB Console) shows [ranges with much higher-than-average write rates]({% link {{ page.version.version }}/ui-key-visualizer.md %}#identifying-hot-spots) for the cluster.
+- The [**Key Visualizer**]({% link {{ page.version.version }}/ui-key-visualizer.md %}) (DB Console) shows [ranges with much higher-than-average write rates]({% link {{ page.version.version }}/ui-key-visualizer.md %}#identifying-hotspots) for the cluster.
-If you find hot spots, use the [**Range Report**]({% link {{ page.version.version }}/ui-hot-ranges-page.md %}#range-report) and [**Key Visualizer**]({% link {{ page.version.version }}/ui-key-visualizer.md %}) to identify the ranges with excessive traffic. Then take steps to [reduce hot spots](#reduce-hot-spots).
+If you find hotspots, use the [**Range Report**]({% link {{ page.version.version }}/ui-hot-ranges-page.md %}#range-report) and [**Key Visualizer**]({% link {{ page.version.version }}/ui-key-visualizer.md %}) to identify the ranges with excessive traffic. Then take steps to [reduce hotspots](#reduce-hotspots).
-#### Reduce hot spots
+#### Reduce hotspots
-{% include {{ page.version.version }}/performance/reduce-hot-spots.md %}
+{% include {{ page.version.version }}/performance/reduce-hotspots.md %}
### Statements with full table scans
diff --git a/src/current/v25.1/query-behavior-troubleshooting.md b/src/current/v25.1/query-behavior-troubleshooting.md
index 0a0fa9b69a6..a06bb233110 100644
--- a/src/current/v25.1/query-behavior-troubleshooting.md
+++ b/src/current/v25.1/query-behavior-troubleshooting.md
@@ -284,7 +284,7 @@ For more information about the SQL standard features supported by CockroachDB, s
### Single hot node
-A *hot node* is one that has much higher resource usage than other nodes. To determine if you have a hot node in your cluster, [access the DB Console]({% link {{ page.version.version }}/ui-overview.md %}#db-console-access) and check the following:
+A [*hot node*]({% link {{ page.version.version }}/understand-hotspots.md %}#hot-node) is one that has much higher resource usage than other nodes. To determine if you have a hot node in your cluster, [access the DB Console]({% link {{ page.version.version }}/ui-overview.md %}#db-console-access) and check the following:
- Click **Metrics** and navigate to the following graphs. Hover over each graph to see the per-node values of the metrics. If one of the nodes has a higher value, you have a hot node in your cluster.
- [**Replication** dashboard]({% link {{ page.version.version }}/ui-replication-dashboard.md %}#other-graphs) > **Average Queries per Store** graph
@@ -306,7 +306,7 @@ A *hot node* is one that has much higher resource usage than other nodes. To det
- If you have a monotonically increasing index column or primary Key, then your index or primary key should be redesigned. For more information, see [Unique ID best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#unique-id-best-practices).
-- If a range has significantly higher QPS on a node, there may be a hot spot on the range that needs to be reduced. For more information, see [Hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+- If a range has significantly higher QPS on a node, it may indicate a hotspot that needs to be addressed. For more information, refer to [Hot range]({% link {{ page.version.version }}/understand-hotspots.md %}#hot-range).
- If you have a monotonically increasing index column or primary key, then your index or primary key should be redesigned. See [Unique ID best practices]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#unique-id-best-practices) for more information.
diff --git a/src/current/v25.1/recommended-production-settings.md b/src/current/v25.1/recommended-production-settings.md
index 019c24f047a..8c8d65f1409 100644
--- a/src/current/v25.1/recommended-production-settings.md
+++ b/src/current/v25.1/recommended-production-settings.md
@@ -42,7 +42,7 @@ Carefully consider the following tradeoffs:
- A **smaller number of larger nodes** emphasizes cluster stability.
- - Larger nodes tolerate hot spots more effectively than smaller nodes.
+ - Larger nodes tolerate [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) more effectively than smaller nodes.
- Queries operating on large data sets may strain network transfers if the data is spread widely over many smaller nodes. Having fewer and larger nodes enables more predictable workload performance.
- A cluster with fewer nodes may be easier to operate and maintain.
diff --git a/src/current/v25.1/schema-design-indexes.md b/src/current/v25.1/schema-design-indexes.md
index d53d7a4506f..7b8d5cb36e7 100644
--- a/src/current/v25.1/schema-design-indexes.md
+++ b/src/current/v25.1/schema-design-indexes.md
@@ -83,7 +83,7 @@ The [`EXPLAIN`]({% link {{ page.version.version }}/explain.md %}#success-respons
- If you need to index the result of a function applied to one or more columns of a single table, use the function to create a [computed column]({% link {{ page.version.version }}/computed-columns.md %}) and index the column.
-- Avoid indexing on sequential keys (e.g., [`TIMESTAMP`/`TIMESTAMPTZ`]({% link {{ page.version.version }}/timestamp.md %}) columns). Writes to indexes with sequential keys can result in range [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) that negatively affect performance. Instead, use [randomly generated unique IDs]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#unique-id-best-practices) or [multi-column keys]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#use-multi-column-primary-keys).
+- Avoid indexing on sequential keys (e.g., [`TIMESTAMP`/`TIMESTAMPTZ`]({% link {{ page.version.version }}/timestamp.md %}) columns). Writes to indexes with sequential keys can result in [range hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#hot-range) that negatively affect performance. Instead, use [randomly generated unique IDs]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#unique-id-best-practices) or [multi-column keys]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#use-multi-column-primary-keys).
If you are working with a table that **must** be indexed on sequential keys, use [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}). For details about the mechanics and performance improvements of hash-sharded indexes in CockroachDB, see our [Hash Sharded Indexes Unlock Linear Scaling for Sequential Workloads](https://www.cockroachlabs.com/blog/hash-sharded-indexes-unlock-linear-scaling-for-sequential-workloads/) blog post.
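To illustrate the computed-column guidance above, a minimal sketch follows; the table and column names are hypothetical.

```sql
-- Hypothetical example: index the result of a function by first storing it
-- in a computed column, then indexing that column.
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email STRING NOT NULL,
    email_lower STRING AS (lower(email)) STORED
);

CREATE INDEX ON users (email_lower);
```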
diff --git a/src/current/v25.1/schema-design-table.md b/src/current/v25.1/schema-design-table.md
index a6958279e7c..dd32022260d 100644
--- a/src/current/v25.1/schema-design-table.md
+++ b/src/current/v25.1/schema-design-table.md
@@ -211,7 +211,7 @@ Here are some best practices to follow when selecting primary key columns:
- Avoid defining primary keys over a single column of sequential data.
- Querying a table with a primary key on a single sequential column (e.g., an auto-incrementing [`INT`]({% link {{ page.version.version }}/int.md %}) column, or a [`TIMESTAMP`]({% link {{ page.version.version }}/timestamp.md %}) value) can result in single-range [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) that negatively affect performance, or cause [transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention).
+ Querying a table with a primary key on a single sequential column (e.g., an auto-incrementing [`INT`]({% link {{ page.version.version }}/int.md %}) column, or a [`TIMESTAMP`]({% link {{ page.version.version }}/timestamp.md %}) value) can result in single-range [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) that negatively affect performance, or cause [transaction contention]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#transaction-contention).
If you are working with a table that *must* be indexed on sequential keys, use [hash-sharded indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %}). For details about the mechanics and performance improvements of hash-sharded indexes in CockroachDB, see our [Hash Sharded Indexes Unlock Linear Scaling for Sequential Workloads](https://www.cockroachlabs.com/blog/hash-sharded-indexes-unlock-linear-scaling-for-sequential-workloads/) blog post.
diff --git a/src/current/v25.1/ui-hot-ranges-page.md b/src/current/v25.1/ui-hot-ranges-page.md
index e9a85cac64c..fb0414adf3f 100644
--- a/src/current/v25.1/ui-hot-ranges-page.md
+++ b/src/current/v25.1/ui-hot-ranges-page.md
@@ -9,9 +9,9 @@ docs_area: reference.db_console
On a secure cluster, this area of the DB Console can only be accessed by users belonging to the [`admin` role]({% link {{ page.version.version }}/security-reference/authorization.md %}#admin-role) or a SQL user with the [`VIEWCLUSTERMETADATA`]({% link {{ page.version.version }}/security-reference/authorization.md %}#viewclustermetadata) [system privilege]({% link {{ page.version.version }}/security-reference/authorization.md %}#supported-privileges) (or the legacy `VIEWACTIVITY` or `VIEWACTIVITYREDACTED` [role option]({% link {{ page.version.version }}/security-reference/authorization.md %}#role-options)) defined. The [`VIEWACTIVITY`]({% link {{ page.version.version }}/security-reference/authorization.md %}#viewactivity) or [`VIEWACTIVITYREDACTED`]({% link {{ page.version.version }}/security-reference/authorization.md %}#viewactivityredacted) [system privileges]({% link {{ page.version.version }}/security-reference/authorization.md %}#supported-privileges) **do not** grant access to this page.
{{site.data.alerts.end}}
-The **Hot Ranges** page of the DB Console provides details about ranges receiving a high number of reads or writes. These are known as *hot ranges*.
+The **Hot Ranges** page of the DB Console provides details about ranges receiving a high number of reads or writes. These are known as [*hot ranges*]({% link {{ page.version.version }}/understand-hotspots.md %}#hot-range).
-When [optimizing]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) or [troubleshooting]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#single-hot-node) statement performance, this page can help you identify nodes, ranges, or tables that are experiencing hot spots.
+When optimizing or troubleshooting statement performance, this page can help you identify nodes, ranges, or tables that are experiencing [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}).
To view this page, [access the DB Console]({% link {{ page.version.version }}/ui-overview.md %}#db-console-access) and click **Hot Ranges** in the left-hand navigation.
@@ -26,7 +26,7 @@ The **Hot ranges** list displays the ranges with the highest queries per second
{{site.data.alerts.callout_info}}
Hot ranges are not necessarily problematic. Some ranges naturally experience higher QPS than others. For example, a range for a frequently accessed table will have a higher QPS.
-However, a significant increase in traffic can also indicate a *hot spot* on the range that should be reduced. For more information, see [Hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots).
+However, a significant increase in traffic can also indicate a *hotspot* on the range that should be reduced. For more information, refer to [Understand hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#hot-range).
{{site.data.alerts.end}}
To view the [Range Report](#range-report) for a hot range, click its range ID.
@@ -52,13 +52,13 @@ Locality | The locality of the node where the range data is found.
The **Range Report** is typically used for [advanced debugging]({% link {{ page.version.version }}/ui-debug-pages.md %}#even-more-advanced-debugging) purposes.
-If your aim is to [reduce hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots), refer to the following fields:
+If your aim is to [reduce hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#reduce-hotspots), refer to the following fields:
- `Key Range` shows the interval of the [key space]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-descriptors) that is "hottest" (i.e., read by the processor). This is expressed as a span of key values.
- `Lease Holder QPS` shows the queries executed per second on the node that holds the [range lease]({% link {{ page.version.version }}/architecture/replication-layer.md %}#leases). If a hot range is not properly using [load-based splitting]({% link {{ page.version.version }}/load-based-splitting.md %}), this will be greater than the value configured by the `kv.range_split.load_qps_threshold` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}) (`2500` by default).
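+
+For example, to compare a hot range's `Lease Holder QPS` against the current split threshold, check the setting with SQL (this statement only reads the value; it does not change it):
+
+~~~ sql
+SHOW CLUSTER SETTING kv.range_split.load_qps_threshold;
+~~~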
## See also
-- [Hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots)
+- [Understand Hotspots]({% link {{ page.version.version }}/understand-hotspots.md %})
- [Hash-sharded Indexes]({% link {{ page.version.version }}/hash-sharded-indexes.md %})
- [Architecture Overview]({% link {{ page.version.version }}/architecture/overview.md %})
\ No newline at end of file
diff --git a/src/current/v25.1/ui-key-visualizer.md b/src/current/v25.1/ui-key-visualizer.md
index de0e362f925..2f35ae23eeb 100644
--- a/src/current/v25.1/ui-key-visualizer.md
+++ b/src/current/v25.1/ui-key-visualizer.md
@@ -13,7 +13,7 @@ docs_area: reference.db_console
The **Key Visualizer** page of the DB Console provides access to the Key Visualizer tool, which enables the visualization of current and historical [key-value (KV)]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#table-data-kv-structure) traffic serviced by your cluster.
-The Key Visualizer is a useful troubleshooting tool when experiencing performance problems with your cluster, surfacing historical and current KV [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots) in your keyspace, drawing attention to [range splits]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits), and highlighting potentially-unnecessary [full-table scans]({% link {{ page.version.version }}/make-queries-fast.md %}) that might benefit from the creation of a targeted index, among others.
+The Key Visualizer is a useful troubleshooting tool when experiencing performance problems with your cluster, surfacing historical and current KV [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}) in your keyspace, drawing attention to [range splits]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits), and highlighting potentially-unnecessary [full-table scans]({% link {{ page.version.version }}/make-queries-fast.md %}) that might benefit from the creation of a targeted index, among others.
The Key Visualizer is disabled by default. Once [enabled](#enable-the-key-visualizer), the Key Visualizer continuously collects keyspace usage data across your cluster in the background at a [configurable sampling rate](#key-visualizer-customization). Data shown in the Key Visualizer is retained for a maximum period of seven days.
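+
+As a minimal sketch, collection is turned on with a cluster setting (assuming the `keyvisualizer.enabled` setting covered in [Enable the Key Visualizer](#enable-the-key-visualizer)):
+
+~~~ sql
+-- Assumed setting name; refer to the linked section for the authoritative steps.
+SET CLUSTER SETTING keyvisualizer.enabled = true;
+~~~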
@@ -47,7 +47,7 @@ The Key Visualizer presents the following information:
- Time is represented on the X-axis, with its granularity (i.e., frequency of data collection) being controlled by the [configured sample period](#key-visualizer-customization).
-- Keyspace activity is visualized on a color scale from black to red, representing "cold" and "hot" respectively. Thus, a range shown in bright red indicates a [hot spot]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots), while a range shown in black indicates a range with little to no active reads or writes. Hot spots are identified relative to other ranges; for example, a range that receives one write per minute could be considered a hot spot if the rest of the ranges on the cluster aren't receiving any. A range shown in red is therefore not necessarily itself indicative of a problem, but it may help to narrow a problem down to a specific range or group of ranges when troubleshooting cluster performance.
+- Keyspace activity is visualized on a color scale from black to red, representing "cold" and "hot" respectively. Thus, a range shown in bright red indicates a [hotspot]({% link {{ page.version.version }}/understand-hotspots.md %}), while a range shown in black indicates a range with little to no active reads or writes. Hotspots are identified relative to other ranges; for example, a range that receives one write per minute could be considered a hotspot if the rest of the ranges on the cluster aren't receiving any. A range shown in red is therefore not necessarily itself indicative of a problem, but it may help to narrow a problem down to a specific range or group of ranges when troubleshooting cluster performance.
- Boundaries between buckets and time samples appear as grey lines. You can disable the drawing of these lines by deselecting the **Show span boundaries** checkbox below the Key Visualizer.
@@ -78,13 +78,13 @@ When troubleshooting a performance issue with your cluster, use the Key Visualiz
The Key Visualizer was designed to make potentially problematic ranges stand out visually; as such, bright red spots are generally good places to begin a performance investigation. For example, consider the following cases:
-### Identifying hot spots
+### Identifying hotspots
-The following image shows the Key Visualizer highlighting a series of [hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots): ranges with much higher-than-average write rates as compared to the rest of the cluster.
+The following image shows the Key Visualizer highlighting a series of [hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}): ranges with much higher-than-average write rates as compared to the rest of the cluster.
-**Remediation:** If you've identified a potentially-problematic range as a hot spot, follow the recommended best practices to [reduce hot spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#reduce-hot-spots). In the case of the screenshot above, the increased write cadence is due to a series of [range splits]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits), where a range experiencing a large volume of incoming writes is splitting its keyspace to accommodate the growing range. This is often part of normal operation, but can be indicative of a data modeling issue if the range split is unexpected or causing cluster performance issues.
+**Remediation:** If you've identified a potentially-problematic range as a hotspot, follow the recommended best practices to [reduce hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#reduce-hotspots). In the case of the screenshot above, the increased write cadence is due to a series of [range splits]({% link {{ page.version.version }}/architecture/distribution-layer.md %}#range-splits), where a range experiencing a large volume of incoming writes is splitting its keyspace to accommodate the growing range. This is often part of normal operation, but can be indicative of a data modeling issue if the range split is unexpected or causing cluster performance issues.
### Identifying full table scans
@@ -99,5 +99,5 @@ The following image shows the Key Visualizer highlighting a [full-table scan]({%
- [DB Console Overview]({% link {{ page.version.version }}/ui-overview.md %})
- [Troubleshooting Overview]({% link {{ page.version.version }}/troubleshooting-overview.md %})
- [Hot Ranges Page]({% link {{ page.version.version }}/ui-hot-ranges-page.md %})
-- [Reduce Hot Spots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#reduce-hot-spots)
+- [Reduce Hotspots]({% link {{ page.version.version }}/understand-hotspots.md %}#reduce-hotspots)
- [Support Resources]({% link {{ page.version.version }}/support-resources.md %})
diff --git a/src/current/v25.1/understand-hotspots.md b/src/current/v25.1/understand-hotspots.md
index 50183042f89..0fb41e75658 100644
--- a/src/current/v25.1/understand-hotspots.md
+++ b/src/current/v25.1/understand-hotspots.md
@@ -1,12 +1,12 @@
---
title: Understand Hotspots
-summary: Learn about the terminology and patterns of hotspots in CockroachDB
+summary: Learn about the terminology and patterns of hotspots in CockroachDB, and best practices for reducing them.
toc: true
---
-In distributed SQL, hotspots refer to bottlenecks that limit a cluster's ability to scale efficiently within workloads. This page defines terminology and patterns for troubleshooting hotspots.
+In distributed SQL, hotspots refer to bottlenecks that limit a cluster's ability to scale efficiently within workloads. This page defines [terminology](#terminology) and [patterns](#patterns) for troubleshooting hotspots. These definitions are not mutually exclusive; they can be combined to describe a single incident.
-These definitions are not mutually exclusive. They can be combined to describe a single incident.
+This page also describes best practices for [reducing hotspots](#reduce-hotspots) and includes a [video demo](#video-demo).
## Terminology
@@ -173,11 +173,11 @@ SELECT MAX(created_at) FROM posts GROUP BY created_at ORDER BY created_at LIMIT
Lookback hotspots are unique because they are [hot by read](#read-hotspot), rather than [hot by write](#write-hotspot). Separately, lookback hotspots also tend not to specify a key, which allows them to evade systems that use key requests to identify hotspots.
-#### Queuing hotspot
+#### Queueing hotspot
**Synonyms:** outbox hotspot
-A _queuing hotspot_ is a type of index hotspot that occurs when a workload treats CockroachDB like a distributed queue. This can happen if you implement the [Outbox microservice pattern]({% link {{ page.version.version }}/cdc-queries.md %}#queries-and-the-outbox-pattern).
+A _queueing hotspot_ is a type of index hotspot that occurs when a workload treats CockroachDB like a distributed queue. This can happen if you implement the [Outbox microservice pattern]({% link {{ page.version.version }}/cdc-queries.md %}#queries-and-the-outbox-pattern).
Queues, such as logs, generally require data to be ordered by write, which necessitates indexing in a way that is likely to create a hotspot. An outbox where data is deleted as it is read has an additional problem: it tends to accumulate an ordered set of [garbage data]({% link {{ page.version.version }}/operational-faqs.md %}#why-is-my-disk-usage-not-decreasing-after-deleting-data) behind the live data. Since the system cannot determine whether any live rows exist within the garbage data, what appears to be a small table scan to the user can actually result in an unexpectedly intensive scan on the garbage data.
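+
+As a hypothetical sketch of this pattern (the `outbox` table and its columns are illustrative, not from the docs), an index ordered by insertion time funnels every new write, and every consumer scan, toward the same end of the keyspace:
+
+~~~ sql
+-- Hypothetical outbox table: the created_at index keeps all new rows
+-- at the "hot" end of the index keyspace.
+CREATE TABLE outbox (
+    id UUID NOT NULL DEFAULT gen_random_uuid() PRIMARY KEY,
+    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+    payload JSONB,
+    INDEX (created_at)
+);
+
+-- A consumer that polls the oldest rows and deletes them after processing
+-- leaves garbage data ahead of the live rows, so this scan can become
+-- unexpectedly expensive over time.
+SELECT id, payload FROM outbox ORDER BY created_at LIMIT 100;
+~~~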
@@ -323,8 +323,17 @@ A _tenant hotspot_ is a hotspot where one tenant's workload affects another tena
For example, consider a cluster with tenants A and B. Tenant A's workload generates a hotspot. Tenant B's tables experience degradation on nodes where their data is colocated with Tenant A's hotspot. In this case, we say that Tenant B is experiencing a _tenant hotspot_.
+## Reduce hotspots
+
+{% include {{ page.version.version }}/performance/reduce-hotspots.md %}
+
+### Video demo
+
+For a demo on hotspot reduction, watch the following video:
+
+{% include_cached youtube.html video_id="j15k01NeNNA" %}
+
## See also
-- [SQL Performance Best Practices: Hotspots]({% link {{ page.version.version }}/performance-best-practices-overview.md %}#hot-spots)
-- [Performance Tuning Recipes: Hotspots]({% link {{ page.version.version }}/performance-recipes.md %}#hot-spots)
+- [Performance Tuning Recipes: Hotspots]({% link {{ page.version.version }}/performance-recipes.md %}#hotspots)
- [Single hot node]({% link {{ page.version.version }}/query-behavior-troubleshooting.md %}#single-hot-node)
\ No newline at end of file