Commit 6a396b9

Incorporated suggestions on headings and subsections; reworded and clarified sections
1 parent 89e4274 commit 6a396b9

File tree

1 file changed: +24 -16 lines changed

1 file changed

+24
-16
lines changed

troubleshoot/elasticsearch/high-cpu-usage.md

+24 -16
@@ -25,7 +25,7 @@ If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps si

## Diagnose high CPU usage [diagnose-high-cpu-usage]

-**Check CPU usage**
+### Check CPU usage [check-cpu-usage]

You can check the CPU usage per node using the [cat nodes API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-nodes):

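For reference, one way to get this per-node view is a cat nodes request along these lines; the sort and the extra load columns are just one reasonable choice:

```console
GET _cat/nodes?v=true&s=cpu:desc&h=name,cpu,load_1m,load_5m,load_15m
```

Nodes showing sustained high `cpu` and load averages are the ones worth inspecting further.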

@@ -60,7 +60,7 @@ To track CPU usage over time, we recommend enabling monitoring:
::::::

:::::::
-**Check hot threads**
+### Check hot threads [check-hot-threads]

If a node has high CPU usage, use the [nodes hot threads API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-nodes-hot-threads) to check for resource-intensive threads running on the node.

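For illustration, you can request hot threads for all nodes or only for the nodes you suspect; `my-node` below is a placeholder node name:

```console
GET _nodes/hot_threads

# my-node is a placeholder for a specific node name or ID
GET _nodes/my-node/hot_threads
```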

@@ -75,16 +75,16 @@ This API returns a breakdown of any hot threads in plain text. High CPU usage fr

The following tips outline the most common causes of high CPU usage and their solutions.

-### Check JVM garbage collection
+### Check JVM garbage collection [check-jvm-garbage-collection]

High CPU usage is often caused by excessive JVM garbage collection (GC) activity. This excessive GC typically arises from configuration problems or inefficient queries causing increased heap memory usage.

For optimal JVM performance, garbage collection should meet these criteria:

-* Young GC completes quickly, ideally within 50 milliseconds.
-2. Young GC does not occur too frequently (approximately once every 10 seconds).
-3. Old GC completes quickly (ideally within 1 second).
-4. Old GC does not occur too frequently (once every 10 minutes or less frequently).
+| GC Type  | Completion Time | Occurrence Frequency  |
+|----------|-----------------|-----------------------|
+| Young GC | <50ms           | ~once per 10 seconds  |
+| Old GC   | <1s             | ≤ once per 10 minutes |

Excessive JVM garbage collection usually indicates high heap memory usage. Common potential reasons for increased heap memory usage include:

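One way to see how garbage collection is actually behaving is the JVM section of the nodes stats API; a minimal request looks like this:

```console
GET _nodes/stats/jvm
```

In the response, each collector under `jvm.gc.collectors` reports `collection_count` and `collection_time_in_millis`; dividing time by count gives an average pause you can compare against the table above.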

@@ -95,33 +95,41 @@ Excessive JVM garbage collection usually indicates high heap memory usage. Commo
* Improper heap size configuration
* Misconfiguration of JVM new generation ratio (`-XX:NewRatio`)

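To get a rough view of heap pressure per node, a cat nodes request such as the following can help; the column selection here is just one option:

```console
GET _cat/nodes?v=true&h=name,heap.percent,heap.current,heap.max&s=heap.percent:desc
```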

-**Hot spotting**
+### Hot spotting [high-cpu-usage-hot-spotting]

-You might experience high CPU usage on specific data nodes or an entire [data tier](/manage-data/lifecycle/data-tiers.md) if traffic isn’t evenly distributed. This is known as [hot spotting](hotspotting.md). Hot spotting commonly occurs when read or write applications don’t properly balance requests across nodes, or when indices receiving heavy write activity, such as indices in the hot tier, have their shards concentrated on just one or a few nodes.
+You might experience high CPU usage on specific data nodes or an entire [data tier](/manage-data/lifecycle/data-tiers.md) if traffic isn’t evenly distributed. This is known as [hot spotting](hotspotting.md). Hot spotting commonly occurs when read or write applications don’t evenly distribute requests across nodes, or when indices receiving heavy write activity, such as indices in the hot tier, have their shards concentrated on just one or a few nodes.

For details on diagnosing and resolving these issues, refer to [](hotspotting.md).

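As a quick check for write hot spotting, you can look at where the shards of a heavily written index are allocated; `my-hot-index` below is a placeholder index name:

```console
# my-hot-index is a placeholder for an index receiving heavy writes
GET _cat/shards/my-hot-index?v=true&h=index,shard,prirep,node
```

If most of its primary shards sit on one or two nodes, those nodes carry a disproportionate share of the indexing load.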

-**Oversharding**
+### Oversharding [high-cpu-usage-oversharding]

-If your Elasticsearch cluster contains a large number of shards, you might be facing an oversharding issue.
-
-Oversharding occurs when there are too many shards, causing each shard to be smaller than optimal. While Elasticsearch doesn’t have a strict minimum shard size, an excessive number of small shards can negatively impact performance. Each shard consumes cluster resources since Elasticsearch must maintain metadata and manage shard states across all nodes.
+Oversharding occurs when a cluster has too many shards, often because the individual shards are smaller than optimal. While Elasticsearch doesn’t have a strict minimum shard size, an excessive number of small shards can negatively impact performance. Each shard consumes cluster resources since Elasticsearch must maintain metadata and manage shard states across all nodes.

If you have too many small shards, you can address this by doing the following:

* Removing empty or unused indices.
* Deleting or closing indices containing outdated or unnecessary data.
* Reindexing smaller shards into fewer, larger shards to optimize cluster performance.

+If your shards are sized correctly but you are still experiencing oversharding, creating a more aggressive [index lifecycle management strategy](/manage-data/lifecycle/index-lifecycle-management.md) or deleting old indices can help reduce the number of shards.
+
For more information, refer to [](/deploy-manage/production-guidance/optimize-performance/size-shards.md).

118118
### Additional recommendations
119119

120120
To further reduce CPU load or mitigate temporary spikes in resource usage, consider these steps:
121121

122-
* **Scale your cluster**: Heavy indexing and search loads can deplete smaller thread pools. Add nodes or upgrade existing ones to handle increased indexing and search loads more effectively.
123-
* **Spread out bulk requests**: Submit smaller [bulk indexing](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk-1) or multi-search requests, and space them out to avoid overwhelming thread pools.
124-
* **Cancel long-running searches**: Regularly use the [task management API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-tasks-list) to identify and cancel searches that consume excessive CPU time.
122+
#### Scale your cluster [scale-your-cluster]
123+
124+
Heavy indexing and search loads can deplete smaller thread pools. Add nodes or upgrade existing ones to handle increased indexing and search loads more effectively.
125+
126+
#### Spread out bulk requests [spread-out-bulk-requests]
127+
128+
Submit smaller [bulk indexing](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk-1) or multi-search requests, and space them out to avoid overwhelming thread pools.
129+
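For instance, rather than one very large request, you might send several smaller batches like the sketch below and pause between them; the index name and documents are placeholders:

```console
# my-index is a placeholder index name
POST _bulk
{ "index": { "_index": "my-index" } }
{ "message": "first document in a small batch" }
{ "index": { "_index": "my-index" } }
{ "message": "second document in a small batch" }
```
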
+#### Cancel long-running searches [cancel-long-running-searches]
+
+Regularly use the [task management API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-tasks-list) to identify and cancel searches that consume excessive CPU time.

```console
GET _tasks?actions=*search&detailed
