# Watermark errors
When a data node is critically low on disk space and has reached the flood-stage disk usage watermark, the following error is logged:

```
Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block
```
To prevent a full disk, when a node reaches this watermark, {{es}} blocks writes to any index with a shard on the node. If the block affects related system indices, {{kib}} and other {{stack}} features may become unavailable. For example, this can trigger {{kib}}'s `Kibana Server is not Ready yet` error message.
{{es}} will automatically remove the write block when the affected node’s disk usage falls below the high disk watermark. To achieve this, {{es}} attempts to rebalance some of the affected node’s shards to other nodes in the same data tier.
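As a first check, you can review per-node disk usage with the cat allocation API (the `h` parameter only selects columns and is optional):

```console
GET _cat/allocation?v=true&h=node,disk.percent,disk.used,disk.avail,disk.total
```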
::::{tip}
If you're using {{ech}}, you can use AutoOps to monitor your cluster. AutoOps significantly simplifies cluster management with performance recommendations, resource utilization visibility, and real-time issue detection with resolution paths. For more information, refer to the AutoOps documentation.
::::
To verify that shards are moving off the affected node until it falls below the high watermark, use the cat shards API and cat recovery API:
```console
GET _cat/shards?v=true

GET _cat/recovery?v=true&active_only=true
```
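In the recovery output, shard moves typically show up as recoveries of type `peer`; with `active_only=true`, an empty list means no shards are currently relocating.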
If shards remain on the node, keeping it above the high watermark, use the cluster allocation explanation API to get an explanation of their allocation status:
```console
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
```
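Before raising any thresholds, it can also help to confirm the watermark values currently in effect. One way is the cluster get settings API with defaults included; look for the `cluster.routing.allocation.disk.watermark.*` keys in the response:

```console
GET _cluster/settings?include_defaults=true&flat_settings=true
```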
To immediately restore write operations, you can temporarily increase disk watermarks and remove the write block:
```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.low.max_headroom": "100GB",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.high.max_headroom": "20GB",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "5GB",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": "5GB"
  }
}
```
```console
PUT */_settings?expand_wildcards=all
{
  "index.blocks.read_only_allow_delete": null
}
```
When a long-term solution is in place, reset or reconfigure the disk watermarks:
```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": null,
    "cluster.routing.allocation.disk.watermark.low.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.high": null,
    "cluster.routing.allocation.disk.watermark.high.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen": null,
    "cluster.routing.allocation.disk.watermark.flood_stage.frozen.max_headroom": null
  }
}
```
To resolve watermark errors permanently, perform one of the following actions:

- Horizontally scale the nodes of the affected data tiers.
- Vertically scale existing nodes to increase disk space.
- Delete indices using the delete index API, either permanently if the index isn't needed, or temporarily to restore later from a snapshot (see the example after this list).
- Update the related ILM policy to push indices through to later data tiers.
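For example, deleting a single index is a one-line request (`my-index` is a placeholder; deletion is irreversible unless the index is backed up in a snapshot):

```console
DELETE my-index
```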
::::{tip}
On {{ech}} and {{ece}}, you may need to temporarily delete indices through the Elasticsearch API Console and later restore them from a snapshot in order to resolve a cluster health `status:red`, which blocks attempted changes. If you experience issues with this resolution flow on {{ech}}, reach out to Elastic Support for assistance.
::::