One common task in a log-based Elasticsearch infrastructure is to keep logs for a certain amount of time, e.g. 30 days. The Curator can be used to automatically delete older indices.
In case the XPack functionality is available, check out the index lifecycle management documentation of Elasticsearch on how to use index lifecycle policies.
Set up the config.yml
to communicate with the Elasticsearch cluster. Check the example config.yml
on how to set up the configuration file.
The Curator provides a delete_indices
action to delete indices, see Curator documentation for more details. The basic structure of this action is:
action: delete_indices
description: "Delete selected indices"
options:
timeout_override: 300
continue_if_exception: False
filters:
- filtertype: ...
One important property is the filters list that can take a number of filtertypes
fields. Check the age and pattern filters.
The pattern
filter type allows to find indices that match the given string pattern, e.g. logstash-
. The age
filter type matches the indices based on their age. The age can be specified in multiple ways, e.g. name based, creation based, etc.
Specifying multiple filter types requires that indices match all of them. In this way indices can be filtered out.
Follow these steps to run the excercise
✅ Start Elasticsearch
To make this example visible, first we create at least 3 daily indices with a date based suffix, e.g. logstash-2021.01.05
, logstash-2021.01.04
, using the Create Index API.
✅ Create three logstash-*
prefixed indices with daily dates
# curl -X PUT 'http://localhost:9200/<logstash-{now/d-2d}>'
curl -X PUT 'http://localhost:9200/%3Clogstash-%7Bnow%2Fd-2d%7D%3E'
# curl -X PUT 'http://localhost:9200/<logstash-{now/d-1d}>'
curl -X PUT 'http://localhost:9200/%3Clogstash-%7Bnow%2Fd-1d%7D%3E'
# curl -X PUT 'http://localhost:9200/<logstash-{now/d}>'
curl -X PUT 'http://localhost:9200/%3Clogstash-%7Bnow%2Fd%7D%3E'
Elasticsearch supports Date math for index names. For example the index name
/<logstash-{now/d-2d}>
references an index namedlogstash-2021.01.17
from two days ago (assuming it's 2021.01.19 when used). The pair of<
,>
marks the template part of the index, while the function{now/d-2d}
creates the2021.01.19
and subtracts two days from it. The index name has to be URL encoded to conform to the URL format.
✅ Create or check the config.yml
(see Configuration section)
Check the delete_indices action reference in the Curator documentation to see how an action
configuration looks.
✅ Create an actions file named delete_indices.yml
that deletes logstash-*
prefixed indices that are older than 2
days
Once both config and action files are created, let's run the curator in dry mode
✅ Run the curator
on the command line --dry-mode
curator --dry-run --config config.yml delete_indices.yml
Check the output of this command, what does it do?
✅ Run curator
command to delete older indices
curator --config config.yml delete_indices.yml
When successful, there should only be two indices starting with the same suffix, e.g. logstash-*
, older indices should have been deleted.
Depending on your chosen values in the age filter type no indices may have been deleted. What is the difference between
source: creation_date
andsource: timestring
?
Here is a version of the action.yml
to delete older indices.
Possible Solution
# delete_indices.yml
---
actions:
1:
action: delete_indices
description: >-
Delete indices older than 2 days (based on timestring pattern), for logstash-
prefixed indices. Ignore the error if the filter does not result in an actionable list
of indices (ignore_empty_list) and exit cleanly.
options:
ignore_empty_list: True
timeout_override:
continue_if_exception: True
disable_action: False
filters:
- filtertype: pattern
kind: prefix
value: logstash-
- filtertype: age
source: name
direction: older
timestring: '%Y.%m.%d'
unit: days
unit_count: 2
Mulitple action files can be used separately. This way separate indices or index groups can be managed, for example different applications, users etc.