[Observability] Add failure store documentation #699

Open: wants to merge 9 commits into base: main
57 changes: 41 additions & 16 deletions solutions/observability/data-set-quality-monitoring.md
@@ -6,21 +6,25 @@ applies_to:
stack: beta
serverless: beta
navigation_title: "Data set quality"
applies_to:
stack: beta
serverless: beta
---

# Data set quality monitoring [observability-monitor-datasets]

The **Data Set Quality** page provides an overview of your log, metric, trace, and synthetic data sets. Use this information to get an idea of your overall data set quality and find data sets that contain incorrectly parsed documents.

To open **Data Set Quality**, find **Stack Management** in the main menu or use the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). By default, the page only shows log data sets. To see other data set types, select them from the **Type** menu.
::::{note}

::::{admonition} Requirements
:class: note
**Required roles and privileges**

Users with the `viewer` role can view the Data Sets Quality summary. To view the Active Data Sets and Estimated Data summaries, users need the `monitor` [index privilege](../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices) for the `logs-*-*` index.
With the `viewer` role, users can view the Data Sets Quality summary.

You need the `monitor` [index privilege](../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices) for the `logs-*-*` index to view the Active Data Sets and Estimated Data summaries on the Data Set Quality page.
::::

The **Data Set Quality** page provides an overview of your log, metric, trace, and synthetic data sets. Use this information to get an idea of your overall data set quality and find data sets that contain incorrectly parsed documents.

To open **Data Set Quality**, find **Stack Management** in the main menu or use the [global search field](/explore-analyze/find-and-organize/find-apps-and-objects.md). By default, the page only shows log data sets. To see other data set types, select them from the **Type** menu.

The quality of your data sets is based on the percentage of degraded documents in each data set. A degraded document in a data set contains the [`_ignored`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-ignored-field.md) property because one or more of its fields were ignored during indexing. Fields are ignored for a variety of reasons. For example, when the [`ignore_malformed`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-ignored-field.md) parameter is set to true, if a document field contains the wrong data type, the malformed field is ignored and the rest of the document is indexed.
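
As a rough illustration of the paragraph above, the following sketch builds a query that matches documents carrying the `_ignored` field and turns two document counts into a degraded percentage. The function names and the sample counts are hypothetical, and the exact calculation the page performs is an assumption here.

```python
import json


def ignored_docs_query():
    """Return a query body that matches documents with at least one ignored field."""
    # An `exists` query on `_ignored` selects documents where indexing skipped a field.
    return {"query": {"exists": {"field": "_ignored"}}}


def degraded_percentage(degraded_docs, total_docs):
    """Return the share of degraded documents as a percentage (assumed formula)."""
    if total_docs == 0:
        return 0.0
    return 100.0 * degraded_docs / total_docs


if __name__ == "__main__":
    print(json.dumps(ignored_docs_query()))
    # Hypothetical counts: 12 degraded documents out of 400 in the data set.
    print(degraded_percentage(12, 400))
```

You could send the query body to a count or search request for a data set and compare the result with the total document count for the same time range.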

@@ -32,30 +36,51 @@ From the data set table, you’ll find information for each data set such as its

Opening the details of a specific data set shows the degraded documents history, a summary for the data set, and other details that can help you determine if you need to investigate any issues.


## Investigate issues [observability-monitor-datasets-investigate-issues]

The Data Set Quality page has a couple of different ways to help you find ignored fields and investigate issues. From the data set table, you can open the data set’s details page, and view commonly ignored fields and information about those fields. Open a logs data set in Logs Explorer or other data set types in Discover to find ignored fields in individual documents.
The Data Set Quality page provides several ways to help you investigate issues. From the data set table, you can open the data set’s details page, open the failed documents that were sent to the failure store in Discover, and view ignored fields.

### Failure store

Is there documentation from ES with more information about the failure store?

Contributor Author

I believe it's in progress, but once there is, I will link to that for more information.


::::{note}

**Required privileges**

You need the `read_failure_store` or `all` [index privilege](../../deploy-manage/users-roles/cluster-or-deployment-auth/elasticsearch-privileges.md#privileges-list-indices) to access the failure store.

::::
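
One way to grant the privileges mentioned on this page is through a role definition. The sketch below builds a role body with the `monitor` and `read_failure_store` index privileges for the `logs-*-*` pattern. It assumes the standard Elasticsearch role shape (`indices`, `names`, `privileges`). The function name is hypothetical, and a real role likely needs additional privileges that this page does not list.

```python
import json


def data_set_quality_role(index_patterns):
    """Build a role body granting the index privileges named on this page."""
    return {
        "indices": [
            {
                # Index patterns the privileges apply to, for example "logs-*-*".
                "names": list(index_patterns),
                # `monitor` covers the summaries; `read_failure_store` covers failed docs.
                "privileges": ["monitor", "read_failure_store"],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(data_set_quality_role(["logs-*-*"]), indent=2))
```

The printed JSON is a candidate body for a create-role request. Review it against your own security requirements before using it.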

To help diagnose ingestion or mapping issues, documents that are rejected during ingestion are sent to a dedicated data stream called the failure store. On the Data Set Quality page, data streams with documents in the failure store show a percentage in the **Failed docs (%)** column. This percentage gives you a quick overview of the magnitude of potential problems in your ingestion process.

Select the percentage for a specific data stream to open Discover and see the raw documents that were sent to the failure store.
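
If you prefer to inspect failed documents outside Discover, a search request along these lines may help. This sketch assumes that the failure store of a data stream can be addressed by appending the `::failures` selector to the data stream name, which this page does not confirm. The data stream name and the helper name are hypothetical.

```python
def failure_store_search(data_stream, size=10):
    """Return a (target, body) pair for listing the most recent failed documents."""
    # Assumption: `<data stream>::failures` addresses the failure store.
    target = f"{data_stream}::failures"
    # Newest failures first, limited to `size` documents.
    body = {"size": size, "sort": [{"@timestamp": {"order": "desc"}}]}
    return target, body


if __name__ == "__main__":
    target, body = failure_store_search("logs-nginx.access-default")
    print(f"GET {target}/_search")
    print(body)
```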

% screenshot.

To diagnose issues in a specific data stream:
1. Select the data set name from the main table.
1. Open **failed documents**.

% screenshot

### Find ignored fields in data sets [observability-monitor-datasets-find-ignored-fields-in-data-sets]

To open the details page for a data set with poor or degraded quality and view ignored fields:
To open the details page for a data set with poor or degraded quality and view ignored fields and failed documents:

1. From the data set table, click ![expand icon](/solutions/images/serverless-expand.svg "") next to a data set with poor or degraded quality.
2. From the details, scroll down to **Quality issues**.
1. From the data set table, select a data set name.
2. Scroll down to **Quality issues**.

The **Quality issues** section shows fields that have been ignored, the number of documents that contain ignored fields, and the timestamp of last occurrence of the field being ignored.
The **Quality issues** section shows fields that have been ignored, the number of documents that contain ignored fields, the timestamp of the last occurrence of the field being ignored, and failed documents.
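
A similar per-field tally can be approximated from raw search results. The sketch below counts how many hits list each field under `_ignored`. It assumes each hit exposes `_ignored` as a top-level list of field names, and the sample hits are made up for illustration.

```python
from collections import Counter


def tally_ignored_fields(hits):
    """Count, per field name, how many hits list that field under `_ignored`."""
    counts = Counter()
    for hit in hits:
        # Hits without ignored fields have no `_ignored` key.
        for field in hit.get("_ignored", []):
            counts[field] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical hits, shaped like entries of a search response's `hits.hits` list.
    sample_hits = [
        {"_id": "1", "_ignored": ["http.response.bytes"]},
        {"_id": "2", "_ignored": ["http.response.bytes", "user.id"]},
        {"_id": "3"},
    ]
    for field, count in tally_ignored_fields(sample_hits).most_common():
        print(field, count)
```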

% Screenshot

### Find ignored fields in individual logs [observability-monitor-datasets-find-ignored-fields-in-individual-logs]

To use Logs Explorer or Discover to find ignored fields in individual logs:
To use Discover to find ignored fields in individual logs:

1. Find data sets with degraded documents using the **Degraded Docs** column of the data sets table.
2. Click the percentage in the **Degraded Docs** column to open the data set in Logs Explorer or Discover.
1. From the Data Set Quality page, use the **Degraded Docs** column to find data sets with degraded documents.
2. Select the percentage in the **Degraded Docs** column to open the data set in Discover.

The **Documents** table in Logs Explorer or Discover is automatically filtered to show documents that were not parsed correctly. Under the **actions** column, you’ll find the degraded document icon (![degraded document icon](/solutions/images/serverless-indexClose.svg "")).
The **Documents** table in Discover is automatically filtered to show documents that were not parsed correctly. The degraded document icon (![degraded document icon](/solutions/images/serverless-indexClose.svg "")) appears next to each of these documents. You can also go directly to Discover and look for this icon.

Now that you know which documents contain ignored fields, examine them more closely to find the origin of the issue:
