Add Post on OTel Metrics Visualizations #2841

Merged

KarstenSchnitter
Contributor

Description

Adds a post on visualizing OpenTelemetry metrics data as ingested by Data Prepper. It explains different approaches and details of the underlying data model. It uses a concrete instrumentation example from the kubeletstatsreceiver of the OpenTelemetry Collector.

Issues Resolved

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Adds a post on visualizing OpenTelemetry metrics data
as ingested by DataPrepper. It explains about different
approaches and details of the underlying data model.
It utilizes a concrete instrumentation example by the
kubeletstatsreceiver from the OpenTelemetry Collector.

Signed-off-by: Karsten Schnitter <[email protected]>
@KarstenSchnitter
Contributor Author

There are still two sections to complete. I will add them soon.

Provides a Data Table as selector for K8s namespaces showing
the number of pods and containers in them. This is a very basic
example to introduce the standard visualization configuration.

Signed-off-by: Karsten Schnitter <[email protected]>
Explain how to configure a line chart for a time series.
Show the limitations and give some remediations.

Signed-off-by: Karsten Schnitter <[email protected]>
Fixes some spelling errors.

Signed-off-by: Karsten Schnitter <[email protected]>
@juergen-walter
Contributor

KarstenSchnitter and others added 3 commits May 13, 2024 10:59
Apply suggestions from @juergen-walter.

Co-authored-by: Jürgen Walter <[email protected]>
Signed-off-by: Karsten Schnitter <[email protected]>
Improve the text following the remarks of the GitHub Action.

Signed-off-by: Karsten Schnitter <[email protected]>
More changes according to style-job.
Some annotations are left unresolved due to unknown words.

Signed-off-by: Karsten Schnitter <[email protected]>
@anirudha
Contributor

anirudha commented Jun 7, 2024

@KarstenSchnitter have you taken a look at https://github.com/opensearch-project/opentelemetry-demo/blob/main/tutorial/GettingStarted.md?

@KarstenSchnitter
Contributor Author

@anirudha: I have seen that. I wanted to explain how to build visualizations for custom metrics. I just picked the Kubelet Stats Receiver as an easy-to-reproduce example. I can extend the post with simple Vega visualizations, but I have a harder time diving into the observability plugin. Let me know what you prefer. If you do not want the article in this blog, I can take it somewhere else as well.

@pajuric

pajuric commented Jun 14, 2024

@KarstenSchnitter - Checking on the status of this blog to see when the final draft might be ready for review? I see Ani is still contributing to it, but I was unsure of when you would like to publish.

@KarstenSchnitter
Contributor Author

@pajuric From my side, this PR is ready for review. I would adjust the date before merging though.

Collaborator

@natebower natebower left a comment

@KarstenSchnitter Editorial review complete. Please see my comments and changes and let me know if you have any questions. Apologies for the repeated comments re: code font, but I'm not used to seeing so many terms in quotation marks, and I wanted to be sure to call out every instance just in case. See the Code formatting checklist for help determining what we normally put in code font (as opposed to UI elements, which we normally put in bold). Thanks!

@pajuric This will be ready to publish once my comments/changes are addressed and @KarstenSchnitter is happy with the draft.

@@ -0,0 +1,421 @@
---
layout: post
title: "The ABCs of semantic search in OpenSearch: Architectures, benchmarks, and combination strategies"
Collaborator

We have already published a blog post with this title (https://opensearch.org/blog/semantic-science-benchmarks/). Is this supposed to be part of a series?

Contributor Author

This is missing. I will fix the title together with the date.

Collaborator

When we replace the title, please ensure that it's in sentence case (as it is currently) 😄

Vega lets us use any OpenSearch query and provides a grammar to transform and visualize the data with [D3.js](https://d3js.org/).
It allows to access global filters and the time interval selector to create well-integrated exploration and analysis journeys in OpenSearch.

Of course, this has a steeper learning curve.
Collaborator

Combine these two sentences?: "Vega may have a steeper learning curve, and explaining Vega visualizations would warrant a separate blog post."

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Karsten Schnitter <[email protected]>
Contributor Author

@KarstenSchnitter KarstenSchnitter left a comment

@natebower many thanks for your very detailed review. I integrated all your direct recommendations. I will work on the other comments tomorrow.

The _metrics_ support in OpenSearch is geared towards the integration of an external Prometheus data source.
Analyzing metrics ingested with DataPrepper requires custom built visualizations.
This blog post is all about these visualizations and the underlying data model.
It is built on the experience in [SAP Cloud Logging service](https://discovery-center.cloud.sap/serviceCatalog/cloud-logging), where I created per-defined dashboards for the customers.
Contributor Author

I was referring to the blog post. Maybe I should change "it" to "this post"?

For our example, this would look like `{ "container.cpu.time": 120.353905}`.
However, if we ingested a lot of different metrics, we would reach the field limit of 1000 per index pretty fast.
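
To make the field-limit concern concrete, here is a minimal sketch, not taken from the post. The helper names and the second metric name are hypothetical, and the `name` field is an assumption about the document layout (only `value` is mentioned in this draft). It compares how many distinct fields each layout would add to an index mapping, which matters because OpenSearch caps mapped fields per index at 1000 by default (`index.mapping.total_fields.limit`).

```python
def to_flat_document(metrics: dict) -> dict:
    """One field per metric name: the mapping grows with every new metric."""
    return dict(metrics)


def to_name_value_documents(metrics: dict) -> list:
    """One document per data point, using fixed `name` and `value` fields.

    The `name` field is an assumption made for this illustration.
    """
    return [{"name": name, "value": value} for name, value in metrics.items()]


def distinct_fields(documents: list) -> set:
    """Collect every field name the index mapping would need for these documents."""
    return {field for document in documents for field in document}


# Hypothetical sample: the first metric is from the post, the second is illustrative.
sample = {"container.cpu.time": 120.353905, "container.memory.usage": 1.0}
flat_fields = distinct_fields([to_flat_document(sample)])
name_value_fields = distinct_fields(to_name_value_documents(sample))
```

With the flat layout the field count equals the number of metric names, while the name/value layout stays at two fields no matter how many metrics are ingested.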

There is additional information about the kind, that introduces semantics to the time series.
Contributor Author

`kind` is the type of metric. In the example, it is SUM, indicating a counter-like metric. This comes with some semantic implications. For example, SUMs can be added during aggregations over different dimensions, while GAUGEs would need to be averaged.
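
The semantic difference can be sketched as follows. This is an illustration only: `combine_across_dimensions` is a hypothetical helper, not part of Data Prepper or OpenSearch, and it assumes the `kind` values appear as the strings "SUM" and "GAUGE".

```python
from statistics import mean


def combine_across_dimensions(kind: str, values: list) -> float:
    """Combine data points of one metric taken from different dimensions.

    SUM metrics are additive, so values from several pods can be added up.
    GAUGE metrics describe a current state, so they are averaged instead.
    """
    if not values:
        raise ValueError("at least one value is required")
    if kind == "SUM":
        return sum(values)
    if kind == "GAUGE":
        return mean(values)
    raise ValueError(f"unsupported metric kind: {kind}")


# Hypothetical values for three pods.
total = combine_across_dimensions("SUM", [1.0, 2.0, 3.0])
average = combine_across_dimensions("GAUGE", [1.0, 2.0, 3.0])
```

In a visualization, this corresponds to choosing a sum aggregation for SUM metrics and an average aggregation for GAUGE metrics when collapsing several time series into one.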


There is additional information about the kind, that introduces semantics to the time series.
In the example we deal with a "SUM" (a counter), which is monotonic.
The "AGGREGATION_TEMPORALITY_CUMULATIVE" tells us, that the `value` will contain the current count started at `startTime`.
Contributor Author

This is the value of the aggregation temporality. Since this is such a big term, I did not want to repeat it.

There is additional information about the kind, that introduces semantics to the time series.
In the example we deal with a "SUM" (a counter), which is monotonic.
The "AGGREGATION_TEMPORALITY_CUMULATIVE" tells us, that the `value` will contain the current count started at `startTime`.
The alternative "AGGREGATION_TEMPORALITY_DELTA" would contain only the change encountered between `startTime` and `time` with non-overlapping intervals.
Contributor Author

Same as above. It is hard to find the balance between crisp text here and explaining the OTel data model.
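
A small sketch may help show how the two temporalities relate. The function names are hypothetical, and the sketch assumes a single time series with no counter resets: cumulative readings carry the running count since `startTime`, while delta readings carry only the change per interval.

```python
from itertools import accumulate


def cumulative_to_delta(readings: list) -> list:
    """Change between consecutive cumulative readings of one time series."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


def delta_to_cumulative(deltas: list, start: float = 0.0) -> list:
    """Running total of delta readings, beginning from the count at `start`."""
    return list(accumulate(deltas, initial=start))[1:]


# Hypothetical cumulative counter readings.
cumulative = [10.0, 15.0, 15.0, 22.0]
deltas = cumulative_to_delta(cumulative)
rebuilt = delta_to_cumulative(deltas, start=cumulative[0])
```

Converting to deltas and back reproduces every cumulative reading after the first one, which is why either representation can describe the same counter.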

![TSVB Data Positive Rate](/assets/media/blog-images/2024-05-07-opentelemetry-metrics-visualization/tsvb_data-positive-rate.png){:class="img-centered"}

We extend the aggregations by a derivative by unit 1s.
This calculates the time share spent by the CPU on that pod.
Contributor Author

It is actually the ratio of CPU time spent on that pod over the total CPU time spent. That is what I wanted to indicate with "share".
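
The calculation behind a derivative by unit 1s can be sketched like this. It is an illustration under stated assumptions, not the TSVB implementation: `positive_rate` is a hypothetical helper, samples are `(timestamp in seconds, cumulative CPU seconds)` pairs, and a falling counter is treated as a reset and skipped.

```python
from typing import List, Optional, Tuple


def positive_rate(samples: List[Tuple[float, float]]) -> List[Optional[float]]:
    """Per-second rate of change between consecutive cumulative samples.

    Returns None for an interval where the counter decreased (assumed to be
    a reset) or where the timestamps do not advance.
    """
    rates: List[Optional[float]] = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        change = v1 - v0
        if elapsed <= 0 or change < 0:
            rates.append(None)
        else:
            rates.append(change / elapsed)
    return rates


# Hypothetical samples: the third reading drops, as it would after a restart.
cpu_samples = [(0.0, 120.0), (10.0, 125.0), (20.0, 1.0), (30.0, 3.0)]
rates = positive_rate(cpu_samples)
```

A rate of 0.5 here means the pod consumed half a second of CPU time per second of wall-clock time during that interval.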

We have now created two different visualizations of pod CPU utilization.
We can save our visualizations at any point in time and add them to dashboards, that combine multiple visualizations.
TSVB allows to configure, whether global filters should be respected or not.
This allows to embed the visualizations into larger scenarios sharing filters.
Contributor Author

"Larger scenarios" refers to a collection of dashboards.


![New Data Table Visualization](/assets/media/blog-images/2024-05-07-opentelemetry-metrics-visualization/visualize_create_data-table.png){:class="img-centered"}

We are asked to select the index or saved search as data source.
Contributor Author

I will use "the".


On the right, we can see, how the number of pods is calculated: It is the unique count of values in field `resource.attributes.k8s@pod@name`, which contains the pod name.
We could have used the pod id as well, but pod names contain a unique suffix, so this is semantically identical.
Note, that the pod name is filled in by the Kubelet Stats Receiver as resource attribute.
Contributor Author

Will use "a".
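
For readers who want to see what the unique count amounts to as a query, here is a hedged sketch of a request body built in Python. The pod name field comes from the post; the namespace field name is an assumption by analogy, and depending on the index mapping a `.keyword` suffix may be needed. The function name is hypothetical.

```python
POD_NAME_FIELD = "resource.attributes.k8s@pod@name"  # field named in the post
NAMESPACE_FIELD = "resource.attributes.k8s@namespace@name"  # assumed by analogy


def pods_per_namespace_query(max_namespaces: int = 10) -> dict:
    """Build a search body that counts unique pod names per namespace.

    A terms aggregation creates one bucket per namespace, and a cardinality
    sub-aggregation counts the distinct pod names inside each bucket.
    """
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "namespaces": {
                "terms": {"field": NAMESPACE_FIELD, "size": max_namespaces},
                "aggs": {
                    "pod_count": {"cardinality": {"field": POD_NAME_FIELD}}
                },
            }
        },
    }


query = pods_per_namespace_query(max_namespaces=5)
```

Note that the cardinality aggregation returns an approximate count, which is usually acceptable for an overview table like this one.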


Back in the "Data" tab, we could also opt to create a "missing values" bucket.
This can cost some query performance.
We can always check the query performance with the inspect dialog looking at "Requests" and the "Response" tab.
Contributor Author

I think inspect is similar to Requests and Response, since it is a UI element.

Collaborator

@natebower natebower left a comment

@KarstenSchnitter Thanks so much! I replied to your comments and will be available tomorrow for any further questions.

![TSVB Data Positive Rate](/assets/media/blog-images/2024-05-07-opentelemetry-metrics-visualization/tsvb_data-positive-rate.png){:class="img-centered"}

We extend the aggregations by a derivative by unit 1s.
This calculates the time share spent by the CPU on that pod.
Collaborator

In that case, would "the share of time spent" work?

KarstenSchnitter and others added 2 commits July 12, 2024 15:36
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: Karsten Schnitter <[email protected]>
Provide correct title and set hopefully realistic publish date.

Signed-off-by: Karsten Schnitter <[email protected]>
@KarstenSchnitter
Contributor Author

@natebower I tried to address all your comments and came up with a hopefully realistic publishing date. I can see no more unresolved comments for now. Please let me know if there is anything more that needs to be changed.

@natebower
Collaborator

> @natebower I tried to address all your comments and came up with a hopefully realistic publishing date. I can see no more unresolved comments for now. Please let me know if there is anything more that needs to be changed.

@KarstenSchnitter Awesome, thanks for all of your work on this! Confirmed that it looks as though all comments are resolved, so you should be ready for @pajuric.

@KarstenSchnitter
Contributor Author

@natebower would you give this an approving review, or is this for @pajuric to do?

@pajuric

pajuric commented Jul 12, 2024

@KarstenSchnitter - I am working on adding the meta and a publish date. Likely, we will publish early next week, but I need to confirm.

categories:
- technical-post
meta_keywords:
meta_description:

@KarstenSchnitter - Please review and edit or add this meta to your blog:

meta_keywords: SAP Cloud Logging, Data Prepper ingestion, OpenTelemetry data models, OpenSearch Dashboards, observability plugin

meta_description: Discover how SAP Cloud Logging created custom-built visualizations for analyzing metrics using the OpenTelemetry and Data Prepper ingestion tools in OpenSearch.

Contributor Author

This looks fine.

@pajuric

pajuric commented Jul 15, 2024

Blog is tentatively scheduled to publish tomorrow, July 16, 2024.

Signed-off-by: Karsten Schnitter <[email protected]>
@KarstenSchnitter
Contributor Author

I added the proposed metadata. Thanks, everybody, for your help.

@pajuric

pajuric commented Jul 15, 2024

@krisfreedain @nateynateynate - This blog is ready for publishing on Tuesday, July 16.

Member

@krisfreedain krisfreedain left a comment

this is great - thank you @KarstenSchnitter

@krisfreedain krisfreedain merged commit c83f9da into opensearch-project:main Jul 16, 2024
5 checks passed
@KarstenSchnitter KarstenSchnitter deleted the otel-metrics-vis branch July 16, 2024 19:52
Projects
Status: Done
6 participants