In this tutorial, you will learn how to install and configure the `Prometheus` stack, to monitor all pods from your `DOKS` cluster, as well as `Kubernetes` cluster state metrics. Then, you will connect `Prometheus` with `Grafana` to visualize all metrics, and perform queries using the `PromQL` language. Finally, you will configure persistent storage for your `Prometheus` instance, to persist all your `DOKS` cluster and application metrics data.
Why choose `Prometheus`?

`Prometheus` supports multidimensional data collection and data querying. It's reliable and allows you to quickly diagnose problems. Since each server is independent, it can be leaned on when other infrastructure is damaged, without requiring additional infrastructure. It also integrates very well with `Kubernetes`, and that's a big plus as well.

`Prometheus` follows a pull model when it comes to metrics gathering, meaning that it expects a `/metrics` endpoint to be exposed by the service in question for scraping.
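For a feel of what such an endpoint returns, here is a minimal sketch - the service URL and metric name are made up for illustration, but the plain-text exposition format is what `Prometheus` expects:

```shell
curl -s http://my-service.example.com:8080/metrics

# Typical response body (illustrative):
#   # HELP http_requests_total Total number of HTTP requests.
#   # TYPE http_requests_total counter
#   http_requests_total{method="get",code="200"} 1027
```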
After finishing this tutorial, you will be able to:

- Configure monitoring for all pods running in your `DOKS` cluster
- Visualize metrics for your applications in real time, using `Grafana`
- Configure `ServiceMonitors` via the `Prometheus Operator`, for your services (e.g. `Ambassador Edge Stack`)
- Use `PromQL` to perform queries on metrics
- Configure persistent storage for `Prometheus`, to safely store all your `DOKS` cluster and application metrics
- Configure persistent storage for `Grafana`, to safely store all your dashboards
Table of contents:

- Introduction
- Prerequisites
- Step 1 - Installing the Prometheus Stack
- Step 2 - Configure Prometheus and Grafana
- Step 3 - PromQL (Prometheus Query Language)
- Step 4 - Visualizing Metrics Using Grafana
- Step 5 - Configuring Persistent Storage for Prometheus
- Step 6 - Configuring Persistent Storage for Grafana
- Conclusion
To complete this tutorial, you will need:

- A `Git` client, to clone the `Starter Kit` repository.
- `Helm`, for managing `Prometheus` stack releases and upgrades.
- `Kubectl`, for `Kubernetes` interaction.
- `Curl`, for testing the examples (backend applications).
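You can quickly verify that all the required CLI tools are available on your machine:

```shell
git version
helm version
kubectl version --client
curl --version
```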
Please make sure that the `kubectl` context is configured to point to your `Kubernetes` cluster - refer to Step 3 - Creating the DOKS Cluster from the `DOKS` setup tutorial.
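To double-check which cluster `kubectl` currently points at:

```shell
kubectl config current-context
```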
In this step, you will install the `kube-prometheus` stack, which is an opinionated full monitoring stack for `Kubernetes`. It includes the `Prometheus Operator`, `kube-state-metrics`, pre-built manifests, `Node Exporters`, `Metrics API`, `Alertmanager` and `Grafana`.

You're going to use the `Helm` package manager to accomplish this task. The `Helm` chart is available here for study.
Steps to follow:

1. First, clone the `Starter Kit` repository and change directory to your local copy.

2. Next, add the `Helm` repository and list the available charts:

   ```shell
   helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

   helm repo update prometheus-community

   helm search repo prometheus-community
   ```

   The output looks similar to the following:

   ```text
   NAME                                         CHART VERSION   APP VERSION   DESCRIPTION
   prometheus-community/alertmanager            0.14.0          v0.23.0       The Alertmanager handles alerts sent by client ...
   prometheus-community/kube-prometheus-stack   30.0.1          0.53.1        kube-prometheus-stack collects Kubernetes manif...
   ...
   ```

   Note:

   The chart of interest is `prometheus-community/kube-prometheus-stack`, which will install `Prometheus`, `Alertmanager` and `Grafana` on the cluster. Please visit the kube-prometheus-stack page for more details about this chart.

3. Then, open and inspect the `04-setup-prometheus-stack/assets/manifests/prom-stack-values-v30.0.1.yaml` file provided in the `Starter Kit` repository, using an editor of your choice (preferably with `YAML` lint support). By default, `kubeScheduler` and `etcd` metrics are disabled - those components are managed by `DOKS`, and are not accessible to `Prometheus`. Note that `storage` is set to `emptyDir`, meaning the storage will be gone if the `Prometheus` pods restart (you will fix this later on, in the Configuring Persistent Storage for Prometheus section).

4. Finally, install the `kube-prometheus-stack` using `Helm`:

   ```shell
   HELM_CHART_VERSION="30.0.1"

   helm install kube-prom-stack prometheus-community/kube-prometheus-stack --version "${HELM_CHART_VERSION}" \
     --namespace monitoring \
     --create-namespace \
     -f "04-setup-prometheus-stack/assets/manifests/prom-stack-values-v${HELM_CHART_VERSION}.yaml"
   ```

   Note:

   A specific version of the `Helm` chart is used. In this case `30.0.1` was picked, which maps to the `0.53.1` version of the application (see the output from Step 2). It's good practice in general to lock on a specific version. This helps to have predictable results, and allows versioning control via `Git`.
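As a side note, if you want to compare the `Starter Kit` values file against the chart defaults, `helm show values` can dump them locally (this assumes the `prometheus-community` repository was added in Step 2 above):

```shell
helm show values prometheus-community/kube-prometheus-stack --version 30.0.1 > default-values.yaml

less default-values.yaml
```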
Now, check the `Prometheus` stack `Helm` release status:

```shell
helm ls -n monitoring
```

The output looks similar to (notice the `STATUS` column value - it should say `deployed`):

```text
NAME              NAMESPACE    REVISION   UPDATED                                STATUS     CHART                          APP VERSION
kube-prom-stack   monitoring   1          2022-01-10 11:29:29.463468 +0200 EET   deployed   kube-prometheus-stack-30.0.1   0.53.1
```
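For more details about the release (such as chart notes and metadata), `helm status` can help as well:

```shell
helm status kube-prom-stack -n monitoring
```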
See what `Kubernetes` resources are available for `Prometheus`:

```shell
kubectl get all -n monitoring
```

You should have the following resources deployed: `prometheus-node-exporter`, `kube-prome-operator`, `kube-prome-alertmanager`, `kube-prom-stack-grafana` and `kube-state-metrics`. The output looks similar to:

```text
NAME READY STATUS RESTARTS AGE
pod/alertmanager-kube-prom-stack-kube-prome-alertmanager-0 2/2 Running 0 3m3s
pod/kube-prom-stack-grafana-8457cd64c4-ct5wn 2/2 Running 0 3m5s
pod/kube-prom-stack-kube-prome-operator-6f8b64b6f-7hkn7 1/1 Running 0 3m5s
pod/kube-prom-stack-kube-state-metrics-5f46fffbc8-mdgfs 1/1 Running 0 3m5s
pod/kube-prom-stack-prometheus-node-exporter-gcb8s 1/1 Running 0 3m5s
pod/kube-prom-stack-prometheus-node-exporter-kc5wz 1/1 Running 0 3m5s
pod/kube-prom-stack-prometheus-node-exporter-qn92d 1/1 Running 0 3m5s
pod/prometheus-kube-prom-stack-kube-prome-prometheus-0 2/2 Running 0 3m3s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 3m3s
service/kube-prom-stack-grafana ClusterIP 10.245.147.83 <none> 80/TCP 3m5s
service/kube-prom-stack-kube-prome-alertmanager ClusterIP 10.245.187.117 <none> 9093/TCP 3m5s
service/kube-prom-stack-kube-prome-operator ClusterIP 10.245.79.95 <none> 443/TCP 3m5s
service/kube-prom-stack-kube-prome-prometheus ClusterIP 10.245.86.189 <none> 9090/TCP 3m5s
service/kube-prom-stack-kube-state-metrics ClusterIP 10.245.119.83 <none> 8080/TCP 3m5s
service/kube-prom-stack-prometheus-node-exporter ClusterIP 10.245.47.175 <none> 9100/TCP 3m5s
service/prometheus-operated ClusterIP None <none> 9090/TCP 3m3s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/kube-prom-stack-prometheus-node-exporter 3 3 3 3 3 <none> 3m5s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/kube-prom-stack-grafana 1/1 1 1 3m5s
deployment.apps/kube-prom-stack-kube-prome-operator 1/1 1 1 3m5s
deployment.apps/kube-prom-stack-kube-state-metrics 1/1 1 1 3m5s
NAME DESIRED CURRENT READY AGE
replicaset.apps/kube-prom-stack-grafana-8457cd64c4 1 1 1 3m5s
replicaset.apps/kube-prom-stack-kube-prome-operator-6f8b64b6f 1 1 1 3m5s
replicaset.apps/kube-prom-stack-kube-state-metrics-5f46fffbc8 1 1 1 3m5s
NAME READY AGE
statefulset.apps/alertmanager-kube-prom-stack-kube-prome-alertmanager 1/1 3m3s
statefulset.apps/prometheus-kube-prom-stack-kube-prome-prometheus       1/1     3m3s
```
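If some pods are still starting, you can optionally block until everything in the namespace reports ready - a handy sanity check, especially in scripts:

```shell
kubectl wait --for=condition=Ready pods --all -n monitoring --timeout=300s
```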
Then, you can connect to `Grafana` (using the default credentials: `admin/prom-operator` - see the prom-stack-values-v30.0.1 file), by port forwarding to your local machine:

```shell
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80
```

Important Note:

You should NOT expose `Grafana` to the public network (e.g. create an ingress mapping or `LB` service) with the default `login/password`.

The `Grafana` installation comes with a number of dashboards. Open a web browser on localhost:3000. Once in, you can go to `Dashboards -> Manage`, and choose different dashboards.
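If the browser can't reach the dashboard, a quick way to confirm the port forward works is Grafana's health endpoint:

```shell
curl -s http://localhost:3000/api/health
```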
In the next part, you will discover how to set up `Prometheus` to discover targets for monitoring. As an example, the `Ambassador Edge Stack` will be used. You'll learn what a `ServiceMonitor` is, as well.
You already deployed `Prometheus` and `Grafana` into the cluster. In this step, you will learn how to use a `ServiceMonitor`. A `ServiceMonitor` is one of the preferred ways to tell `Prometheus` how to discover a new target for monitoring.

The `Ambassador Edge Stack` Deployment created earlier in the tutorial provides the `/metrics` endpoint by default on port `8877`, via a `Kubernetes` service.

Next, you will discover the `Ambassador` service responsible for exposing metrics data for `Prometheus` to consume. The service in question is called `edge-stack-admin` (note that it's using the `ambassador` namespace):
```shell
kubectl get svc -n ambassador
```

The output looks similar to the following:

```text
NAME               TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)                      AGE
edge-stack         LoadBalancer   10.245.39.13   68.183.252.190   80:31499/TCP,443:30759/TCP   3d3h
edge-stack-admin   ClusterIP      10.245.68.14   <none>           8877/TCP,8005/TCP            3d3h
edge-stack-redis   ClusterIP      10.245.9.81    <none>           6379/TCP                     3d3h
```
Next, please perform a `port-forward` to inspect the metrics:

```shell
kubectl port-forward svc/edge-stack-admin 8877:8877 -n ambassador
```

The exposed metrics can be visualized using a web browser on localhost, or using `curl`:

```shell
curl -s http://localhost:8877/metrics
```
The output looks similar to the following:

```text
# TYPE envoy_cluster_assignment_stale counter
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_127_0_0_1_8500_ambassador"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_127_0_0_1_8877_ambassador"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_echo_backend_ambassador"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_extauth_127_0_0_1_8500_ambassador"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_quote_backend_ambassador"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="cluster_quote_default_default"} 0
envoy_cluster_assignment_stale{envoy_cluster_name="xds_cluster"} 0
Next, connect `Prometheus` to the `Ambassador` metrics service. There are several ways of doing this:

- `<static_config>` - allows specifying a list of targets, and a common label set for them.
- `<kubernetes_sd_config>` - allows retrieving scrape targets from the `Kubernetes REST API`, and always staying synchronized with the cluster state.
- `Prometheus Operator` - simplifies `Prometheus` monitoring inside a `Kubernetes` cluster via `CRDs`.

As you can see, there are many ways to tell `Prometheus` to scrape an endpoint, so which one should you pick? The preferred way when targeting a `Kubernetes` cluster is to use the `Prometheus Operator`, which comes bundled with the `Prometheus` monitoring stack.
Next, you will make use of the `ServiceMonitor` CRD exposed by the `Prometheus Operator` to define a new target for monitoring.

Steps required to add the `Ambassador` service for `Prometheus` to monitor:

1. First, change directory (if not already) to where the `Starter Kit` Git repository was cloned:

   ```shell
   cd Kubernetes-Starter-Kit-Developers
   ```

2. Next, open the `04-setup-prometheus-stack/assets/manifests/prom-stack-values-v30.0.1.yaml` file provided in the `Starter Kit` repository, using a text editor of your choice (preferably with `YAML` lint support). Please remove the comments surrounding the `additionalServiceMonitors` section. The output looks similar to:

   ```yaml
   additionalServiceMonitors:
     - name: "ambassador-monitor"
       selector:
         matchLabels:
           service: "ambassador-admin"
       namespaceSelector:
         matchNames:
           - ambassador
       endpoints:
         - port: "ambassador-admin"
   ```

   Explanations for the above configuration:

   - `selector -> matchLabels` - tells `ServiceMonitor` what service to monitor.
   - `namespaceSelector` - here, you want to match the namespace where `Ambassador Edge Stack` was deployed.
   - `endpoints -> port` - references the port of the service to monitor.

3. Finally, apply changes using `Helm`:

   ```shell
   HELM_CHART_VERSION="30.0.1"

   helm upgrade kube-prom-stack prometheus-community/kube-prometheus-stack --version "${HELM_CHART_VERSION}" \
     --namespace monitoring \
     -f "04-setup-prometheus-stack/assets/manifests/prom-stack-values-v${HELM_CHART_VERSION}.yaml"
   ```
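After the upgrade completes, you can verify that the new `ServiceMonitor` resource exists (named `ambassador-monitor`, per the values file above):

```shell
kubectl get servicemonitors -n monitoring
```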
Next, please check if the `Ambassador` target is added to `Prometheus` for scraping. Create a port forward for `Prometheus` on port `9090`:

```shell
kubectl port-forward svc/kube-prom-stack-kube-prome-prometheus 9090:9090 -n monitoring
```

Then, navigate to the `Status -> Targets` page, and inspect the results (notice the `serviceMonitor/monitoring/ambassador-monitor/0` path):
Note:

There are 3 entries under the discovered target, because the `AES` deployment consists of 3 `Pods`. Verify it via:

```shell
kubectl get deployments -n ambassador
```

The output looks similar to the following (notice the `edge-stack` line):

```text
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
edge-stack         3/3     3            3           7h17m
edge-stack-agent   1/1     1            1           7h17m
edge-stack-redis   1/1     1            1           7h17m
```
In the next step, you'll discover `PromQL`, along with some simple examples to get you started with the language.

In this step, you will learn the basics of the `Prometheus Query Language` (PromQL). `PromQL` helps you perform queries on the various metrics coming from all `Pods` and applications in your `DOKS` cluster.
What is `PromQL`?

It's a `DSL`, or `Domain Specific Language`, that is specifically built for `Prometheus` and allows you to query for metrics. Because it's a `DSL` built upon `Go`, you'll find that `PromQL` has a lot in common with the language. But it's also a `NFL`, or `Nested Functional Language`, where data appears as nested expressions within larger expressions. The outermost, or overall, expression defines the final value, while nested expressions represent values for arguments and operands. For more in-depth explanations, please visit the official PromQL page.
Next, you're going to inspect one of the `Ambassador Edge Stack` metrics, namely `ambassador_edge_stack_promhttp_metric_handler_requests_total`, which represents the total number of `HTTP` requests `Prometheus` performed against the `AES` metrics endpoint.
Steps to follow:

1. First, create a port forward for `Prometheus` on port `9090`:

   ```shell
   kubectl port-forward svc/kube-prom-stack-kube-prome-prometheus 9090:9090 -n monitoring
   ```

2. Next, open the expression browser.

3. In the query input field, paste `ambassador_edge_stack_promhttp_metric_handler_requests_total`, and hit `Enter`. The output looks similar to:

   ```text
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.196:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-k6q4v", service="ambassador-admin"} 21829
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.228:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-8v9nn", service="ambassador-admin"} 21829
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.32:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-rlqwm", service="ambassador-admin"} 21832
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="500", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.196:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-k6q4v", service="ambassador-admin"} 0
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="500", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.228:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-8v9nn", service="ambassador-admin"} 0
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="500", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.32:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-rlqwm", service="ambassador-admin"} 0
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="503", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.196:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-k6q4v", service="ambassador-admin"} 0
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="503", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.228:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-8v9nn", service="ambassador-admin"} 0
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="503", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.32:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-rlqwm", service="ambassador-admin"} 0
   ```

4. `PromQL` groups similar data in what's called a `vector`. As seen above, each `vector` has a set of `attributes`, which differentiate it from the others. What you can do then, is to group results based on an attribute of interest. For example, if you care only about `HTTP` requests that ended with a `200` response code, then please type the following in the query field:

   ```text
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200"}
   ```

   The output looks similar to (note that it selects only the results that match your criteria):

   ```text
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.196:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-k6q4v", service="ambassador-admin"} 21843
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.228:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-8v9nn", service="ambassador-admin"} 21843
   ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200", container="ambassador", endpoint="ambassador-admin", instance="10.244.0.32:8877", job="ambassador-admin", namespace="ambassador", pod="ambassador-bcb5b8d67-rlqwm", service="ambassador-admin"} 21845
   ```
Note:

The above result shows the total requests for each `Pod` from the `AES` deployment (which consists of `3`, as seen in the `kubectl get deployments -n ambassador` command output). Each `Pod` exposes the same `/metrics` endpoint, and the `Kubernetes` service makes sure that the requests are distributed to each `Pod`. The numbers at the end of each line represent the total `HTTP` requests, and you can see that they are roughly the same: `21843`, `21843`, `21845`. This demonstrates the `Round Robin` method being used by the service.
This is just a very simple introduction to what `PromQL` is, and what it's capable of. But it can do much more than that, like counting metrics, computing the rate over a predefined interval, etc. Please visit the official PromQL page for more features of the language.
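For instance, building on the metric used earlier, the query below computes the per-second rate of `200` responses over the last five minutes (assuming the target is still being scraped):

```text
rate(ambassador_edge_stack_promhttp_metric_handler_requests_total{code="200"}[5m])
```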
In the next step, you will learn how to use `Grafana` to visualize metrics for one of the `Starter Kit` components - the `Ambassador Edge Stack`.
Although `Prometheus` has some built-in support for visualizing data, a better way of doing it is via `Grafana`, which is an open-source platform for monitoring and observability that lets you visualize and explore the state of your systems.

On the official page, it is described as being able to:

> Query, visualize, alert on, and understand your data no matter where it's stored.

Why use `Grafana`?

Because it's the leading open-source monitoring and analytics platform available nowadays for visualizing data coming from a vast number of data sources, including `Prometheus`. It offers some advanced features for organizing the graphs, and it supports real-time testing for queries. Not to mention that you can customize the views and make some beautiful panels which can be rendered on big screens, so you never miss a single data point.
No extra steps are needed for installation, because Step 1 - Installing the Prometheus Stack installed `Grafana` for you. All you have to do is a port forward like below, and you get immediate access to the dashboards (default credentials: `admin/prom-operator`):

```shell
kubectl --namespace monitoring port-forward svc/kube-prom-stack-grafana 3000:80
```
In order to see all the `Ambassador Edge Stack` metrics, you're going to add this well-designed dashboard from the `Grafana` community.

Creating the `Ambassador` dashboard for `Grafana`:

1. First, navigate to the dashboard import section (or hover the mouse on the `+` sign from the left pane, then click `Import`).
2. Next, paste this ID: `4698` in the `Import via grafana.com` field. Then, click `Load`.
3. Finally, select a data source - `Prometheus`, then hit the `Import` button.
The picture down below shows the available options:

Explanations for the above `Dashboard` import window:

- `Name` - the dashboard name (defaults to `Ambassador`).
- `Folder` - the folder name where to store this dashboard (defaults to `General`).
- `Prometheus` - the `Prometheus` instance to use (you have only one in this example).
- `Listener port` - the `Envoy` listener port (defaults to `8080`).
After clicking `Import`, it will create the following dashboard, as seen below:

In the next step, you're going to monitor the number of `API` calls for the `quote` backend service created using the Ambassador Edge Stack Backend Services step, from the `Ambassador Edge Stack Starter Kit` tutorial (or Nginx Backend Services, for `Nginx`).
The graph of interest is `API Response Codes`.

If you call the service `2` times, you will see `4` responses being plotted. This is normal behavior, because the `API Gateway` (from the `Ambassador Edge Stack`) is hit first, and then the real service. The same thing happens when a reply is sent back, so we have a total of `2 + 2 = 4` responses being plotted in the `API Response Codes` graph.

The `CLI` command used for testing the above scenario is:

```shell
curl -Lk https://quote.starter-kit.online/quote/
```
The output looks similar to the following:

```json
{
    "server": "buoyant-pear-girnlk37",
    "quote": "A small mercy is nothing at all?",
    "time": "2021-08-11T18:18:56.654108372Z"
}
```
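To make the graph easier to spot, you can generate a small burst of traffic; a simple shell loop like the sketch below works (reusing the URL from the example above):

```shell
# Send 20 requests, one per second, discarding the response bodies
for i in $(seq 1 20); do
  curl -sLk https://quote.starter-kit.online/quote/ > /dev/null
  sleep 1
done
```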
You can play around and add more panels in `Grafana` for visualizing other data sources, as well as group them together based on scope. Also, you can explore the available dashboards for `Kubernetes` from the Grafana kube-mixin project.

In the next step, you will configure persistent storage for `Prometheus` using `DigitalOcean` block storage, to persist your `DOKS` and application metrics across server restarts or cluster failures.
In this step, you will learn how to enable persistent storage for `Prometheus`, so that metrics data is persisted across server restarts, or in case of cluster failures. You will define a `5 Gi Persistent Volume Claim` (PVC), using the `DigitalOcean Block Storage`. Later on, a quick and easy guide is provided on how to plan the size of your `PVC`, to suit your monitoring storage needs. To learn more about `PVCs`, please consult the Persistent Volumes page from the official `Kubernetes` documentation.
Steps to follow:

1. First, check what storage class is available - you need one in order to proceed:

   ```shell
   kubectl get storageclass
   ```

   The output should look similar to (notice that `DigitalOcean Block Storage` is available for you to use):

   ```text
   NAME                         PROVISIONER                 RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
   do-block-storage (default)   dobs.csi.digitalocean.com   Delete          Immediate           true                   4d2h
   ```

2. Next, change directory (if not already) to where the `Starter Kit` Git repository was cloned:

   ```shell
   cd Kubernetes-Starter-Kit-Developers
   ```

3. Then, open the `04-setup-prometheus-stack/assets/manifests/prom-stack-values-v30.0.1.yaml` file provided in the `Starter Kit` repository, using a text editor of your choice (preferably with `YAML` lint support). Search for the `storageSpec` line, and uncomment the required section for `Prometheus`. The `storageSpec` definition should look like:

   ```yaml
   prometheusSpec:
     storageSpec:
       volumeClaimTemplate:
         spec:
           storageClassName: do-block-storage
           accessModes: ["ReadWriteOnce"]
           resources:
             requests:
               storage: 5Gi
   ```

   Explanations for the above configuration:

   - `volumeClaimTemplate` - defines a new `PVC`.
   - `storageClassName` - defines the storage class (should use the same value as from the `kubectl get storageclass` command output).
   - `resources` - sets the storage requests value - in this case, a total capacity of `5 Gi` is requested for the new volume.

4. Finally, apply settings using `Helm`:

   ```shell
   HELM_CHART_VERSION="30.0.1"

   helm upgrade kube-prom-stack prometheus-community/kube-prometheus-stack --version "${HELM_CHART_VERSION}" \
     --namespace monitoring \
     -f "04-setup-prometheus-stack/assets/manifests/prom-stack-values-v${HELM_CHART_VERSION}.yaml"
   ```
After completing the above steps, check the `PVC` status:

```shell
kubectl get pvc -n monitoring
```

The output looks similar to (the `STATUS` column should display `Bound`):

```text
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
kube-prome-prometheus-0   Bound    pvc-768d85ff-17e7-4043-9aea-4929df6a35f4   5Gi        RWO            do-block-storage   4d2h
```

A new `Volume` should appear in the Volumes web page, from your `DigitalOcean` account panel:
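Alternatively, if you have `doctl` configured, you can list the volumes from the command line as well:

```shell
doctl compute volume list
```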
In this step, you will learn how to enable persistent storage for `Grafana`, so that the graphs are persisted across server restarts, or in case of cluster failures. You will define a `5 Gi Persistent Volume Claim` (PVC), using the `DigitalOcean Block Storage`. The next steps are the same as Step 5 - Configuring Persistent Storage for Prometheus.

First, open the `04-setup-prometheus-stack/assets/manifests/prom-stack-values-v30.0.1.yaml` file provided in the `Starter Kit` repository, using a text editor of your choice (preferably with `YAML` lint support). The persistence storage section for `grafana` should look like:

```yaml
grafana:
  ...
  persistence:
    enabled: true
    storageClassName: do-block-storage
    accessModes: ["ReadWriteOnce"]
    size: 5Gi
```
Next, apply settings using `Helm`:

```shell
HELM_CHART_VERSION="30.0.1"

helm upgrade kube-prom-stack prometheus-community/kube-prometheus-stack --version "${HELM_CHART_VERSION}" \
  --namespace monitoring \
  -f "04-setup-prometheus-stack/assets/manifests/prom-stack-values-v${HELM_CHART_VERSION}.yaml"
```
After completing the above steps, check the `PVC` status:

```shell
kubectl get pvc -n monitoring
```

The output looks similar to (the `STATUS` column should display `Bound`):

```text
NAME                      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS       AGE
kube-prom-stack-grafana   Bound    pvc-768d85ff-17e7-4043-9aea-4929df6a35f4   5Gi        RWO            do-block-storage   4d2h
```

A new `Volume` should appear in the Volumes web page, from your `DigitalOcean` account panel:
In order to compute the size needed for the volume based on your needs, please follow the official documentation advice and formula:

> `Prometheus` stores an average of only `1-2 bytes` per sample. Thus, to plan the capacity of a `Prometheus` server, you can use the rough formula:
>
> `needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample`
>
> To lower the rate of ingested samples, you can either reduce the number of time series you scrape (fewer targets or fewer series per target), or you can increase the scrape interval. However, reducing the number of series is likely more effective, due to compression of samples within a series.
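As a quick worked example (all numbers are made up for illustration): with `15` days of retention, `10,000` samples ingested per second, and `2` bytes per sample, the formula gives roughly `26 GB`:

```shell
# 15 days retention * 10,000 samples/s * 2 bytes/sample
echo $(( 15 * 24 * 3600 * 10000 * 2 ))   # 25920000000 bytes, i.e. ~26 GB
```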
Based on our findings, a `5GB` volume is enough for basic needs, like in the case of small development environments (as well as to complete the `Starter Kit` tutorial). If `5GB` is not enough over time (depending on your use case), you need to adjust it based on the volume of metrics ingested and the retention time needed, using the above-mentioned formula.
Please follow the Operational Aspects section, for more details on the subject.
In this tutorial, you learned how to install and configure the `Prometheus` stack, then used `Grafana` to install new dashboards and visualize `DOKS` cluster application metrics. You also learned how to perform metric queries using `PromQL`. Finally, you configured and enabled persistent storage for `Prometheus` to use, to store your cluster metrics.
Next, you will learn about application logs collection and aggregation via `Loki`, to help you troubleshoot running `Kubernetes` cluster applications in case something goes wrong.