diff --git a/assets/prometheus-federator/prometheus-federator-0.4.2.tgz b/assets/prometheus-federator/prometheus-federator-0.4.2.tgz new file mode 100644 index 00000000..c150be7c Binary files /dev/null and b/assets/prometheus-federator/prometheus-federator-0.4.2.tgz differ diff --git a/assets/rancher-project-monitoring/rancher-project-monitoring-0.4.2.tgz b/assets/rancher-project-monitoring/rancher-project-monitoring-0.4.2.tgz new file mode 100644 index 00000000..f7c51c86 Binary files /dev/null and b/assets/rancher-project-monitoring/rancher-project-monitoring-0.4.2.tgz differ diff --git a/charts/prometheus-federator/0.4.2/Chart.yaml b/charts/prometheus-federator/0.4.2/Chart.yaml new file mode 100644 index 00000000..b7f2297a --- /dev/null +++ b/charts/prometheus-federator/0.4.2/Chart.yaml @@ -0,0 +1,19 @@ +annotations: + catalog.cattle.io/certified: rancher + catalog.cattle.io/display-name: Prometheus Federator + catalog.cattle.io/namespace: cattle-monitoring-system + catalog.cattle.io/os: linux,windows + catalog.cattle.io/permits-os: linux,windows + catalog.cattle.io/provides-gvr: helm.cattle.io.projecthelmchart/v1alpha1 + catalog.cattle.io/release-name: prometheus-federator +apiVersion: v2 +appVersion: 0.3.5 +dependencies: +- condition: helmProjectOperator.enabled + name: helmProjectOperator + repository: file://./charts/helmProjectOperator + version: 0.2.1 +description: Prometheus Federator +icon: https://raw.githubusercontent.com/rancher/prometheus-federator/main/assets/logos/prometheus-federator.svg +name: prometheus-federator +version: 0.4.2 diff --git a/charts/prometheus-federator/0.4.2/README.md b/charts/prometheus-federator/0.4.2/README.md new file mode 100644 index 00000000..7da4edfc --- /dev/null +++ b/charts/prometheus-federator/0.4.2/README.md @@ -0,0 +1,120 @@ +# Prometheus Federator + +This chart is deploys a Helm Project Operator (based on the [rancher/helm-project-operator](https://github.com/rancher/helm-project-operator)), an operator that manages deploying Helm charts each containing a Project Monitoring Stack, where each stack contains: +- [Prometheus](https://prometheus.io/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Grafana](https://github.com/helm/charts/tree/master/stable/grafana) (deployed via an embedded Helm chart) +- Default PrometheusRules and Grafana dashboards based on the collection of community-curated resources from [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus/) +- Default ServiceMonitors that watch the deployed resources + +> **Important Note: Prometheus Federator is designed to be deployed alongside an existing Prometheus Operator deployment in a cluster that has already installed the Prometheus Operator CRDs.** + +By default, the chart is configured and intended to be deployed alongside [rancher-monitoring](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/), which deploys Prometheus Operator alongside a Cluster Prometheus that each Project Monitoring Stack is configured to federate namespace-scoped metrics from by default. + +## Pre-Installation: Using Prometheus Federator with Rancher and rancher-monitoring + +If you are running your cluster on [Rancher](https://rancher.com/) and already have [rancher-monitoring](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/) deployed onto your cluster, Prometheus Federator's default configuration should already be configured to work with your existing Cluster Monitoring Stack; however, here are some notes on how we recommend you configure rancher-monitoring to optimize the security and usability of Prometheus Federator in your cluster: + +### Ensure the cattle-monitoring-system namespace is placed into the System Project (or a similarly locked down Project that has access to other Projects in the cluster) + +Prometheus Operator's security model expects that the namespace it is deployed into (`cattle-monitoring-system`) has limited access for anyone except Cluster Admins to avoid privilege escalation via execing into Pods (such as the Jobs executing Helm operations). In addition, deploying Prometheus Federator and all Project Prometheus stacks into the System Project ensures that the each Project Prometheus is able to reach out to scrape workloads across all Projects (even if Network Policies are defined via Project Network Isolation) but has limited access for Project Owners, Project Members, and other users to be able to access data they shouldn't have access to (i.e. being allowed to exec into pods, set up the ability to scrape namespaces outside of a given Project, etc.). + +### Configure rancher-monitoring to only watch for resources created by the Helm chart itself + +Since each Project Monitoring Stack will watch the other namespaces and collect additional custom workload metrics or dashboards already, it's recommended to configure the following settings on all selectors to ensure that the Cluster Prometheus Stack only monitors resources created by the Helm Chart itself: + +``` +matchLabels: + release: "rancher-monitoring" +``` + +The following selector fields are recommended to have this value: +- `.Values.alertmanager.alertmanagerSpec.alertmanagerConfigSelector` +- `.Values.prometheus.prometheusSpec.serviceMonitorSelector` +- `.Values.prometheus.prometheusSpec.podMonitorSelector` +- `.Values.prometheus.prometheusSpec.ruleSelector` +- `.Values.prometheus.prometheusSpec.probeSelector` + +Once this setting is turned on, you can always create ServiceMonitors or PodMonitors that are picked up by the Cluster Prometheus by adding the label `release: "rancher-monitoring"` to them (in which case they will be ignored by Project Monitoring Stacks automatically by default, even if the namespace in which those ServiceMonitors or PodMonitors reside in are not system namespaces). + +> Note: If you don't want to allow users to be able to create ServiceMonitors and PodMonitors that aggregate into the Cluster Prometheus in Project namespaces, you can additionally set the namespaceSelectors on the chart to only target system namespaces (which must contain `cattle-monitoring-system` and `cattle-dashboards`, where resources are deployed into by default by rancher-monitoring; you will also need to monitor the `default` namespace to get apiserver metrics or create a custom ServiceMonitor to scrape apiserver metrics from the Service residing in the default namespace) to limit your Cluster Prometheus from picking up other Prometheus Operator CRs; in that case, it would be recommended to turn `.Values.prometheus.prometheusSpec.ignoreNamespaceSelectors=true` to allow you to define ServiceMonitors that can monitor non-system namespaces from within a system namespace. + +In addition, if you modified the default `.Values.grafana.sidecar.*.searchNamespace` values on the Grafana Helm subchart for Monitoring V2, it is also recommended to remove the overrides or ensure that your defaults are scoped to only system namespaces for the following values: +- `.Values.grafana.sidecar.dashboards.searchNamespace` (default `cattle-dashboards`) +- `.Values.grafana.sidecar.datasources.searchNamespace` (default `null`, which means it uses the release namespace `cattle-monitoring-system`) +- `.Values.grafana.sidecar.plugins.searchNamespace` (default `null`, which means it uses the release namespace `cattle-monitoring-system`) +- `.Values.grafana.sidecar.notifiers.searchNamespace` (default `null`, which means it uses the release namespace `cattle-monitoring-system`) + +### Increase the CPU / memory limits of the Cluster Prometheus + +Depending on a cluster's setup, it's generally recommended to give a large amount of dedicated memory to the Cluster Prometheus to avoid restarts due to out-of-memory errors (OOMKilled), usually caused by churn created in the cluster that causes a large number of high cardinality metrics to be generated and ingested by Prometheus within one block of time; this is one of the reasons why the default Rancher Monitoring stack expects around 4GB of RAM to be able to operate in a normal-sized cluster. However, when introducing Project Monitoring Stacks that are all sending `/federate` requests to the same Cluster Prometheus and are reliant on the Cluster Prometheus being "up" to federate that system data on their namespaces, it's even more important that the Cluster Prometheus has an ample amount of CPU / memory assigned to it to prevent an outage that can cause data gaps across all Project Prometheis in the cluster. + +> Note: There are no specific recommendations on how much memory the Cluster Prometheus should be configured with since it depends entirely on the user's setup (namely the likelihood of encountering a high churn rate and the scale of metrics that could be generated at that time); it generally varies per setup. + +## How does the operator work? + +1. On deploying this chart, users can create ProjectHelmCharts CRs with `spec.helmApiVersion` set to `monitoring.cattle.io/v1alpha1` (also known as "Project Monitors" in the Rancher UI) in a **Project Registration Namespace (`cattle-project-`)**. +2. On seeing each ProjectHelmChartCR, the operator will automatically deploy a Project Prometheus stack on the Project Owner's behalf in the **Project Release Namespace (`cattle-project--monitoring`)** based on a HelmChart CR and a HelmRelease CR automatically created by the ProjectHelmChart controller in the **Operator / System Namespace**. +3. RBAC will automatically be assigned in the Project Release Namespace to allow users to view the Prometheus, Alertmanager, and Grafana UIs of the Project Monitoring Stack deployed; this will be based on RBAC defined on the Project Registration Namespace against the [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) (see below for more information about configuring RBAC). + +### What is a Project? + +In Prometheus Federator, a Project is a group of namespaces that can be identified by a `metav1.LabelSelector`; by default, the label used to identify projects is `field.cattle.io/projectId`, the label used to identify namespaces that are contained within a given [Rancher](https://rancher.com/) Project. + +### Configuring the Helm release created by a ProjectHelmChart + +The `spec.values` of this ProjectHelmChart resources will correspond to the `values.yaml` override to be supplied to the underlying Helm chart deployed by the operator on the user's behalf; to see the underlying chart's `values.yaml` spec, either: +- View to the chart's definition located at [`rancher/prometheus-federator` under `charts/rancher-project-monitoring`](https://github.com/rancher/prometheus-federator/blob/main/charts/rancher-project-monitoring) (where the chart version will be tied to the version of this operator) +- Look for the ConfigMap named `monitoring.cattle.io.v1alpha1` that is automatically created in each Project Registration Namespace, which will contain both the `values.yaml` and `questions.yaml` that was used to configure the chart (which was embedded directly into the `prometheus-federator` binary). + +### Namespaces + +As a Project Operator based on [rancher/helm-project-operator](https://github.com/rancher/helm-project-operator), Prometheus Federator has three different classifications of namespaces that the operator looks out for: +1. **Operator / System Namespace**: this is the namespace that the operator is deployed into (e.g. `cattle-monitoring-system`). This namespace will contain all HelmCharts and HelmReleases for all ProjectHelmCharts watched by this operator. **Only Cluster Admins should have access to this namespace.** +2. **Project Registration Namespace (`cattle-project-`)**: this is the set of namespaces that the operator watches for ProjectHelmCharts within. The RoleBindings and ClusterRoleBindings that apply to this namespace will also be the source of truth for the auto-assigned RBAC created in the Project Release Namespace (see more details below). **Project Owners (admin), Project Members (edit), and Read-Only Members (view) should have access to this namespace**. +> Note: Project Registration Namespaces will be auto-generated by the operator and imported into the Project it is tied to if `.Values.global.cattle.projectLabel` is provided (which is set to `field.cattle.io/projectId` by default); this indicates that a Project Registration Namespace should be created by the operator if at least one namespace is observed with that label. The operator will not let these namespaces be deleted unless either all namespaces with that label are gone (e.g. this is the last namespace in that project, in which case the namespace will be marked with the label `"helm.cattle.io/helm-project-operator-orphaned": "true"`, which signals that it can be deleted) or it is no longer watching that project (because the project ID was provided under `.Values.helmProjectOperator.otherSystemProjectLabelValues`, which serves as a denylist for Projects). These namespaces will also never be auto-deleted to avoid destroying user data; it is recommended that users clean up these namespaces manually if desired on creating or deleting a project +> Note: if `.Values.global.cattle.projectLabel` is not provided, the Operator / System Namespace will also be the Project Registration Namespace +3. **Project Release Namespace (`cattle-project--monitoring`)**: this is the set of namespaces that the operator deploys Project Monitoring Stacks within on behalf of a ProjectHelmChart; the operator will also automatically assign RBAC to Roles created in this namespace by the Project Monitoring Stack based on bindings found in the Project Registration Namespace. **Only Cluster Admins should have access to this namespace; Project Owners (admin), Project Members (edit), and Read-Only Members (view) will be assigned limited access to this namespace by the deployed Helm Chart and Prometheus Federator.** +> Note: Project Release Namespaces are automatically deployed and imported into the project whose ID is specified under `.Values.helmProjectOperator.projectReleaseNamespaces.labelValue` (which defaults to the value of `.Values.global.cattle.systemProjectId` if not specified) whenever a ProjectHelmChart is specified in a Project Registration Namespace +> Note: Project Release Namespaces follow the same orphaning conventions as Project Registration Namespaces (see note above) +> Note: if `.Values.projectReleaseNamespaces.enabled` is false, the Project Release Namespace will be the same as the Project Registration Namespace + +### Helm Resources (HelmChart, HelmRelease) + +On deploying a ProjectHelmChart, the Prometheus Federator will automatically create and manage two child custom resources that manage the underlying Helm resources in turn: +- A HelmChart CR (managed via an embedded [k3s-io/helm-contoller](https://github.com/k3s-io/helm-controller) in the operator): this custom resource automatically creates a Job in the same namespace that triggers a `helm install`, `helm upgrade`, or `helm uninstall` depending on the change applied to the HelmChart CR; this CR is automatically updated on changes to the ProjectHelmChart (e.g. modifying the values.yaml) or changes to the underlying Project definition (e.g. adding or removing namespaces from a project). +> **Important Note: If a ProjectHelmChart is not deploying or updating the underlying Project Monitoring Stack for some reason, the Job created by this resource in the Operator / System namespace should be the first place you check to see if there's something wrong with the Helm operation; however, this is generally only accessible by a Cluster Admin.** +- A HelmRelease CR (managed via an embedded [rancher/helm-locker](https://github.com/rancher/helm-locker) in the operator): this custom resource automatically locks a deployed Helm release in place and automatically overwrites updates to underlying resources unless the change happens via a Helm operation (`helm install`, `helm upgrade`, or `helm uninstall` performed by the HelmChart CR). +> Note: HelmRelease CRs emit Kubernetes Events that detect when an underlying Helm release is being modified and locks it back to place; to view these events, you can use `kubectl describe helmrelease -n `; you can also view the logs on this operator to see when changes are detected and which resources were attempted to be modified + +Both of these resources are created for all Helm charts in the Operator / System namespaces to avoid escalation of privileges to underprivileged users. + +### RBAC + +As described in the section on namespaces above, Prometheus Federator expects that Project Owners, Project Members, and other users in the cluster with Project-level permissions (e.g. permissions in a certain set of namespaces identified by a single label selector) have minimal permissions in any namespaces except the Project Registration Namespace (which is imported into the project by default) and those that already comprise their projects. Therefore, in order to allow Project Owners to assign specific chart permissions to other users in their Project namespaces, the Helm Project Operator will automatically watch the following bindings: +- ClusterRoleBindings +- RoleBindings in the Project Release Namespace + +On observing a change to one of those types of bindings, the Helm Project Operator will check whether the `roleRef` that the the binding points to matches a ClusterRole with the name provided under `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.admin`, `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.edit`, or `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.view`; by default, these roleRefs correspond will correspond to `admin`, `edit`, and `view` respectively, which are the [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles). + +> Note: for Rancher RBAC users, these [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) directly correlate to the `Project Owner`, `Project Member`, and `Read-Only` default Project Role Templates. + +If the `roleRef` matches, the Helm Project Operator will filter the `subjects` of the binding for all Users and Groups and use that to automatically construct a RoleBinding for each Role in the Project Release Namespace with the same name as the role and the following labels: +- `helm.cattle.io/project-helm-chart-role: {{ .Release.Name }}` +- `helm.cattle.io/project-helm-chart-role-aggregate-from: ` + +By default, the `rancher-project-monitoring` (the underlying chart deployed by Prometheus Federator) creates three default Roles per Project Release Namespace that provide `admin`, `edit`, and `view` users to permissions to view the Prometheus, Alertmanager, and Grafana UIs of the Project Monitoring Stack to provide least privilege; however, if a Cluster Admin would like to assign additional permissions to certain users, they can either directly assign RoleBindings in the Project Release Namespace to certain users or created Roles with the above two labels on them to allow Project Owners to control assigning those RBAC roles to users in their Project Registration namespaces. + +### Advanced Helm Project Operator Configuration + +|Value|Configuration| +|---|---------------------------| +|`helmProjectOperator.valuesOverride`| Allows an Operator to override values that are set on each ProjectHelmChart deployment on an operator-level; user-provided options (specified on the `spec.values` of the ProjectHelmChart) are automatically overridden if operator-level values are provided. For an exmaple, see how the default value overrides `federate.targets` (note: when overriding list values like `federate.targets`, user-provided list values will **not** be concatenated) | +|`helmProjectOperator.projectReleaseNamespaces.labelValues`| The value of the Project that all Project Release Namespaces should be auto-imported into (via label and annotation). Not recommended to be overridden on a Rancher setup. | +|`helmProjectOperator.otherSystemProjectLabelValues`| Other namespaces that the operator should treat as a system namespace that should not be monitored. By default, all namespaces that match `global.cattle.systemProjectId` will not be matched. `cattle-monitoring-system`, `cattle-dashboards`, and `kube-system` are explicitly marked as system namespaces as well, regardless of label or annotation. | +|`helmProjectOperator.releaseRoleBindings.aggregate`| Whether to automatically create RBAC resources in Project Release namespaces +|`helmProjectOperator.releaseRoleBindings.clusterRoleRefs.`| ClusterRoles to reference to discover subjects to create RoleBindings for in the Project Release Namespace for all corresponding Project Release Roles. See RBAC above for more information | +|`helmProjectOperator.hardenedNamespaces.enabled`| Whether to automatically patch the default ServiceAccount with `automountServiceAccountToken: false` and create a default NetworkPolicy in all managed namespaces in the cluster; the default values ensure that the creation of the namespace does not break a CIS 1.16 hardened scan | +|`helmProjectOperator.hardenedNamespaces.configuration`| The configuration to be supplied to the default ServiceAccount or auto-generated NetworkPolicy on managing a namespace | +|`helmProjectOperator.helmController.enabled`| Whether to enable an embedded k3s-io/helm-controller instance within the Helm Project Operator. Should be disabled for RKE2/K3s clusters before v1.23.14 / v1.24.8 / v1.25.4 since RKE2/K3s clusters already run Helm Controller at a cluster-wide level to manage internal Kubernetes components | +|`helmProjectOperator.helmLocker.enabled`| Whether to enable an embedded rancher/helm-locker instance within the Helm Project Operator. | diff --git a/charts/prometheus-federator/0.4.2/app-README.md b/charts/prometheus-federator/0.4.2/app-README.md new file mode 100644 index 00000000..99fa7ca1 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/app-README.md @@ -0,0 +1,27 @@ +# Prometheus Federator + +This chart deploys an operator that manages Project Monitoring Stacks composed of the following set of resources that are scoped to project namespaces: +- [Prometheus](https://prometheus.io/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Grafana](https://github.com/helm/charts/tree/master/stable/grafana) (deployed via an embedded Helm chart) +- Default PrometheusRules and Grafana dashboards based on the collection of community-curated resources from [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus/) +- Default ServiceMonitors that watch the deployed Prometheus, Grafana, and Alertmanager + +Since this Project Monitoring Stack deploys Prometheus Operator CRs, an existing Prometheus Operator instance must already be deployed in the cluster for Prometheus Federator to successfully be able to deploy Project Monitoring Stacks. It is recommended to use [`rancher-monitoring`](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/) for this. For more information on how the chart works or advanced configurations, please read the `README.md`. + +## Upgrading to Kubernetes v1.25+ + +Starting in Kubernetes v1.25, [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy/) have been removed from the Kubernetes API. + +As a result, **before upgrading to Kubernetes v1.25** (or on a fresh install in a Kubernetes v1.25+ cluster), users are expected to perform an in-place upgrade of this chart with `global.cattle.psp.enabled` set to `false` if it has been previously set to `true`. +​ +> **Note:** +> In this chart release, any previous field that was associated with any PSP resources have been removed in favor of a single global field: `global.cattle.psp.enabled`. + ​ +> **Note:** +> If you upgrade your cluster to Kubernetes v1.25+ before removing PSPs via a `helm upgrade` (even if you manually clean up resources), **it will leave the Helm release in a broken state within the cluster such that further Helm operations will not work (`helm uninstall`, `helm upgrade`, etc.).** +> +> If your charts get stuck in this state, please consult the Rancher docs on how to clean up your Helm release secrets. +Upon setting `global.cattle.psp.enabled` to false, the chart will remove any PSP resources deployed on its behalf from the cluster. This is the default setting for this chart. +​ +As a replacement for PSPs, [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) should be used. Please consult the Rancher docs for more details on how to configure your chart release namespaces to work with the new Pod Security Admission and apply Pod Security Standards. \ No newline at end of file diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/Chart.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/Chart.yaml new file mode 100644 index 00000000..c4d14e1d --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/Chart.yaml @@ -0,0 +1,15 @@ +annotations: + catalog.cattle.io/certified: rancher + catalog.cattle.io/display-name: Helm Project Operator + catalog.cattle.io/kube-version: '>=1.16.0-0' + catalog.cattle.io/namespace: cattle-helm-system + catalog.cattle.io/os: linux,windows + catalog.cattle.io/permits-os: linux,windows + catalog.cattle.io/provides-gvr: helm.cattle.io.projecthelmchart/v1alpha1 + catalog.cattle.io/rancher-version: '>= 2.6.0-0' + catalog.cattle.io/release-name: helm-project-operator +apiVersion: v2 +appVersion: 0.2.1 +description: Helm Project Operator +name: helmProjectOperator +version: 0.2.1 diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/README.md b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/README.md new file mode 100644 index 00000000..fc1d39e8 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/README.md @@ -0,0 +1,77 @@ +# Helm Project Operator + +## How does the operator work? + +1. On deploying a Helm Project Operator, users can create ProjectHelmCharts CRs with `spec.helmApiVersion` set to `dummy.cattle.io/v1alpha1` in a **Project Registration Namespace (`cattle-project-`)**. +2. On seeing each ProjectHelmChartCR, the operator will automatically deploy the embedded Helm chart on the Project Owner's behalf in the **Project Release Namespace (`cattle-project--dummy`)** based on a HelmChart CR and a HelmRelease CR automatically created by the ProjectHelmChart controller in the **Operator / System Namespace**. +3. RBAC will automatically be assigned in the Project Release Namespace to allow users to based on Role created in the Project Release Namespace with a given set of labels; this will be based on RBAC defined on the Project Registration Namespace against the [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) (see below for more information about configuring RBAC). + +### What is a Project? + +In Helm Project Operator, a Project is a group of namespaces that can be identified by a `metav1.LabelSelector`; by default, the label used to identify projects is `field.cattle.io/projectId`, the label used to identify namespaces that are contained within a given [Rancher](https://rancher.com/) Project. + +### What is a ProjectHelmChart? + +A ProjectHelmChart is an instance of a (project-scoped) Helm chart deployed on behalf of a user who has permissions to create ProjectHelmChart resources in a Project Registration namespace. + +Generally, the best way to think about the ProjectHelmChart model is by comparing it to two other models: +1. Managed Kubernetes providers (EKS, GKE, AKS, etc.): in this model, a user has the ability to say "I want a Kubernetes cluster" but the underlying cloud provider is responsible for provisioning the infrastructure and offering **limited view and access** of the underlying resources created on their behalf; similarly, Helm Project Operator allows a Project Owner to say "I want this Helm chart deployed", but the underlying Operator is responsible for "provisioning" (deploying) the Helm chart and offering **limited view and access** of the underlying Kubernetes resources created on their behalf (based on configuring "least-privilege" Kubernetes RBAC for the Project Owners / Members in the newly created Project Release Namespace). +2. Dynamically-provisioned Persistent Volumes: in this model, a single resource (PersistentVolume) exists that allows you to specify a Storage Class that actually implements provisioning the underlying storage via a Storage Class Provisioner (e.g. Longhorn). Similarly, the ProjectHelmChart exists that allows you to specify a `spec.helmApiVersion` ("storage class") that actually implements deploying the underlying Helm chart via a Helm Project Operator (e.g. [`rancher/prometheus-federator`](https://github.com/rancher/prometheus-federator)). + +### Configuring the Helm release created by a ProjectHelmChart + +The `spec.values` of this ProjectHelmChart resources will correspond to the `values.yaml` override to be supplied to the underlying Helm chart deployed by the operator on the user's behalf; to see the underlying chart's `values.yaml` spec, either: +- View to the chart's definition located at [`rancher/helm-project-operator` under `charts/example-chart`](https://github.com/rancher/helm-project-operator/blob/main/charts/example-chart) (where the chart version will be tied to the version of this operator) +- Look for the ConfigMap named `dummy.cattle.io.v1alpha1` that is automatically created in each Project Registration Namespace, which will contain both the `values.yaml` and `questions.yaml` that was used to configure the chart (which was embedded directly into the `helm-project-operator` binary). + +### Namespaces + +All Helm Project Operators have three different classifications of namespaces that the operator looks out for: +1. **Operator / System Namespace**: this is the namespace that the operator is deployed into (e.g. `cattle-helm-system`). This namespace will contain all HelmCharts and HelmReleases for all ProjectHelmCharts watched by this operator. **Only Cluster Admins should have access to this namespace.** +2. **Project Registration Namespace (`cattle-project-`)**: this is the set of namespaces that the operator watches for ProjectHelmCharts within. The RoleBindings and ClusterRoleBindings that apply to this namespace will also be the source of truth for the auto-assigned RBAC created in the Project Release Namespace (see more details below). **Project Owners (admin), Project Members (edit), and Read-Only Members (view) should have access to this namespace**. +> Note: Project Registration Namespaces will be auto-generated by the operator and imported into the Project it is tied to if `.Values.global.cattle.projectLabel` is provided (which is set to `field.cattle.io/projectId` by default); this indicates that a Project Registration Namespace should be created by the operator if at least one namespace is observed with that label. The operator will not let these namespaces be deleted unless either all namespaces with that label are gone (e.g. this is the last namespace in that project, in which case the namespace will be marked with the label `"helm.cattle.io/helm-project-operator-orphaned": "true"`, which signals that it can be deleted) or it is no longer watching that project (because the project ID was provided under `.Values.helmProjectOperator.otherSystemProjectLabelValues`, which serves as a denylist for Projects). These namespaces will also never be auto-deleted to avoid destroying user data; it is recommended that users clean up these namespaces manually if desired on creating or deleting a project +> Note: if `.Values.global.cattle.projectLabel` is not provided, the Operator / System Namespace will also be the Project Registration Namespace +3. **Project Release Namespace (`cattle-project--dummy`)**: this is the set of namespaces that the operator deploys Helm charts within on behalf of a ProjectHelmChart; the operator will also automatically assign RBAC to Roles created in this namespace by the Helm charts based on bindings found in the Project Registration Namespace. **Only Cluster Admins should have access to this namespace; Project Owners (admin), Project Members (edit), and Read-Only Members (view) will be assigned limited access to this namespace by the deployed Helm Chart and Helm Project Operator.** +> Note: Project Release Namespaces are automatically deployed and imported into the project whose ID is specified under `.Values.helmProjectOperator.projectReleaseNamespaces.labelValue` (which defaults to the value of `.Values.global.cattle.systemProjectId` if not specified) whenever a ProjectHelmChart is specified in a Project Registration Namespace +> Note: Project Release Namespaces follow the same orphaning conventions as Project Registration Namespaces (see note above) +> Note: if `.Values.projectReleaseNamespaces.enabled` is false, the Project Release Namespace will be the same as the Project Registration Namespace + +### Helm Resources (HelmChart, HelmRelease) + +On deploying a ProjectHelmChart, the Helm Project Operator will automatically create and manage two child custom resources that manage the underlying Helm resources in turn: +- A HelmChart CR (managed via an embedded [k3s-io/helm-contoller](https://github.com/k3s-io/helm-controller) in the operator): this custom resource automatically creates a Job in the same namespace that triggers a `helm install`, `helm upgrade`, or `helm uninstall` depending on the change applied to the HelmChart CR; this CR is automatically updated on changes to the ProjectHelmChart (e.g. modifying the values.yaml) or changes to the underlying Project definition (e.g. adding or removing namespaces from a project). +> **Important Note: If a ProjectHelmChart is not deploying or updating the underlying Project Monitoring Stack for some reason, the Job created by this resource in the Operator / System namespace should be the first place you check to see if there's something wrong with the Helm operation; however, this is generally only accessible by a Cluster Admin.** +- A HelmRelease CR (managed via an embedded [rancher/helm-locker](https://github.com/rancher/helm-locker) in the operator): this custom resource automatically locks a deployed Helm release in place and automatically overwrites updates to underlying resources unless the change happens via a Helm operation (`helm install`, `helm upgrade`, or `helm uninstall` performed by the HelmChart CR). +> Note: HelmRelease CRs emit Kubernetes Events that detect when an underlying Helm release is being modified and locks it back to place; to view these events, you can use `kubectl describe helmrelease -n `; you can also view the logs on this operator to see when changes are detected and which resources were attempted to be modified + +Both of these resources are created for all Helm charts in the Operator / System namespaces to avoid escalation of privileges to underprivileged users. + +### RBAC + +As described in the section on namespaces above, Helm Project Operator expects that Project Owners, Project Members, and other users in the cluster with Project-level permissions (e.g. permissions in a certain set of namespaces identified by a single label selector) have minimal permissions in any namespaces except the Project Registration Namespace (which is imported into the project by default) and those that already comprise their projects. Therefore, in order to allow Project Owners to assign specific chart permissions to other users in their Project namespaces, the Helm Project Operator will automatically watch the following bindings: +- ClusterRoleBindings +- RoleBindings in the Project Release Namespace + +On observing a change to one of those types of bindings, the Helm Project Operator will check whether the `roleRef` that the the binding points to matches a ClusterRole with the name provided under `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.admin`, `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.edit`, or `helmProjectOperator.releaseRoleBindings.clusterRoleRefs.view`; by default, these roleRefs correspond will correspond to `admin`, `edit`, and `view` respectively, which are the [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles). + +> Note: for Rancher RBAC users, these [default Kubernetes user-facing roles](https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles) directly correlate to the `Project Owner`, `Project Member`, and `Read-Only` default Project Role Templates. + +If the `roleRef` matches, the Helm Project Operator will filter the `subjects` of the binding for all Users and Groups and use that to automatically construct a RoleBinding for each Role in the Project Release Namespace with the same name as the role and the following labels: +- `helm.cattle.io/project-helm-chart-role: {{ .Release.Name }}` +- `helm.cattle.io/project-helm-chart-role-aggregate-from: ` + +By default, the `example-chart` (the underlying chart deployed by Helm Project Operator) does not create any default roles; however, if a Cluster Admin would like to assign additional permissions to certain users, they can either directly assign RoleBindings in the Project Release Namespace to certain users or created Roles with the above two labels on them to allow Project Owners to control assigning those RBAC roles to users in their Project Registration namespaces. + +### Advanced Helm Project Operator Configuration + +|Value|Configuration| +|---|---------------------------| +|`valuesOverride`| Allows an Operator to override values that are set on each ProjectHelmChart deployment on an operator-level; user-provided options (specified on the `spec.values` of the ProjectHelmChart) are automatically overridden if operator-level values are provided. For an exmaple, see how the default value overrides `federate.targets` (note: when overriding list values like `federate.targets`, user-provided list values will **not** be concatenated) | +|`projectReleaseNamespaces.labelValues`| The value of the Project that all Project Release Namespaces should be auto-imported into (via label and annotation). Not recommended to be overridden on a Rancher setup. | +|`otherSystemProjectLabelValues`| Other namespaces that the operator should treat as a system namespace that should not be monitored. By default, all namespaces that match `global.cattle.systemProjectId` will not be matched. `kube-system` is explicitly marked as a system namespace as well, regardless of label or annotation. | +|`releaseRoleBindings.aggregate`| Whether to automatically create RBAC resources in Project Release namespaces +|`releaseRoleBindings.clusterRoleRefs.`| ClusterRoles to reference to discover subjects to create RoleBindings for in the Project Release Namespace for all corresponding Project Release Roles. See RBAC above for more information | +|`hardenedNamespaces.enabled`| Whether to automatically patch the default ServiceAccount with `automountServiceAccountToken: false` and create a default NetworkPolicy in all managed namespaces in the cluster; the default values ensure that the creation of the namespace does not break a CIS 1.16 hardened scan | +|`hardenedNamespaces.configuration`| The configuration to be supplied to the default ServiceAccount or auto-generated NetworkPolicy on managing a namespace | +|`helmController.enabled`| Whether to enable an embedded k3s-io/helm-controller instance within the Helm Project Operator. Should be disabled for RKE2 clusters since RKE2 clusters already run Helm Controller to manage internal Kubernetes components | +|`helmLocker.enabled`| Whether to enable an embedded rancher/helm-locker instance within the Helm Project Operator. | diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/app-readme.md b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/app-readme.md new file mode 100644 index 00000000..fd551467 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/app-readme.md @@ -0,0 +1,20 @@ +# Helm Project Operator + +This chart installs the example [Helm Project Operator](https://github.com/rancher/helm-project-operator) onto your cluster. + +## Upgrading to Kubernetes v1.25+ + +Starting in Kubernetes v1.25, [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy/) have been removed from the Kubernetes API. + +As a result, **before upgrading to Kubernetes v1.25** (or on a fresh install in a Kubernetes v1.25+ cluster), users are expected to perform an in-place upgrade of this chart with `global.cattle.psp.enabled` set to `false` if it has been previously set to `true`. +​ +> **Note:** +> In this chart release, any previous field that was associated with any PSP resources have been removed in favor of a single global field: `global.cattle.psp.enabled`. + ​ +> **Note:** +> If you upgrade your cluster to Kubernetes v1.25+ before removing PSPs via a `helm upgrade` (even if you manually clean up resources), **it will leave the Helm release in a broken state within the cluster such that further Helm operations will not work (`helm uninstall`, `helm upgrade`, etc.).** +> +> If your charts get stuck in this state, please consult the Rancher docs on how to clean up your Helm release secrets. +Upon setting `global.cattle.psp.enabled` to false, the chart will remove any PSP resources deployed on its behalf from the cluster. This is the default setting for this chart. +​ +As a replacement for PSPs, [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) should be used. Please consult the Rancher docs for more details on how to configure your chart release namespaces to work with the new Pod Security Admission and apply Pod Security Standards. \ No newline at end of file diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/questions.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/questions.yaml new file mode 100644 index 00000000..054361a7 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/questions.yaml @@ -0,0 +1,43 @@ +questions: +- variable: global.cattle.psp.enabled + default: "false" + description: "Flag to enable or disable the installation of PodSecurityPolicies by this chart in the target cluster. If the cluster is running Kubernetes 1.25+, you must update this value to false." + label: "Enable PodSecurityPolicies" + type: boolean + group: "Security Settings" +- variable: helmController.enabled + label: Enable Embedded Helm Controller + description: 'Note: If you are running this chart in an RKE2 cluster, this should be disabled.' + type: boolean + group: Helm Controller +- variable: helmLocker.enabled + label: Enable Embedded Helm Locker + type: boolean + group: Helm Locker +- variable: projectReleaseNamespaces.labelValue + label: Project Release Namespace Project ID + description: By default, the System Project is selected. This can be overriden to a different Project (e.g. p-xxxxx) + type: string + required: false + group: Namespaces +- variable: releaseRoleBindings.clusterRoleRefs.admin + label: Admin ClusterRole + description: By default, admin selects Project Owners. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: admin + required: false + group: RBAC +- variable: releaseRoleBindings.clusterRoleRefs.edit + label: Edit ClusterRole + description: By default, edit selects Project Members. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: edit + required: false + group: RBAC +- variable: releaseRoleBindings.clusterRoleRefs.view + label: View ClusterRole + description: By default, view selects Read-Only users. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: view + required: false + group: RBAC diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/NOTES.txt b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/NOTES.txt new file mode 100644 index 00000000..32baeebc --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/NOTES.txt @@ -0,0 +1,2 @@ +{{ $.Chart.Name }} has been installed. Check its status by running: + kubectl --namespace {{ template "helm-project-operator.namespace" . }} get pods -l "release={{ $.Release.Name }}" diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/_helpers.tpl b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/_helpers.tpl new file mode 100644 index 00000000..97dd6b36 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/_helpers.tpl @@ -0,0 +1,66 @@ +# Rancher +{{- define "system_default_registry" -}} +{{- if .Values.global.cattle.systemDefaultRegistry -}} +{{- printf "%s/" .Values.global.cattle.systemDefaultRegistry -}} +{{- end -}} +{{- end -}} + +# Windows Support + +{{/* +Windows cluster will add default taint for linux nodes, +add below linux tolerations to workloads could be scheduled to those linux nodes +*/}} + +{{- define "linux-node-tolerations" -}} +- key: "cattle.io/os" + value: "linux" + effect: "NoSchedule" + operator: "Equal" +{{- end -}} + +{{- define "linux-node-selector" -}} +{{- if semverCompare "<1.14-0" .Capabilities.KubeVersion.GitVersion -}} +beta.kubernetes.io/os: linux +{{- else -}} +kubernetes.io/os: linux +{{- end -}} +{{- end -}} + +# Helm Project Operator + +{{/* vim: set filetype=mustache: */}} +{{/* Expand the name of the chart. This is suffixed with -alertmanager, which means subtract 13 from longest 63 available */}} +{{- define "helm-project-operator.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 50 | trimSuffix "-" -}} +{{- end }} + +{{/* +Allow the release namespace to be overridden for multi-namespace deployments in combined charts +*/}} +{{- define "helm-project-operator.namespace" -}} + {{- if .Values.namespaceOverride -}} + {{- .Values.namespaceOverride -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* Create chart name and version as used by the chart label. */}} +{{- define "helm-project-operator.chartref" -}} +{{- replace "+" "_" .Chart.Version | printf "%s-%s" .Chart.Name -}} +{{- end }} + +{{/* Generate basic labels */}} +{{- define "helm-project-operator.labels" -}} +app.kubernetes.io/managed-by: {{ .Release.Service }} +app.kubernetes.io/instance: {{ .Release.Name }} +app.kubernetes.io/version: "{{ replace "+" "_" .Chart.Version }}" +app.kubernetes.io/part-of: {{ template "helm-project-operator.name" . }} +chart: {{ template "helm-project-operator.chartref" . }} +release: {{ $.Release.Name | quote }} +heritage: {{ $.Release.Service | quote }} +{{- if .Values.commonLabels}} +{{ toYaml .Values.commonLabels }} +{{- end }} +{{- end -}} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/cleanup.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/cleanup.yaml new file mode 100644 index 00000000..98675642 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/cleanup.yaml @@ -0,0 +1,82 @@ +apiVersion: batch/v1 +kind: Job +metadata: + name: {{ template "helm-project-operator.name" . }}-cleanup + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} + annotations: + "helm.sh/hook": pre-delete + "helm.sh/hook-delete-policy": before-hook-creation, hook-succeeded, hook-failed +spec: + template: + metadata: + name: {{ template "helm-project-operator.name" . }}-cleanup + labels: {{ include "helm-project-operator.labels" . | nindent 8 }} + app: {{ template "helm-project-operator.name" . }} + spec: + serviceAccountName: {{ template "helm-project-operator.name" . }} +{{- if .Values.cleanup.securityContext }} + securityContext: {{ toYaml .Values.cleanup.securityContext | nindent 8 }} +{{- end }} + initContainers: + - name: add-cleanup-annotations + image: {{ template "system_default_registry" . }}{{ .Values.cleanup.image.repository }}:{{ .Values.cleanup.image.tag }} + imagePullPolicy: "{{ .Values.image.pullPolicy }}" + command: + - /bin/sh + - -c + - > + echo "Labeling all ProjectHelmCharts with helm.cattle.io/helm-project-operator-cleanup=true"; + EXPECTED_HELM_API_VERSION={{ .Values.helmApiVersion }}; + IFS=$'\n'; + for namespace in $(kubectl get namespaces -l helm.cattle.io/helm-project-operated=true --no-headers -o=custom-columns=NAME:.metadata.name); do + for projectHelmChartAndHelmApiVersion in $(kubectl get projecthelmcharts -n ${namespace} --no-headers -o=custom-columns=NAME:.metadata.name,HELMAPIVERSION:.spec.helmApiVersion); do + projectHelmChartAndHelmApiVersion=$(echo ${projectHelmChartAndHelmApiVersion} | xargs); + projectHelmChart=$(echo ${projectHelmChartAndHelmApiVersion} | cut -d' ' -f1); + helmApiVersion=$(echo ${projectHelmChartAndHelmApiVersion} | cut -d' ' -f2); + if [[ ${helmApiVersion} != ${EXPECTED_HELM_API_VERSION} ]]; then + echo "Skipping marking ${namespace}/${projectHelmChart} with cleanup annotation since spec.helmApiVersion: ${helmApiVersion} is not ${EXPECTED_HELM_API_VERSION}"; + continue; + fi; + kubectl label projecthelmcharts -n ${namespace} ${projectHelmChart} helm.cattle.io/helm-project-operator-cleanup=true --overwrite; + done; + done; +{{- if .Values.cleanup.resources }} + resources: {{ toYaml .Values.cleanup.resources | nindent 12 }} +{{- end }} +{{- if .Values.cleanup.containerSecurityContext }} + securityContext: {{ toYaml .Values.cleanup.containerSecurityContext | nindent 12 }} +{{- end }} + containers: + - name: ensure-subresources-deleted + image: {{ template "system_default_registry" . }}{{ .Values.cleanup.image.repository }}:{{ .Values.cleanup.image.tag }} + imagePullPolicy: IfNotPresent + command: + - /bin/sh + - -c + - > + SYSTEM_NAMESPACE={{ .Release.Namespace }} + EXPECTED_HELM_API_VERSION={{ .Values.helmApiVersion }}; + HELM_API_VERSION_TRUNCATED=$(echo ${EXPECTED_HELM_API_VERSION} | cut -d'/' -f0); + echo "Ensuring HelmCharts and HelmReleases are deleted from ${SYSTEM_NAMESPACE}..."; + while [[ "$(kubectl get helmcharts,helmreleases -l helm.cattle.io/helm-api-version=${HELM_API_VERSION_TRUNCATED} -n ${SYSTEM_NAMESPACE} 2>&1)" != "No resources found in ${SYSTEM_NAMESPACE} namespace." ]]; do + echo "waiting for HelmCharts and HelmReleases to be deleted from ${SYSTEM_NAMESPACE}... sleeping 3 seconds"; + sleep 3; + done; + echo "Successfully deleted all HelmCharts and HelmReleases in ${SYSTEM_NAMESPACE}!"; +{{- if .Values.cleanup.resources }} + resources: {{ toYaml .Values.cleanup.resources | nindent 12 }} +{{- end }} +{{- if .Values.cleanup.containerSecurityContext }} + securityContext: {{ toYaml .Values.cleanup.containerSecurityContext | nindent 12 }} +{{- end }} + restartPolicy: OnFailure + nodeSelector: {{ include "linux-node-selector" . | nindent 8 }} + {{- if .Values.cleanup.nodeSelector }} + {{- toYaml .Values.cleanup.nodeSelector | nindent 8 }} + {{- end }} + tolerations: {{ include "linux-node-tolerations" . | nindent 8 }} + {{- if .Values.cleanup.tolerations }} + {{- toYaml .Values.cleanup.tolerations | nindent 8 }} + {{- end }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/clusterrole.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/clusterrole.yaml new file mode 100644 index 00000000..60ed263b --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/clusterrole.yaml @@ -0,0 +1,57 @@ +{{- if and .Values.global.rbac.create .Values.global.rbac.userRoles.create }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: {{ template "helm-project-operator.name" . }}-admin + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + rbac.authorization.k8s.io/aggregate-to-admin: "true" + {{- end }} +rules: +- apiGroups: + - helm.cattle.io + resources: + - projecthelmcharts + - projecthelmcharts/finalizers + - projecthelmcharts/status + verbs: + - '*' +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: {{ template "helm-project-operator.name" . }}-edit + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + rbac.authorization.k8s.io/aggregate-to-edit: "true" + {{- end }} +rules: +- apiGroups: + - helm.cattle.io + resources: + - projecthelmcharts + - projecthelmcharts/status + verbs: + - 'get' + - 'list' + - 'watch' +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: {{ template "helm-project-operator.name" . }}-view + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + rbac.authorization.k8s.io/aggregate-to-view: "true" + {{- end }} +rules: +- apiGroups: + - helm.cattle.io + resources: + - projecthelmcharts + - projecthelmcharts/status + verbs: + - 'get' + - 'list' + - 'watch' +{{- end }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/configmap.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/configmap.yaml new file mode 100644 index 00000000..d4def157 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/configmap.yaml @@ -0,0 +1,14 @@ +## Note: If you add another entry to this ConfigMap, make sure a corresponding env var is set +## in the deployment of the operator to ensure that a Helm upgrade will force the operator +## to reload the values in the ConfigMap and redeploy +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "helm-project-operator.name" . }}-config + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} +data: + hardened.yaml: |- +{{ .Values.hardenedNamespaces.configuration | toYaml | indent 4 }} + values.yaml: |- +{{ .Values.valuesOverride | toYaml | indent 4 }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/deployment.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/deployment.yaml new file mode 100644 index 00000000..33b81e72 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/deployment.yaml @@ -0,0 +1,126 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ template "helm-project-operator.name" . }} + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +spec: + {{- if .Values.replicas }} + replicas: {{ .Values.replicas }} + {{- end }} + selector: + matchLabels: + app: {{ template "helm-project-operator.name" . }} + release: {{ $.Release.Name | quote }} + template: + metadata: + labels: {{ include "helm-project-operator.labels" . | nindent 8 }} + app: {{ template "helm-project-operator.name" . }} + spec: + containers: + - name: {{ template "helm-project-operator.name" . }} + image: "{{ template "system_default_registry" . }}{{ .Values.image.repository }}:{{ .Values.image.tag }}" + imagePullPolicy: "{{ .Values.image.pullPolicy }}" + args: + - {{ template "helm-project-operator.name" . }} + - --namespace={{ template "helm-project-operator.namespace" . }} + - --controller-name={{ template "helm-project-operator.name" . }} + - --values-override-file=/etc/helmprojectoperator/config/values.yaml +{{- if .Values.global.cattle.systemDefaultRegistry }} + - --system-default-registry={{ .Values.global.cattle.systemDefaultRegistry }} +{{- end }} +{{- if .Values.global.cattle.url }} + - --cattle-url={{ .Values.global.cattle.url }} +{{- end }} +{{- if .Values.global.cattle.projectLabel }} + - --project-label={{ .Values.global.cattle.projectLabel }} +{{- end }} +{{- if not .Values.projectReleaseNamespaces.enabled }} + - --system-project-label-values={{ join "," (append .Values.otherSystemProjectLabelValues .Values.global.cattle.systemProjectId) }} +{{- else if and (ne (len .Values.global.cattle.systemProjectId) 0) (ne (len .Values.projectReleaseNamespaces.labelValue) 0) (ne .Values.projectReleaseNamespaces.labelValue .Values.global.cattle.systemProjectId) }} + - --system-project-label-values={{ join "," (append .Values.otherSystemProjectLabelValues .Values.global.cattle.systemProjectId) }} +{{- else if len .Values.otherSystemProjectLabelValues }} + - --system-project-label-values={{ join "," .Values.otherSystemProjectLabelValues }} +{{- end }} +{{- if .Values.projectReleaseNamespaces.enabled }} +{{- if .Values.projectReleaseNamespaces.labelValue }} + - --project-release-label-value={{ .Values.projectReleaseNamespaces.labelValue }} +{{- else if .Values.global.cattle.systemProjectId }} + - --project-release-label-value={{ .Values.global.cattle.systemProjectId }} +{{- end }} +{{- end }} +{{- if .Values.global.cattle.clusterId }} + - --cluster-id={{ .Values.global.cattle.clusterId }} +{{- end }} +{{- if .Values.releaseRoleBindings.aggregate }} +{{- if .Values.releaseRoleBindings.clusterRoleRefs }} +{{- if .Values.releaseRoleBindings.clusterRoleRefs.admin }} + - --admin-cluster-role={{ .Values.releaseRoleBindings.clusterRoleRefs.admin }} +{{- end }} +{{- if .Values.releaseRoleBindings.clusterRoleRefs.edit }} + - --edit-cluster-role={{ .Values.releaseRoleBindings.clusterRoleRefs.edit }} +{{- end }} +{{- if .Values.releaseRoleBindings.clusterRoleRefs.view }} + - --view-cluster-role={{ .Values.releaseRoleBindings.clusterRoleRefs.view }} +{{- end }} +{{- end }} +{{- end }} +{{- if .Values.hardenedNamespaces.enabled }} + - --hardening-options-file=/etc/helmprojectoperator/config/hardening.yaml +{{- else }} + - --disable-hardening +{{- end }} +{{- if .Values.debug }} + - --debug + - --debug-level={{ .Values.debugLevel }} +{{- end }} +{{- if not .Values.helmController.enabled }} + - --disable-embedded-helm-controller +{{- else }} + - --helm-job-image={{ template "system_default_registry" . }}{{ .Values.helmController.job.image.repository }}:{{ .Values.helmController.job.image.tag }} +{{- end }} +{{- if not .Values.helmLocker.enabled }} + - --disable-embedded-helm-locker +{{- end }} +{{- if .Values.additionalArgs }} +{{- toYaml .Values.additionalArgs | nindent 10 }} +{{- end }} + env: + - name: NODE_NAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + ## Note: The below two values only exist to force Helm to upgrade the deployment on + ## a change to the contents of the ConfigMap during an upgrade. Neither serve + ## any practical purpose and can be removed and replaced with a configmap reloader + ## in a future change if dynamic updates are required. + - name: HARDENING_OPTIONS_SHA_256_HASH + value: {{ .Values.hardenedNamespaces.configuration | toYaml | sha256sum }} + - name: VALUES_OVERRIDE_SHA_256_HASH + value: {{ .Values.valuesOverride | toYaml | sha256sum }} +{{- if .Values.resources }} + resources: {{ toYaml .Values.resources | nindent 12 }} +{{- end }} +{{- if .Values.containerSecurityContext }} + securityContext: {{ toYaml .Values.containerSecurityContext | nindent 12 }} +{{- end }} + volumeMounts: + - name: config + mountPath: "/etc/helmprojectoperator/config" + serviceAccountName: {{ template "helm-project-operator.name" . }} +{{- if .Values.securityContext }} + securityContext: {{ toYaml .Values.securityContext | nindent 8 }} +{{- end }} + nodeSelector: {{ include "linux-node-selector" . | nindent 8 }} +{{- if .Values.nodeSelector }} +{{- toYaml .Values.nodeSelector | nindent 8 }} +{{- end }} + tolerations: {{ include "linux-node-tolerations" . | nindent 8 }} +{{- if .Values.tolerations }} +{{- toYaml .Values.tolerations | nindent 8 }} +{{- end }} + volumes: + - name: config + configMap: + name: {{ template "helm-project-operator.name" . }}-config diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/psp.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/psp.yaml new file mode 100644 index 00000000..73dcc456 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/psp.yaml @@ -0,0 +1,68 @@ +{{- if .Values.global.cattle.psp.enabled }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ template "helm-project-operator.name" . }}-psp + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +{{- if .Values.global.rbac.pspAnnotations }} + annotations: {{ toYaml .Values.global.rbac.pspAnnotations | nindent 4 }} +{{- end }} +spec: + privileged: false + hostNetwork: false + hostIPC: false + hostPID: false + runAsUser: + # Permits the container to run with root privileges as well. + rule: 'RunAsAny' + seLinux: + # This policy assumes the nodes are using AppArmor rather than SELinux. + rule: 'RunAsAny' + supplementalGroups: + rule: 'MustRunAs' + ranges: + # Forbid adding the root group. + - min: 0 + max: 65535 + fsGroup: + rule: 'MustRunAs' + ranges: + # Forbid adding the root group. + - min: 0 + max: 65535 + readOnlyRootFilesystem: false +--- +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: {{ template "helm-project-operator.name" . }}-psp + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +rules: +{{- if semverCompare "> 1.15.0-0" .Capabilities.KubeVersion.GitVersion }} +- apiGroups: ['policy'] +{{- else }} +- apiGroups: ['extensions'] +{{- end }} + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: + - {{ template "helm-project-operator.name" . }}-psp +--- +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: {{ template "helm-project-operator.name" . }}-psp + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: {{ template "helm-project-operator.name" . }}-psp +subjects: + - kind: ServiceAccount + name: {{ template "helm-project-operator.name" . }} + namespace: {{ template "helm-project-operator.namespace" . }} +{{- end }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/rbac.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/rbac.yaml new file mode 100644 index 00000000..b1c40920 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/rbac.yaml @@ -0,0 +1,32 @@ +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: {{ template "helm-project-operator.name" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: "cluster-admin" # see note below +subjects: +- kind: ServiceAccount + name: {{ template "helm-project-operator.name" . }} + namespace: {{ template "helm-project-operator.namespace" . }} +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ template "helm-project-operator.name" . }} + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} + app: {{ template "helm-project-operator.name" . }} +{{- if .Values.global.imagePullSecrets }} +imagePullSecrets: {{ toYaml .Values.global.imagePullSecrets | nindent 2 }} +{{- end }} +# --- +# NOTE: +# As of now, due to the fact that the k3s-io/helm-controller can only deploy jobs that are cluster-bound to the cluster-admin +# ClusterRole, the only way for this operator to be able to perform that binding is if it is also bound to the cluster-admin ClusterRole. +# +# As a result, this ClusterRoleBinding will be left as a work-in-progress until changes are made in k3s-io/helm-controller to allow us to grant +# only scoped down permissions to the Job that is deployed. diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/system-namespaces-configmap.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/system-namespaces-configmap.yaml new file mode 100644 index 00000000..f4c85254 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/system-namespaces-configmap.yaml @@ -0,0 +1,62 @@ +{{- if .Values.systemNamespacesConfigMap.create }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "helm-project-operator.name" . }}-system-namespaces + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} +data: + system-namespaces.json: |- + { +{{- if .Values.projectReleaseNamespaces.enabled }} +{{- if .Values.projectReleaseNamespaces.labelValue }} + "projectReleaseLabelValue": {{ .Values.projectReleaseNamespaces.labelValue | quote }}, +{{- else if .Values.global.cattle.systemProjectId }} + "projectReleaseLabelValue": {{ .Values.global.cattle.systemProjectId | quote }}, +{{- else }} + "projectReleaseLabelValue": "", +{{- end }} +{{- else }} + "projectReleaseLabelValue": "", +{{- end }} +{{- if not .Values.projectReleaseNamespaces.enabled }} + "systemProjectLabelValues": {{ append .Values.otherSystemProjectLabelValues .Values.global.cattle.systemProjectId | toJson }} +{{- else if and (ne (len .Values.global.cattle.systemProjectId) 0) (ne (len .Values.projectReleaseNamespaces.labelValue) 0) (ne .Values.projectReleaseNamespaces.labelValue .Values.global.cattle.systemProjectId) }} + "systemProjectLabelValues": {{ append .Values.otherSystemProjectLabelValues .Values.global.cattle.systemProjectId | toJson }} +{{- else if len .Values.otherSystemProjectLabelValues }} + "systemProjectLabelValues": {{ .Values.otherSystemProjectLabelValues | toJson }} +{{- else }} + "systemProjectLabelValues": [] +{{- end }} + } +--- +{{- if (and .Values.systemNamespacesConfigMap.rbac.enabled .Values.systemNamespacesConfigMap.rbac.subjects) }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ template "helm-project-operator.name" . }}-system-namespaces + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} +rules: +- apiGroups: + - "" + resources: + - configmaps + resourceNames: + - "{{ template "helm-project-operator.name" . }}-system-namespaces" + verbs: + - 'get' +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: {{ template "helm-project-operator.name" . }}-system-namespaces + namespace: {{ template "helm-project-operator.namespace" . }} + labels: {{ include "helm-project-operator.labels" . | nindent 4 }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: {{ template "helm-project-operator.name" . }}-system-namespaces +subjects: {{ .Values.systemNamespacesConfigMap.rbac.subjects | toYaml | nindent 2 }} +{{- end }} +{{- end }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/validate-psp-install.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/validate-psp-install.yaml new file mode 100644 index 00000000..a30c59d3 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/templates/validate-psp-install.yaml @@ -0,0 +1,7 @@ +#{{- if gt (len (lookup "rbac.authorization.k8s.io/v1" "ClusterRole" "" "")) 0 -}} +#{{- if .Values.global.cattle.psp.enabled }} +#{{- if not (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") }} +#{{- fail "The target cluster does not have the PodSecurityPolicy API resource. Please disable PSPs in this chart before proceeding." -}} +#{{- end }} +#{{- end }} +#{{- end }} diff --git a/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/values.yaml b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/values.yaml new file mode 100644 index 00000000..63fae45a --- /dev/null +++ b/charts/prometheus-federator/0.4.2/charts/helmProjectOperator/values.yaml @@ -0,0 +1,228 @@ +# Default values for helm-project-operator. +# This is a YAML-formatted file. +# Declare variables to be passed into your templates. + +# Helm Project Operator Configuration + +global: + cattle: + clusterId: "" + psp: + enabled: false + projectLabel: field.cattle.io/projectId + systemDefaultRegistry: "" + systemProjectId: "" + url: "" + rbac: + ## Create RBAC resources for ServiceAccounts and users + ## + create: true + + userRoles: + ## Create default user ClusterRoles to allow users to interact with ProjectHelmCharts + create: true + ## Aggregate default user ClusterRoles into default k8s ClusterRoles + aggregateToDefaultRoles: true + + pspAnnotations: {} + ## Specify pod annotations + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl + ## + # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*' + # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default' + # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default' + + ## Reference to one or more secrets to be used when pulling images + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ + ## + imagePullSecrets: [] + # - name: "image-pull-secret" + +helmApiVersion: dummy.cattle.io/v1alpha1 + +## valuesOverride overrides values that are set on each ProjectHelmChart deployment on an operator-level +## User-provided values will be overwritten based on the values provided here +valuesOverride: {} + +## projectReleaseNamespaces are auto-generated namespaces that are created to host Helm Releases +## managed by this operator on behalf of a ProjectHelmChart +projectReleaseNamespaces: + ## Enabled determines whether Project Release Namespaces should be created. If false, the underlying + ## Helm release will be deployed in the Project Registration Namespace + enabled: true + ## labelValue is the value of the Project that the projectReleaseNamespace should be created within + ## If empty, this will be set to the value of global.cattle.systemProjectId + ## If global.cattle.systemProjectId is also empty, project release namespaces will be disabled + labelValue: "" + +## otherSystemProjectLabelValues are project labels that identify namespaces as those that should be treated as system projects +## i.e. they will be entirely ignored by the operator +## By default, the global.cattle.systemProjectId will be in this list +otherSystemProjectLabelValues: [] + +## releaseRoleBindings configures RoleBindings automatically created by the Helm Project Operator +## in Project Release Namespaces where underlying Helm charts are deployed +releaseRoleBindings: + ## aggregate enables creating these RoleBindings off aggregating RoleBindings in the + ## Project Registration Namespace or ClusterRoleBindings that bind users to the ClusterRoles + ## specified under clusterRoleRefs + aggregate: true + + ## clusterRoleRefs are the ClusterRoles whose RoleBinding or ClusterRoleBindings should determine + ## the RoleBindings created in the Project Release Namespace + ## + ## By default, these are set to create RoleBindings based on the RoleBindings / ClusterRoleBindings + ## attached to the default K8s user-facing ClusterRoles of admin, edit, and view. + ## ref: https://kubernetes.io/docs/reference/access-authn-authz/rbac/#user-facing-roles + ## + clusterRoleRefs: + admin: admin + edit: edit + view: view + +hardenedNamespaces: + # Whether to automatically manage the configuration of the default ServiceAccount and + # auto-create a NetworkPolicy for each namespace created by this operator + enabled: true + + configuration: + # Values to be applied to each default ServiceAccount created in a managed namespace + serviceAccountSpec: + secrets: [] + imagePullSecrets: [] + automountServiceAccountToken: false + # Values to be applied to each default generated NetworkPolicy created in a managed namespace + networkPolicySpec: + podSelector: {} + egress: [] + ingress: [] + policyTypes: ["Ingress", "Egress"] + +## systemNamespacesConfigMap is a ConfigMap created to allow users to see valid entries +## for registering a ProjectHelmChart for a given Project on the Rancher Dashboard UI. +## It does not need to be enabled for a non-Rancher use case. +systemNamespacesConfigMap: + ## Create indicates whether the system namespaces configmap should be created + ## This is a required value for integration with Rancher Dashboard + create: true + + ## RBAC provides options around the RBAC created to allow users to be able to view + ## the systemNamespacesConfigMap; if not specified, only users with the ability to + ## view ConfigMaps in the namespace where this chart is deployed will be able to + ## properly view the system namespaces on the Rancher Dashboard UI + rbac: + ## enabled indicates that we should deploy a RoleBinding and Role to view this ConfigMap + enabled: true + ## subjects are the subjects that should be bound to this default RoleBinding + ## By default, we allow anyone who is authenticated to the system to be able to view + ## this ConfigMap in the deployment namespace + subjects: + - kind: Group + name: system:authenticated + +nameOverride: "" + +namespaceOverride: "" + +replicas: 1 + +image: + repository: rancher/helm-project-operator + tag: v0.2.1 + pullPolicy: IfNotPresent + +helmController: + # Note: should be disabled for RKE2 clusters since they already run Helm Controller to manage internal Kubernetes components + enabled: true + + job: + image: + repository: rancher/klipper-helm + tag: v0.7.0-build20220315 + +helmLocker: + enabled: true + +# Additional arguments to be passed into the Helm Project Operator image +additionalArgs: [] + +## Define which Nodes the Pods are scheduled on. +## ref: https://kubernetes.io/docs/user-guide/node-selection/ +## +nodeSelector: {} + +## Tolerations for use with node taints +## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ +## +tolerations: [] +# - key: "key" +# operator: "Equal" +# value: "value" +# effect: "NoSchedule" + +resources: {} + # limits: + # memory: 500Mi + # cpu: 1000m + # requests: + # memory: 100Mi + # cpu: 100m + +containerSecurityContext: {} + # allowPrivilegeEscalation: false + # capabilities: + # drop: + # - ALL + # privileged: false + # readOnlyRootFilesystem: true + +securityContext: {} + # runAsGroup: 1000 + # runAsUser: 1000 + # supplementalGroups: + # - 1000 + +debug: false +debugLevel: 0 + +cleanup: + image: + repository: rancher/shell + tag: v0.1.19 + pullPolicy: IfNotPresent + + ## Define which Nodes the Pods are scheduled on. + ## ref: https://kubernetes.io/docs/user-guide/node-selection/ + ## + nodeSelector: {} + + ## Tolerations for use with node taints + ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ + ## + tolerations: [] + # - key: "key" + # operator: "Equal" + # value: "value" + # effect: "NoSchedule" + + containerSecurityContext: {} + # allowPrivilegeEscalation: false + # capabilities: + # drop: + # - ALL + # privileged: false + # readOnlyRootFilesystem: true + + securityContext: + runAsNonRoot: false + runAsUser: 0 + + resources: {} + # limits: + # memory: 500Mi + # cpu: 1000m + # requests: + # memory: 100Mi + # cpu: 100m diff --git a/charts/prometheus-federator/0.4.2/questions.yaml b/charts/prometheus-federator/0.4.2/questions.yaml new file mode 100644 index 00000000..87cf1339 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/questions.yaml @@ -0,0 +1,43 @@ +questions: +- variable: global.cattle.psp.enabled + default: "false" + description: "Flag to enable or disable the installation of PodSecurityPolicies by this chart in the target cluster. If the cluster is running Kubernetes 1.25+, you must update this value to false." + label: "Enable PodSecurityPolicies" + type: boolean + group: "Security Settings" +- variable: helmProjectOperator.helmController.enabled + label: Enable Embedded Helm Controller + description: 'Note: If you are running Prometheus Federator in an RKE2 / K3s cluster before v1.23.14 / v1.24.8 / v1.25.4, this should be disabled.' + type: boolean + group: Helm Controller +- variable: helmProjectOperator.helmLocker.enabled + label: Enable Embedded Helm Locker + type: boolean + group: Helm Locker +- variable: helmProjectOperator.projectReleaseNamespaces.labelValue + label: Project Release Namespace Project ID + description: By default, the System Project is selected. This can be overriden to a different Project (e.g. p-xxxxx) + type: string + required: false + group: Namespaces +- variable: helmProjectOperator.releaseRoleBindings.clusterRoleRefs.admin + label: Admin ClusterRole + description: By default, admin selects Project Owners. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: admin + required: false + group: RBAC +- variable: helmProjectOperator.releaseRoleBindings.clusterRoleRefs.edit + label: Edit ClusterRole + description: By default, edit selects Project Members. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: edit + required: false + group: RBAC +- variable: helmProjectOperator.releaseRoleBindings.clusterRoleRefs.view + label: View ClusterRole + description: By default, view selects Read-Only users. This can be overridden to a different ClusterRole (e.g. rt-xxxxx) + type: string + default: view + required: false + group: RBAC \ No newline at end of file diff --git a/charts/prometheus-federator/0.4.2/templates/NOTES.txt b/charts/prometheus-federator/0.4.2/templates/NOTES.txt new file mode 100644 index 00000000..f551f366 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/templates/NOTES.txt @@ -0,0 +1,3 @@ +{{ $.Chart.Name }} has been installed. Check its status by running: + kubectl --namespace {{ template "prometheus-federator.namespace" . }} get pods -l "release={{ $.Release.Name }}" + diff --git a/charts/prometheus-federator/0.4.2/templates/_helpers.tpl b/charts/prometheus-federator/0.4.2/templates/_helpers.tpl new file mode 100644 index 00000000..15ea4e5c --- /dev/null +++ b/charts/prometheus-federator/0.4.2/templates/_helpers.tpl @@ -0,0 +1,66 @@ +# Rancher +{{- define "system_default_registry" -}} +{{- if .Values.global.cattle.systemDefaultRegistry -}} +{{- printf "%s/" .Values.global.cattle.systemDefaultRegistry -}} +{{- end -}} +{{- end -}} + +# Windows Support + +{{/* +Windows cluster will add default taint for linux nodes, +add below linux tolerations to workloads could be scheduled to those linux nodes +*/}} + +{{- define "linux-node-tolerations" -}} +- key: "cattle.io/os" + value: "linux" + effect: "NoSchedule" + operator: "Equal" +{{- end -}} + +{{- define "linux-node-selector" -}} +{{- if semverCompare "<1.14-0" .Capabilities.KubeVersion.GitVersion -}} +beta.kubernetes.io/os: linux +{{- else -}} +kubernetes.io/os: linux +{{- end -}} +{{- end -}} + +# Helm Project Operator + +{{/* vim: set filetype=mustache: */}} +{{/* Expand the name of the chart. This is suffixed with -alertmanager, which means subtract 13 from longest 63 available */}} +{{- define "prometheus-federator.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 50 | trimSuffix "-" -}} +{{- end }} + +{{/* +Allow the release namespace to be overridden for multi-namespace deployments in combined charts +*/}} +{{- define "prometheus-federator.namespace" -}} + {{- if .Values.namespaceOverride -}} + {{- .Values.namespaceOverride -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* Create chart name and version as used by the chart label. */}} +{{- define "prometheus-federator.chartref" -}} +{{- replace "+" "_" .Chart.Version | printf "%s-%s" .Chart.Name -}} +{{- end }} + +{{/* Generate basic labels */}} +{{- define "prometheus-federator.labels" }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +app.kubernetes.io/instance: {{ .Release.Name }} +app.kubernetes.io/version: "{{ replace "+" "_" .Chart.Version }}" +app.kubernetes.io/part-of: {{ template "prometheus-federator.name" . }} +chart: {{ template "prometheus-federator.chartref" . }} +release: {{ $.Release.Name | quote }} +heritage: {{ $.Release.Service | quote }} +{{- if .Values.commonLabels}} +{{ toYaml .Values.commonLabels }} +{{- end }} +{{- end }} diff --git a/charts/prometheus-federator/0.4.2/values.yaml b/charts/prometheus-federator/0.4.2/values.yaml new file mode 100644 index 00000000..103c1604 --- /dev/null +++ b/charts/prometheus-federator/0.4.2/values.yaml @@ -0,0 +1,94 @@ +# Default values for helm-project-operator. +# This is a YAML-formatted file. +# Declare variables to be passed into your templates. + +# Prometheus Federator Configuration + +global: + cattle: + psp: + enabled: false + systemDefaultRegistry: "" + projectLabel: field.cattle.io/projectId + clusterId: "" + systemProjectId: "" + url: "" + rbac: + pspEnabled: true + pspAnnotations: {} + ## Specify pod annotations + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl + ## + # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*' + # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default' + # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default' + + ## Reference to one or more secrets to be used when pulling images + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ + ## + imagePullSecrets: [] + # - name: "image-pull-secret" + +helmProjectOperator: + enabled: true + + # ensures that all resources created by subchart show up as prometheus-federator + helmApiVersion: monitoring.cattle.io/v1alpha1 + + nameOverride: prometheus-federator + + helmController: + # Note: should be disabled for RKE2 clusters since they already run Helm Controller to manage internal Kubernetes components + enabled: true + + helmLocker: + enabled: true + + ## valuesOverride overrides values that are set on each Project Prometheus Stack Helm Chart deployment on an operator level + ## all values provided here will override any user-provided values automatically + valuesOverride: + + federate: + # Change this to point at all Prometheuses you want all your Project Prometheus Stacks to federate from + # By default, this matches the default deployment of Rancher Monitoring + targets: + - rancher-monitoring-prometheus.cattle-monitoring-system.svc:9090 + + image: + repository: rancher/prometheus-federator + tag: v0.3.5 + pullPolicy: IfNotPresent + + # Additional arguments to be passed into the Prometheus Federator image + additionalArgs: [] + + ## Define which Nodes the Pods are scheduled on. + ## ref: https://kubernetes.io/docs/user-guide/node-selection/ + ## + nodeSelector: {} + + ## Tolerations for use with node taints + ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ + ## + tolerations: [] + # - key: "key" + # operator: "Equal" + # value: "value" + # effect: "NoSchedule" + + resources: {} + # limits: + # memory: 500Mi + # cpu: 1000m + # requests: + # memory: 100Mi + # cpu: 100m + + securityContext: {} + # allowPrivilegeEscalation: false + # readOnlyRootFilesystem: true + + debug: false + debugLevel: 0 \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/.editorconfig b/charts/rancher-project-monitoring/0.4.2/.editorconfig new file mode 100644 index 00000000..f5ee2f46 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/.editorconfig @@ -0,0 +1,5 @@ +root = true + +[files/dashboards/*.json] +indent_size = 2 +indent_style = space \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/.helmignore b/charts/rancher-project-monitoring/0.4.2/.helmignore new file mode 100644 index 00000000..9bdbec92 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/.helmignore @@ -0,0 +1,29 @@ +# Patterns to ignore when building packages. +# This supports shell glob matching, relative path matching, and +# negation (prefixed with !). Only one pattern per line. +.DS_Store +# Common VCS dirs +.git/ +.gitignore +.bzr/ +.bzrignore +.hg/ +.hgignore +.svn/ +# Common backup files +*.swp +*.bak +*.tmp +*~ +# Various IDEs +.project +.idea/ +*.tmproj +# helm/charts +OWNERS +hack/ +ci/ +kube-prometheus-*.tgz + +unittests/ +files/dashboards/ diff --git a/charts/rancher-project-monitoring/0.4.2/CHANGELOG.md b/charts/rancher-project-monitoring/0.4.2/CHANGELOG.md new file mode 100644 index 00000000..8178169b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/CHANGELOG.md @@ -0,0 +1,47 @@ +# Changelog +All notable changes from the upstream Prometheus Operator chart will be added to this file. + +## [Package Version 00] - 2020-07-19 +### Added +- Added [Prometheus Adapter](https://github.com/helm/charts/tree/master/stable/prometheus-adapter) as a dependency to the upstream Prometheus Operator chart to allow users to expose custom metrics from the default Prometheus instance deployed by this chart +- Remove `prometheus-operator/cleanup-crds.yaml` and `prometheus-operator/crds.yaml` from the Prometheus Operator upstream chart in favor of just using the CRD directory to install the CRDs. +- Added support for `rkeControllerManager`, `rkeScheduler`, `rkeProxy`, and `rkeEtcd` PushProx exporters for monitoring k8s components within RKE clusters +- Added support for a `k3sServer` PushProx exporter that monitors k3s server components (`kubeControllerManager`, `kubeScheduler`, and `kubeProxy`) within k3s clusters +- Added support for `kubeAdmControllerManager`, `kubeAdmScheduler`, `kubeAdmProxy`, and `kubeAdmEtcd` PushProx exporters for monitoring k8s components within kubeAdm clusters +- Added support for `rke2ControllerManager`, `rke2Scheduler`, `rke2Proxy`, and `rke2Etcd` PushProx exporters for monitoring k8s components within rke2 clusters +- Exposed `prometheus.prometheusSpec.ignoreNamespaceSelectors` on values.yaml and set it to `false` by default. This value instructs the default Prometheus server deployed with this chart to ignore the `namespaceSelector` field within any created ServiceMonitor or PodMonitor CRs that it selects. This prevents ServiceMonitors and PodMonitors from configuring the Prometheus scrape configuration to monitor resources outside the namespace that they are deployed in; if a user needs to have one ServiceMonitor / PodMonitor monitor resources within several namespaces (such as the resources that are used to monitor Istio in a default installation), they should not enable this option since it would require them to create one ServiceMonitor / PodMonitor CR per namespace that they would like to monitor. Relevant fields were also updated in the default README.md. +- Added `grafana.sidecar.dashboards.searchNamespace` to `values.yaml` with a default value of `cattle-dashboards`. The namespace provided should contain all ConfigMaps with the label `grafana_dashboard` and will be searched by the Grafana Dashboards sidecar for updates. The namespace specified is also created along with this deployment. All default dashboard ConfigMaps have been relocated from the deployment namespace to the namespace specified +- Added `monitoring-admin`, `monitoring-edit`, and `monitoring-view` default `ClusterRoles` to allow admins to assign roles to users to interact with Prometheus Operator CRs. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`). In a typical RBAC setup, you might want to use a `ClusterRoleBinding` to bind these roles to a Subject to allow them to set up or view `ServiceMonitors` / `PodMonitors` / `PrometheusRules` and view `Prometheus` or `Alertmanager` CRs across the cluster. If `.Values.global.rbac.userRoles.aggregateRolesForRBAC` is enabled, these ClusterRoles will aggregate into the respective default ClusterRoles provided by Kubernetes +- Added `monitoring-config-admin`, `monitoring-config-edit` and `monitoring-config-view` default `Roles` to allow admins to assign roles to users to be able to edit / view `Secrets` and `ConfigMaps` within the `cattle-monitoring-system` namespace. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`). In a typical RBAC setup, you might want to use a `RoleBinding` to bind these roles to a Subject within the `cattle-monitoring-system` namespace to allow them to modify Secrets / ConfigMaps tied to the deployment, such as your Alertmanager Config Secret. +- Added `monitoring-dashboard-admin`, `monitoring-dashboard-edit` and `monitoring-dashboard-view` default `Roles` to allow admins to assign roles to users to be able to edit / view `ConfigMaps` within the `cattle-dashboards` namespace. These can be enabled by setting `.Values.global.rbac.userRoles.create` (default: `true`) and deploying Grafana as part of this chart. In a typical RBAC setup, you might want to use a `RoleBinding` to bind these roles to a Subject within the `cattle-dashboards` namespace to allow them to create / modify ConfigMaps that contain the JSON used to persist Grafana Dashboards on the cluster. +- Added default resource limits for `Prometheus Operator`, `Prometheus`, `AlertManager`, `Grafana`, `kube-state-metrics`, `node-exporter` +- Added a default template `rancher_defaults.tmpl` to AlertManager that Rancher will offer to users in order to help configure the way alerts are rendered on a notifier. Also updated the default template deployed with this chart to reference that template and added an example of a Slack config using this template as a comment in the `values.yaml`. +- Added support for private registries via introducing a new field for `global.cattle.systemDefaultRegistry` that, if supplied, will automatically be prepended onto every image used by the chart. +- Added a default `nginx` proxy container deployed with Grafana whose config is set in the `ConfigMap` located in `charts/grafana/templates/nginx-config.yaml`. The purpose of this container is to make it possible to view Grafana's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port `8080` (with a `portName` of `nginx-http` instead of the default `service`), which is also where the Grafana service will now point to, and will forward all requests to the Grafana container listening on the default port `3000`. +- Added a default `nginx` proxy container deployed with Prometheus whose config is set in the `ConfigMap` located in `templates/prometheus/nginx-config.yaml`. The purpose of this container is to make it possible to view Prometheus's UI through a proxy that has a subpath (e.g. Rancher's proxy). This proxy container is set to listen on port `8081` (with a `portName` of `nginx-http` instead of the default `web`), which is also where the Prometheus service will now point to, and will forward all requests to the Prometheus container listening on the default port `9090`. +- Added support for passing CIS Scans in a hardened cluster by introducing a Job that patches the default service account within the `cattle-monitoring-system` and `cattle-dashboards` namespaces on install or upgrade and adding a default allow all `NetworkPolicy` to the `cattle-monitoring-system` and `cattle-dashboards` namespaces. +### Modified +- Updated the chart name from `prometheus-operator` to `rancher-monitoring` and added the `io.rancher.certified: rancher` annotation to `Chart.yaml` +- Modified the default `node-exporter` port from `9100` to `9796` +- Modified the default `nameOverride` to `rancher-monitoring`. This change is necessary as the Prometheus Adapter's default URL (`http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc`) is based off of the value used here; if modified, the default Adapter URL must also be modified +- Modified the default `namespaceOverride` to `cattle-monitoring-system`. This change is necessary as the Prometheus Adapter's default URL (`http://{{ .Values.nameOverride }}-prometheus.{{ .Values.namespaceOverride }}.svc`) is based off of the value used here; if modified, the default Adapter URL must also be modified +- Configured some default values for `grafana.service` values and exposed them in the default README.md +- The default namespaces the following ServiceMonitors were changed from the deployment namespace to allow them to continue to monitor metrics when `prometheus.prometheusSpec.ignoreNamespaceSelectors` is enabled: + - `core-dns`: `kube-system` + - `api-server`: `default` + - `kube-controller-manager`: `kube-system` + - `kubelet`: `{{ .Values.kubelet.namespace }}` +- Disabled the following deployments by default (can be enabled if required): + - `AlertManager` + - `kube-controller-manager` metrics exporter + - `kube-etcd` metrics exporter + - `kube-scheduler` metrics exporter + - `kube-proxy` metrics exporter +- Updated default Grafana `deploymentStrategy` to `Recreate` to prevent deployments from being stuck on upgrade if a PV is attached to Grafana +- Modified the default `SelectorNilUsesHelmValues` to default to `false`. As a result, we look for all CRs with any labels in all namespaces by default rather than just the ones tagged with the label `release: rancher-monitoring`. +- Modified the default images used by the `rancher-monitoring` chart to point to Rancher mirrors of the original images from upstream. +- Modified the behavior of the chart to create the Alertmanager Config Secret via a pre-install hook instead of using the normal Helm lifecycle to manage the secret. The benefit of this approach is that all changes to the Config Secret done on a live cluster will never get overridden on a `helm upgrade` since the secret only gets created on a `helm install`. If you would like the secret to be cleaned up on an `helm uninstall`, enable `alertmanager.cleanupOnUninstall`; however, this is disabled by default to prevent the loss of alerting configuration on an uninstall. This secret will never be modified on a `helm upgrade`. +- Modified the default `securityContext` for `Pod` templates across the chart to `{"runAsNonRoot": "true", "runAsUser": "1000"}` and replaced `grafana.rbac.pspUseAppArmor` in favor of `grafana.rbac.pspAnnotations={}` in order to make it possible to deploy this chart on a hardened cluster which does not support Seccomp or AppArmor annotations in PSPs. Users can always choose to specify the annotations they want to use for the PSP directly as part of the values provided. +- Modified `.Values.prometheus.prometheusSpec.containers` to take in a string representing a template that should be rendered by Helm (via `tpl`) instead of allowing a user to provide YAML directly. +- Modified the default Grafana configuration to auto assign users who access Grafana to the Viewer role and enable anonymous access to Grafana dashboards by default. This default works well for a Rancher user who is accessing Grafana via the `kubectl proxy` on the Rancher Dashboard UI since anonymous users who enter via the proxy are authenticated by the k8s API Server, but you can / should modify this behavior if you plan on exposing Grafana in a way that does not require authentication (e.g. as a `NodePort` service). +- Modified the default Grafana configuration to add a default dashboard for Rancher on the Grafana home page. \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/CONTRIBUTING.md b/charts/rancher-project-monitoring/0.4.2/CONTRIBUTING.md new file mode 100644 index 00000000..f6ce2a32 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/CONTRIBUTING.md @@ -0,0 +1,12 @@ +# Contributing Guidelines + +## How to contribute to this chart + +1. Fork this repository, develop and test your Chart. +1. Bump the chart version for every change. +1. Ensure PR title has the prefix `[kube-prometheus-stack]` +1. When making changes to rules or dashboards, see the README.md section on how to sync data from upstream repositories +1. Check the `hack/minikube` folder has scripts to set up minikube and components of this chart that will allow all components to be scraped. You can use this configuration when validating your changes. +1. Check for changes of RBAC rules. +1. Check for changes in CRD specs. +1. PR must pass the linter (`helm lint`) diff --git a/charts/rancher-project-monitoring/0.4.2/Chart.yaml b/charts/rancher-project-monitoring/0.4.2/Chart.yaml new file mode 100644 index 00000000..9518bb29 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/Chart.yaml @@ -0,0 +1,51 @@ +annotations: + artifacthub.io/license: Apache-2.0 + artifacthub.io/links: | + - name: Chart Source + url: https://github.com/prometheus-community/helm-charts + - name: Upstream Project + url: https://github.com/prometheus-operator/kube-prometheus + artifacthub.io/operator: "true" + catalog.cattle.io/auto-install: rancher-monitoring-crd=match + catalog.cattle.io/certified: rancher + catalog.cattle.io/deploys-on-os: windows + catalog.cattle.io/display-name: Monitoring + catalog.cattle.io/kube-version: '>= 1.26.0-0 < 1.31.0-0' + catalog.cattle.io/namespace: cattle-monitoring-system + catalog.cattle.io/permits-os: linux,windows + catalog.cattle.io/provides-gvr: monitoring.coreos.com.prometheus/v1 + catalog.cattle.io/rancher-version: '>= 2.9.0-0 < 2.10.0-0' + catalog.cattle.io/release-name: rancher-monitoring + catalog.cattle.io/requests-cpu: 4500m + catalog.cattle.io/requests-memory: 4000Mi + catalog.cattle.io/type: cluster-tool + catalog.cattle.io/ui-component: monitoring + catalog.cattle.io/upstream-version: 57.0.3 +apiVersion: v2 +appVersion: v0.72.0 +dependencies: +- condition: grafana.enabled + name: grafana + repository: file://./charts/grafana +description: kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards, + and Prometheus rules combined with documentation and scripts to provide easy to + operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus + Operator. +home: https://github.com/prometheus-operator/kube-prometheus +icon: file://assets/logos/rancher-monitoring.png +keywords: +- operator +- prometheus +- kube-prometheus +kubeVersion: '>=1.19.0-0' +maintainers: +- email: alexandre.lamarre@suse.com + name: Alexandre +- email: joshua.meranda@suse.com + name: Joshua +name: rancher-project-monitoring +sources: +- https://github.com/prometheus-community/helm-charts +- https://github.com/prometheus-operator/kube-prometheus +type: application +version: 0.4.2 diff --git a/charts/rancher-project-monitoring/0.4.2/README.md b/charts/rancher-project-monitoring/0.4.2/README.md new file mode 100644 index 00000000..ac82ec69 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/README.md @@ -0,0 +1,29 @@ +# rancher-project-monitoring + +This chart installs a project-scoped version of [`rancher-monitoring`](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting), a Helm chart based off of [`kube-prometheus stack`](https://github.com/prometheus-operator/kube-prometheus). It deploys a collection of Kubernetes manifests, [Grafana](http://grafana.com/) dashboards, and [Prometheus rules](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) combined with documentation and scripts to provide easy to operate end-to-end Kubernetes project monitoring with [Prometheus](https://prometheus.io/) using the [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator). See the [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) README for details about components, dashboards, and alerts. + +## Prerequisites + +- Kubernetes 1.16+ +- Helm 3+ + +## Install Chart + +This chart is not intended for standalone use; it's intended to be deployed via [Prometheus Federator](https://github.com/rancher/prometheus-federator). For a Prometheus Stack intended to be deployed standalone, please use [rancher-monitoring](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/) or the upstream [`kube-prometheus-stack`](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) project. + +## Dependencies + +This chart is designed to be deployed alongside an existing Prometheus Operator deployment in a cluster that has already installed the Prometheus Operator CRDs. Specifically, the chart is configured and intended to be deployed alongside [`rancher-monitoring`](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/), which deploys Prometheus Operator alongside a Cluster Prometheus that `rancher-project-monitoring` is configured to federate namespace-scoped metrics from by default. + +### Configuration + +Since this chart installs a project-scoped version of [`rancher-monitoring`](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/), a Helm chart based off of [`kube-prometheus-stack`](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack), most of the options that apply to either of those charts will apply to this chart (e.g. support for configuring persistent volumes, ingresses, etc.) and can be passed in as part of the `spec.values` of the ProjectHelmChart that deploys this chart; however, certain advanced functionality (such as Thanos support) and options that pose security risks in Project environments (e.g. ability to `ignoreNamespaceSelectors` or modify the existing namepaceSelectors of the Cluster Prometheus, ability to mount additional scrape configs, etc.) have been removed from the `values.yaml` of the chart. For more information on how to configure values and what they mean, please see the comments and options provided on the `values.yaml` packaged with this chart. + +## Further Information + +For more in-depth documentation of configuration options meanings, please see + +- [`rancher-monitoring`](https://rancher.com/docs/rancher/v2.6/en/monitoring-alerting/) +- [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator) +- [Prometheus](https://prometheus.io/docs/introduction/overview/) +- [Grafana](https://github.com/grafana/helm-charts/tree/main/charts/grafana#grafana-helm-chart) diff --git a/charts/rancher-project-monitoring/0.4.2/app-README.md b/charts/rancher-project-monitoring/0.4.2/app-README.md new file mode 100644 index 00000000..8d3f558e --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/app-README.md @@ -0,0 +1,10 @@ +# Rancher Project Monitoring and Alerting + +The chart installs a Project Monitoring Stack, which contains: +- [Prometheus](https://prometheus.io/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/) (managed externally by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator)) +- [Grafana](https://github.com/helm/charts/tree/master/stable/grafana) (deployed via an embedded Helm chart) +- Default PrometheusRules and Grafana dashboards based on the collection of community-curated resources from [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus/) +- Default ServiceMonitors that watch the deployed resources + +Note: This chart is not intended for standalone use; it's intended to be deployed via [Prometheus Federator](https://github.com/rancher/prometheus-federator). diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/.helmignore b/charts/rancher-project-monitoring/0.4.2/charts/grafana/.helmignore new file mode 100644 index 00000000..8cade131 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/.helmignore @@ -0,0 +1,23 @@ +# Patterns to ignore when building packages. +# This supports shell glob matching, relative path matching, and +# negation (prefixed with !). Only one pattern per line. +.DS_Store +# Common VCS dirs +.git/ +.gitignore +.bzr/ +.bzrignore +.hg/ +.hgignore +.svn/ +# Common backup files +*.swp +*.bak +*.tmp +*~ +# Various IDEs +.vscode +.project +.idea/ +*.tmproj +OWNERS diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/Chart.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/Chart.yaml new file mode 100644 index 00000000..ff6bcb26 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/Chart.yaml @@ -0,0 +1,39 @@ +annotations: + artifacthub.io/license: Apache-2.0 + artifacthub.io/links: | + - name: Chart Source + url: https://github.com/grafana/helm-charts + - name: Upstream Project + url: https://github.com/grafana/grafana + catalog.cattle.io/hidden: "true" + catalog.cattle.io/kube-version: '>= 1.26.0-0 < 1.31.0-0' + catalog.cattle.io/os: linux + catalog.rancher.io/certified: rancher + catalog.rancher.io/namespace: cattle-monitoring-system + catalog.rancher.io/release-name: rancher-grafana +apiVersion: v2 +appVersion: 10.4.1 +description: The leading tool for querying and visualizing time series and metrics. +home: https://grafana.com +icon: https://artifacthub.io/image/b4fed1a7-6c8f-4945-b99d-096efa3e4116 +keywords: +- monitoring +- metric +kubeVersion: '>=1.26.0-0' +maintainers: +- email: zanhsieh@gmail.com + name: zanhsieh +- email: rluckie@cisco.com + name: rtluckie +- email: maor.friedman@redhat.com + name: maorfr +- email: miroslav.hadzhiev@gmail.com + name: Xtigyro +- email: mail@torstenwalter.de + name: torstenwalter +name: grafana +sources: +- https://github.com/grafana/grafana +- https://github.com/grafana/helm-charts +type: application +version: 7.3.11 diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/README.md b/charts/rancher-project-monitoring/0.4.2/charts/grafana/README.md new file mode 100644 index 00000000..0ff07f29 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/README.md @@ -0,0 +1,770 @@ +# Grafana Helm Chart + +* Installs the web dashboarding system [Grafana](http://grafana.org/) + +## Get Repo Info + +```console +helm repo add grafana https://grafana.github.io/helm-charts +helm repo update +``` + +_See [helm repo](https://helm.sh/docs/helm/helm_repo/) for command documentation._ + +## Installing the Chart + +To install the chart with the release name `my-release`: + +```console +helm install my-release grafana/grafana +``` + +## Uninstalling the Chart + +To uninstall/delete the my-release deployment: + +```console +helm delete my-release +``` + +The command removes all the Kubernetes components associated with the chart and deletes the release. + +## Upgrading an existing Release to a new major version + +A major chart version change (like v1.2.3 -> v2.0.0) indicates that there is an +incompatible breaking change needing manual actions. + +### To 4.0.0 (And 3.12.1) + +This version requires Helm >= 2.12.0. + +### To 5.0.0 + +You have to add --force to your helm upgrade command as the labels of the chart have changed. + +### To 6.0.0 + +This version requires Helm >= 3.1.0. + +### To 7.0.0 + +For consistency with other Helm charts, the `global.image.registry` parameter was renamed +to `global.imageRegistry`. If you were not previously setting `global.image.registry`, no action +is required on upgrade. If you were previously setting `global.image.registry`, you will +need to instead set `global.imageRegistry`. + +## Configuration + +| Parameter | Description | Default | +|-------------------------------------------|-----------------------------------------------|---------------------------------------------------------| +| `replicas` | Number of nodes | `1` | +| `podDisruptionBudget.minAvailable` | Pod disruption minimum available | `nil` | +| `podDisruptionBudget.maxUnavailable` | Pod disruption maximum unavailable | `nil` | +| `podDisruptionBudget.apiVersion` | Pod disruption apiVersion | `nil` | +| `deploymentStrategy` | Deployment strategy | `{ "type": "RollingUpdate" }` | +| `livenessProbe` | Liveness Probe settings | `{ "httpGet": { "path": "/api/health", "port": 3000 } "initialDelaySeconds": 60, "timeoutSeconds": 30, "failureThreshold": 10 }` | +| `readinessProbe` | Readiness Probe settings | `{ "httpGet": { "path": "/api/health", "port": 3000 } }`| +| `securityContext` | Deployment securityContext | `{"runAsUser": 472, "runAsGroup": 472, "fsGroup": 472}` | +| `priorityClassName` | Name of Priority Class to assign pods | `nil` | +| `image.registry` | Image registry | `docker.io` | +| `image.repository` | Image repository | `grafana/grafana` | +| `image.tag` | Overrides the Grafana image tag whose default is the chart appVersion (`Must be >= 5.0.0`) | `` | +| `image.sha` | Image sha (optional) | `` | +| `image.pullPolicy` | Image pull policy | `IfNotPresent` | +| `image.pullSecrets` | Image pull secrets (can be templated) | `[]` | +| `service.enabled` | Enable grafana service | `true` | +| `service.type` | Kubernetes service type | `ClusterIP` | +| `service.port` | Kubernetes port where service is exposed | `80` | +| `service.portName` | Name of the port on the service | `service` | +| `service.appProtocol` | Adds the appProtocol field to the service | `` | +| `service.targetPort` | Internal service is port | `3000` | +| `service.nodePort` | Kubernetes service nodePort | `nil` | +| `service.annotations` | Service annotations (can be templated) | `{}` | +| `service.labels` | Custom labels | `{}` | +| `service.clusterIP` | internal cluster service IP | `nil` | +| `service.loadBalancerIP` | IP address to assign to load balancer (if supported) | `nil` | +| `service.loadBalancerSourceRanges` | list of IP CIDRs allowed access to lb (if supported) | `[]` | +| `service.externalIPs` | service external IP addresses | `[]` | +| `service.externalTrafficPolicy` | change the default externalTrafficPolicy | `nil` | +| `headlessService` | Create a headless service | `false` | +| `extraExposePorts` | Additional service ports for sidecar containers| `[]` | +| `hostAliases` | adds rules to the pod's /etc/hosts | `[]` | +| `ingress.enabled` | Enables Ingress | `false` | +| `ingress.annotations` | Ingress annotations (values are templated) | `{}` | +| `ingress.labels` | Custom labels | `{}` | +| `ingress.path` | Ingress accepted path | `/` | +| `ingress.pathType` | Ingress type of path | `Prefix` | +| `ingress.hosts` | Ingress accepted hostnames | `["chart-example.local"]` | +| `ingress.extraPaths` | Ingress extra paths to prepend to every host configuration. Useful when configuring [custom actions with AWS ALB Ingress Controller](https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.6/guide/ingress/annotations/#actions). Requires `ingress.hosts` to have one or more host entries. | `[]` | +| `ingress.tls` | Ingress TLS configuration | `[]` | +| `ingress.ingressClassName` | Ingress Class Name. MAY be required for Kubernetes versions >= 1.18 | `""` | +| `resources` | CPU/Memory resource requests/limits | `{}` | +| `nodeSelector` | Node labels for pod assignment | `{}` | +| `tolerations` | Toleration labels for pod assignment | `[]` | +| `affinity` | Affinity settings for pod assignment | `{}` | +| `extraInitContainers` | Init containers to add to the grafana pod | `{}` | +| `extraContainers` | Sidecar containers to add to the grafana pod | `""` | +| `extraContainerVolumes` | Volumes that can be mounted in sidecar containers | `[]` | +| `extraLabels` | Custom labels for all manifests | `{}` | +| `schedulerName` | Name of the k8s scheduler (other than default) | `nil` | +| `persistence.enabled` | Use persistent volume to store data | `false` | +| `persistence.type` | Type of persistence (`pvc` or `statefulset`) | `pvc` | +| `persistence.size` | Size of persistent volume claim | `10Gi` | +| `persistence.existingClaim` | Use an existing PVC to persist data (can be templated) | `nil` | +| `persistence.storageClassName` | Type of persistent volume claim | `nil` | +| `persistence.accessModes` | Persistence access modes | `[ReadWriteOnce]` | +| `persistence.annotations` | PersistentVolumeClaim annotations | `{}` | +| `persistence.finalizers` | PersistentVolumeClaim finalizers | `[ "kubernetes.io/pvc-protection" ]` | +| `persistence.extraPvcLabels` | Extra labels to apply to a PVC. | `{}` | +| `persistence.subPath` | Mount a sub dir of the persistent volume (can be templated) | `nil` | +| `persistence.inMemory.enabled` | If persistence is not enabled, whether to mount the local storage in-memory to improve performance | `false` | +| `persistence.inMemory.sizeLimit` | SizeLimit for the in-memory local storage | `nil` | +| `initChownData.enabled` | If false, don't reset data ownership at startup | true | +| `initChownData.image.registry` | init-chown-data container image registry | `docker.io` | +| `initChownData.image.repository` | init-chown-data container image repository | `busybox` | +| `initChownData.image.tag` | init-chown-data container image tag | `1.31.1` | +| `initChownData.image.sha` | init-chown-data container image sha (optional)| `""` | +| `initChownData.image.pullPolicy` | init-chown-data container image pull policy | `IfNotPresent` | +| `initChownData.resources` | init-chown-data pod resource requests & limits | `{}` | +| `schedulerName` | Alternate scheduler name | `nil` | +| `env` | Extra environment variables passed to pods | `{}` | +| `envValueFrom` | Environment variables from alternate sources. See the API docs on [EnvVarSource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#envvarsource-v1-core) for format details. Can be templated | `{}` | +| `envFromSecret` | Name of a Kubernetes secret (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | `""` | +| `envFromSecrets` | List of Kubernetes secrets (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | `[]` | +| `envFromConfigMaps` | List of Kubernetes ConfigMaps (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | `[]` | +| `envRenderSecret` | Sensible environment variables passed to pods and stored as secret. (passed through [tpl](https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function)) | `{}` | +| `enableServiceLinks` | Inject Kubernetes services as environment variables. | `true` | +| `extraSecretMounts` | Additional grafana server secret mounts | `[]` | +| `extraVolumeMounts` | Additional grafana server volume mounts | `[]` | +| `extraVolumes` | Additional Grafana server volumes | `[]` | +| `automountServiceAccountToken` | Mounted the service account token on the grafana pod. Mandatory, if sidecars are enabled | `true` | +| `createConfigmap` | Enable creating the grafana configmap | `true` | +| `extraConfigmapMounts` | Additional grafana server configMap volume mounts (values are templated) | `[]` | +| `extraEmptyDirMounts` | Additional grafana server emptyDir volume mounts | `[]` | +| `plugins` | Plugins to be loaded along with Grafana | `[]` | +| `datasources` | Configure grafana datasources (passed through tpl) | `{}` | +| `alerting` | Configure grafana alerting (passed through tpl) | `{}` | +| `notifiers` | Configure grafana notifiers | `{}` | +| `dashboardProviders` | Configure grafana dashboard providers | `{}` | +| `dashboards` | Dashboards to import | `{}` | +| `dashboardsConfigMaps` | ConfigMaps reference that contains dashboards | `{}` | +| `grafana.ini` | Grafana's primary configuration | `{}` | +| `global.imageRegistry` | Global image pull registry for all images. | `null` | +| `global.imagePullSecrets` | Global image pull secrets (can be templated). Allows either an array of {name: pullSecret} maps (k8s-style), or an array of strings (more common helm-style). | `[]` | +| `ldap.enabled` | Enable LDAP authentication | `false` | +| `ldap.existingSecret` | The name of an existing secret containing the `ldap.toml` file, this must have the key `ldap-toml`. | `""` | +| `ldap.config` | Grafana's LDAP configuration | `""` | +| `annotations` | Deployment annotations | `{}` | +| `labels` | Deployment labels | `{}` | +| `podAnnotations` | Pod annotations | `{}` | +| `podLabels` | Pod labels | `{}` | +| `podPortName` | Name of the grafana port on the pod | `grafana` | +| `lifecycleHooks` | Lifecycle hooks for podStart and preStop [Example](https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/#define-poststart-and-prestop-handlers) | `{}` | +| `sidecar.image.registry` | Sidecar image registry | `quay.io` | +| `sidecar.image.repository` | Sidecar image repository | `kiwigrid/k8s-sidecar` | +| `sidecar.image.tag` | Sidecar image tag | `1.26.0` | +| `sidecar.image.sha` | Sidecar image sha (optional) | `""` | +| `sidecar.imagePullPolicy` | Sidecar image pull policy | `IfNotPresent` | +| `sidecar.resources` | Sidecar resources | `{}` | +| `sidecar.securityContext` | Sidecar securityContext | `{}` | +| `sidecar.enableUniqueFilenames` | Sets the kiwigrid/k8s-sidecar UNIQUE_FILENAMES environment variable. If set to `true` the sidecar will create unique filenames where duplicate data keys exist between ConfigMaps and/or Secrets within the same or multiple Namespaces. | `false` | +| `sidecar.alerts.enabled` | Enables the cluster wide search for alerts and adds/updates/deletes them in grafana |`false` | +| `sidecar.alerts.label` | Label that config maps with alerts should have to be added | `grafana_alert` | +| `sidecar.alerts.labelValue` | Label value that config maps with alerts should have to be added | `""` | +| `sidecar.alerts.searchNamespace` | Namespaces list. If specified, the sidecar will search for alerts config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It's also possible to specify ALL to search in all namespaces. | `nil` | +| `sidecar.alerts.watchMethod` | Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. | `WATCH` | +| `sidecar.alerts.resource` | Should the sidecar looks into secrets, configmaps or both. | `both` | +| `sidecar.alerts.reloadURL` | Full url of datasource configuration reload API endpoint, to invoke after a config-map change | `"http://localhost:3000/api/admin/provisioning/alerting/reload"` | +| `sidecar.alerts.skipReload` | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | `false` | +| `sidecar.alerts.initAlerts` | Set to true to deploy the alerts sidecar as an initContainer. This is needed if skipReload is true, to load any alerts defined at startup time. | `false` | +| `sidecar.alerts.extraMounts` | Additional alerts sidecar volume mounts. | `[]` | +| `sidecar.dashboards.enabled` | Enables the cluster wide search for dashboards and adds/updates/deletes them in grafana | `false` | +| `sidecar.dashboards.SCProvider` | Enables creation of sidecar provider | `true` | +| `sidecar.dashboards.provider.name` | Unique name of the grafana provider | `sidecarProvider` | +| `sidecar.dashboards.provider.orgid` | Id of the organisation, to which the dashboards should be added | `1` | +| `sidecar.dashboards.provider.folder` | Logical folder in which grafana groups dashboards | `""` | +| `sidecar.dashboards.provider.disableDelete` | Activate to avoid the deletion of imported dashboards | `false` | +| `sidecar.dashboards.provider.allowUiUpdates` | Allow updating provisioned dashboards from the UI | `false` | +| `sidecar.dashboards.provider.type` | Provider type | `file` | +| `sidecar.dashboards.provider.foldersFromFilesStructure` | Allow Grafana to replicate dashboard structure from filesystem. | `false` | +| `sidecar.dashboards.watchMethod` | Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. | `WATCH` | +| `sidecar.skipTlsVerify` | Set to true to skip tls verification for kube api calls | `nil` | +| `sidecar.dashboards.label` | Label that config maps with dashboards should have to be added | `grafana_dashboard` | +| `sidecar.dashboards.labelValue` | Label value that config maps with dashboards should have to be added | `""` | +| `sidecar.dashboards.folder` | Folder in the pod that should hold the collected dashboards (unless `sidecar.dashboards.defaultFolderName` is set). This path will be mounted. | `/tmp/dashboards` | +| `sidecar.dashboards.folderAnnotation` | The annotation the sidecar will look for in configmaps to override the destination folder for files | `nil` | +| `sidecar.dashboards.defaultFolderName` | The default folder name, it will create a subfolder under the `sidecar.dashboards.folder` and put dashboards in there instead | `nil` | +| `sidecar.dashboards.searchNamespace` | Namespaces list. If specified, the sidecar will search for dashboards config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It's also possible to specify ALL to search in all namespaces. | `nil` | +| `sidecar.dashboards.script` | Absolute path to shell script to execute after a configmap got reloaded. | `nil` | +| `sidecar.dashboards.reloadURL` | Full url of dashboards configuration reload API endpoint, to invoke after a config-map change | `"http://localhost:3000/api/admin/provisioning/dashboards/reload"` | +| `sidecar.dashboards.skipReload` | Enabling this omits defining the REQ_USERNAME, REQ_PASSWORD, REQ_URL and REQ_METHOD environment variables | `false` | +| `sidecar.dashboards.resource` | Should the sidecar looks into secrets, configmaps or both. | `both` | +| `sidecar.dashboards.extraMounts` | Additional dashboard sidecar volume mounts. | `[]` | +| `sidecar.datasources.enabled` | Enables the cluster wide search for datasources and adds/updates/deletes them in grafana |`false` | +| `sidecar.datasources.label` | Label that config maps with datasources should have to be added | `grafana_datasource` | +| `sidecar.datasources.labelValue` | Label value that config maps with datasources should have to be added | `""` | +| `sidecar.datasources.searchNamespace` | Namespaces list. If specified, the sidecar will search for datasources config-maps inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It's also possible to specify ALL to search in all namespaces. | `nil` | +| `sidecar.datasources.watchMethod` | Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. | `WATCH` | +| `sidecar.datasources.resource` | Should the sidecar looks into secrets, configmaps or both. | `both` | +| `sidecar.datasources.reloadURL` | Full url of datasource configuration reload API endpoint, to invoke after a config-map change | `"http://localhost:3000/api/admin/provisioning/datasources/reload"` | +| `sidecar.datasources.skipReload` | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | `false` | +| `sidecar.datasources.initDatasources` | Set to true to deploy the datasource sidecar as an initContainer in addition to a container. This is needed if skipReload is true, to load any datasources defined at startup time. | `false` | +| `sidecar.notifiers.enabled` | Enables the cluster wide search for notifiers and adds/updates/deletes them in grafana | `false` | +| `sidecar.notifiers.label` | Label that config maps with notifiers should have to be added | `grafana_notifier` | +| `sidecar.notifiers.labelValue` | Label value that config maps with notifiers should have to be added | `""` | +| `sidecar.notifiers.searchNamespace` | Namespaces list. If specified, the sidecar will search for notifiers config-maps (or secrets) inside these namespaces. Otherwise the namespace in which the sidecar is running will be used. It's also possible to specify ALL to search in all namespaces. | `nil` | +| `sidecar.notifiers.watchMethod` | Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. | `WATCH` | +| `sidecar.notifiers.resource` | Should the sidecar looks into secrets, configmaps or both. | `both` | +| `sidecar.notifiers.reloadURL` | Full url of notifier configuration reload API endpoint, to invoke after a config-map change | `"http://localhost:3000/api/admin/provisioning/notifications/reload"` | +| `sidecar.notifiers.skipReload` | Enabling this omits defining the REQ_URL and REQ_METHOD environment variables | `false` | +| `sidecar.notifiers.initNotifiers` | Set to true to deploy the notifier sidecar as an initContainer in addition to a container. This is needed if skipReload is true, to load any notifiers defined at startup time. | `false` | +| `smtp.existingSecret` | The name of an existing secret containing the SMTP credentials. | `""` | +| `smtp.userKey` | The key in the existing SMTP secret containing the username. | `"user"` | +| `smtp.passwordKey` | The key in the existing SMTP secret containing the password. | `"password"` | +| `admin.existingSecret` | The name of an existing secret containing the admin credentials (can be templated). | `""` | +| `admin.userKey` | The key in the existing admin secret containing the username. | `"admin-user"` | +| `admin.passwordKey` | The key in the existing admin secret containing the password. | `"admin-password"` | +| `serviceAccount.automountServiceAccountToken` | Automount the service account token on all pods where is service account is used | `false` | +| `serviceAccount.annotations` | ServiceAccount annotations | | +| `serviceAccount.create` | Create service account | `true` | +| `serviceAccount.labels` | ServiceAccount labels | `{}` | +| `serviceAccount.name` | Service account name to use, when empty will be set to created account if `serviceAccount.create` is set else to `default` | `` | +| `serviceAccount.nameTest` | Service account name to use for test, when empty will be set to created account if `serviceAccount.create` is set else to `default` | `nil` | +| `rbac.create` | Create and use RBAC resources | `true` | +| `rbac.namespaced` | Creates Role and Rolebinding instead of the default ClusterRole and ClusteRoleBindings for the grafana instance | `false` | +| `rbac.useExistingRole` | Set to a rolename to use existing role - skipping role creating - but still doing serviceaccount and rolebinding to the rolename set here. | `nil` | +| `rbac.pspEnabled` | Create PodSecurityPolicy (with `rbac.create`, grant roles permissions as well) | `false` | +| `rbac.pspUseAppArmor` | Enforce AppArmor in created PodSecurityPolicy (requires `rbac.pspEnabled`) | `false` | +| `rbac.extraRoleRules` | Additional rules to add to the Role | [] | +| `rbac.extraClusterRoleRules` | Additional rules to add to the ClusterRole | [] | +| `command` | Define command to be executed by grafana container at startup | `nil` | +| `args` | Define additional args if command is used | `nil` | +| `testFramework.enabled` | Whether to create test-related resources | `true` | +| `testFramework.image.registry` | `test-framework` image registry. | `docker.io` | +| `testFramework.image.repository` | `test-framework` image repository. | `bats/bats` | +| `testFramework.image.tag` | `test-framework` image tag. | `v1.4.1` | +| `testFramework.imagePullPolicy` | `test-framework` image pull policy. | `IfNotPresent` | +| `testFramework.securityContext` | `test-framework` securityContext | `{}` | +| `downloadDashboards.env` | Environment variables to be passed to the `download-dashboards` container | `{}` | +| `downloadDashboards.envFromSecret` | Name of a Kubernetes secret (must be manually created in the same namespace) containing values to be added to the environment. Can be templated | `""` | +| `downloadDashboards.resources` | Resources of `download-dashboards` container | `{}` | +| `downloadDashboardsImage.registry` | Curl docker image registry | `docker.io` | +| `downloadDashboardsImage.repository` | Curl docker image repository | `curlimages/curl` | +| `downloadDashboardsImage.tag` | Curl docker image tag | `7.73.0` | +| `downloadDashboardsImage.sha` | Curl docker image sha (optional) | `""` | +| `downloadDashboardsImage.pullPolicy` | Curl docker image pull policy | `IfNotPresent` | +| `namespaceOverride` | Override the deployment namespace | `""` (`Release.Namespace`) | +| `serviceMonitor.enabled` | Use servicemonitor from prometheus operator | `false` | +| `serviceMonitor.namespace` | Namespace this servicemonitor is installed in | | +| `serviceMonitor.interval` | How frequently Prometheus should scrape | `1m` | +| `serviceMonitor.path` | Path to scrape | `/metrics` | +| `serviceMonitor.scheme` | Scheme to use for metrics scraping | `http` | +| `serviceMonitor.tlsConfig` | TLS configuration block for the endpoint | `{}` | +| `serviceMonitor.labels` | Labels for the servicemonitor passed to Prometheus Operator | `{}` | +| `serviceMonitor.scrapeTimeout` | Timeout after which the scrape is ended | `30s` | +| `serviceMonitor.relabelings` | RelabelConfigs to apply to samples before scraping. | `[]` | +| `serviceMonitor.metricRelabelings` | MetricRelabelConfigs to apply to samples before ingestion. | `[]` | +| `revisionHistoryLimit` | Number of old ReplicaSets to retain | `10` | +| `imageRenderer.enabled` | Enable the image-renderer deployment & service | `false` | +| `imageRenderer.image.registry` | image-renderer Image registry | `docker.io` | +| `imageRenderer.image.repository` | image-renderer Image repository | `grafana/grafana-image-renderer` | +| `imageRenderer.image.tag` | image-renderer Image tag | `latest` | +| `imageRenderer.image.sha` | image-renderer Image sha (optional) | `""` | +| `imageRenderer.image.pullPolicy` | image-renderer ImagePullPolicy | `Always` | +| `imageRenderer.env` | extra env-vars for image-renderer | `{}` | +| `imageRenderer.envValueFrom` | Environment variables for image-renderer from alternate sources. See the API docs on [EnvVarSource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.17/#envvarsource-v1-core) for format details. Can be templated | `{}` | +| `imageRenderer.serviceAccountName` | image-renderer deployment serviceAccountName | `""` | +| `imageRenderer.securityContext` | image-renderer deployment securityContext | `{}` | +| `imageRenderer.podAnnotations ` | image-renderer image-renderer pod annotation | `{}` | +| `imageRenderer.hostAliases` | image-renderer deployment Host Aliases | `[]` | +| `imageRenderer.priorityClassName` | image-renderer deployment priority class | `''` | +| `imageRenderer.service.enabled` | Enable the image-renderer service | `true` | +| `imageRenderer.service.portName` | image-renderer service port name | `http` | +| `imageRenderer.service.port` | image-renderer port used by deployment | `8081` | +| `imageRenderer.service.targetPort` | image-renderer service port used by service | `8081` | +| `imageRenderer.appProtocol` | Adds the appProtocol field to the service | `` | +| `imageRenderer.grafanaSubPath` | Grafana sub path to use for image renderer callback url | `''` | +| `imageRenderer.podPortName` | name of the image-renderer port on the pod | `http` | +| `imageRenderer.revisionHistoryLimit` | number of image-renderer replica sets to keep | `10` | +| `imageRenderer.networkPolicy.limitIngress` | Enable a NetworkPolicy to limit inbound traffic from only the created grafana pods | `true` | +| `imageRenderer.networkPolicy.limitEgress` | Enable a NetworkPolicy to limit outbound traffic to only the created grafana pods | `false` | +| `imageRenderer.resources` | Set resource limits for image-renderer pods | `{}` | +| `imageRenderer.nodeSelector` | Node labels for pod assignment | `{}` | +| `imageRenderer.tolerations` | Toleration labels for pod assignment | `[]` | +| `imageRenderer.affinity` | Affinity settings for pod assignment | `{}` | +| `networkPolicy.enabled` | Enable creation of NetworkPolicy resources. | `false` | +| `networkPolicy.allowExternal` | Don't require client label for connections | `true` | +| `networkPolicy.explicitNamespacesSelector` | A Kubernetes LabelSelector to explicitly select namespaces from which traffic could be allowed | `{}` | +| `networkPolicy.ingress` | Enable the creation of an ingress network policy | `true` | +| `networkPolicy.egress.enabled` | Enable the creation of an egress network policy | `false` | +| `networkPolicy.egress.ports` | An array of ports to allow for the egress | `[]` | +| `enableKubeBackwardCompatibility` | Enable backward compatibility of kubernetes where pod's defintion version below 1.13 doesn't have the enableServiceLinks option | `false` | + +### Example ingress with path + +With grafana 6.3 and above + +```yaml +grafana.ini: + server: + domain: monitoring.example.com + root_url: "%(protocol)s://%(domain)s/grafana" + serve_from_sub_path: true +ingress: + enabled: true + hosts: + - "monitoring.example.com" + path: "/grafana" +``` + +### Example of extraVolumeMounts and extraVolumes + +Configure additional volumes with `extraVolumes` and volume mounts with `extraVolumeMounts`. + +Example for `extraVolumeMounts` and corresponding `extraVolumes`: + +```yaml +extraVolumeMounts: + - name: plugins + mountPath: /var/lib/grafana/plugins + subPath: configs/grafana/plugins + readOnly: false + - name: dashboards + mountPath: /var/lib/grafana/dashboards + hostPath: /usr/shared/grafana/dashboards + readOnly: false + +extraVolumes: + - name: plugins + existingClaim: existing-grafana-claim + - name: dashboards + hostPath: /usr/shared/grafana/dashboards +``` + +Volumes default to `emptyDir`. Set to `persistentVolumeClaim`, +`hostPath`, `csi`, or `configMap` for other types. For a +`persistentVolumeClaim`, specify an existing claim name with +`existingClaim`. + +## Import dashboards + +There are a few methods to import dashboards to Grafana. Below are some examples and explanations as to how to use each method: + +```yaml +dashboards: + default: + some-dashboard: + json: | + { + "annotations": + + ... + # Complete json file here + ... + + "title": "Some Dashboard", + "uid": "abcd1234", + "version": 1 + } + custom-dashboard: + # This is a path to a file inside the dashboards directory inside the chart directory + file: dashboards/custom-dashboard.json + prometheus-stats: + # Ref: https://grafana.com/dashboards/2 + gnetId: 2 + revision: 2 + datasource: Prometheus + loki-dashboard-quick-search: + gnetId: 12019 + revision: 2 + datasource: + - name: DS_PROMETHEUS + value: Prometheus + - name: DS_LOKI + value: Loki + local-dashboard: + url: https://raw.githubusercontent.com/user/repository/master/dashboards/dashboard.json +``` + +## BASE64 dashboards + +Dashboards could be stored on a server that does not return JSON directly and instead of it returns a Base64 encoded file (e.g. Gerrit) +A new parameter has been added to the url use case so if you specify a b64content value equals to true after the url entry a Base64 decoding is applied before save the file to disk. +If this entry is not set or is equals to false not decoding is applied to the file before saving it to disk. + +### Gerrit use case + +Gerrit API for download files has the following schema: where {project-name} and +{file-id} usually has '/' in their values and so they MUST be replaced by %2F so if project-name is user/repo, branch-id is master and file-id is equals to dir1/dir2/dashboard +the url value is + +## Sidecar for dashboards + +If the parameter `sidecar.dashboards.enabled` is set, a sidecar container is deployed in the grafana +pod. This container watches all configmaps (or secrets) in the cluster and filters out the ones with +a label as defined in `sidecar.dashboards.label`. The files defined in those configmaps are written +to a folder and accessed by grafana. Changes to the configmaps are monitored and the imported +dashboards are deleted/updated. + +A recommendation is to use one configmap per dashboard, as a reduction of multiple dashboards inside +one configmap is currently not properly mirrored in grafana. + +Example dashboard config: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: sample-grafana-dashboard + labels: + grafana_dashboard: "1" +data: + k8s-dashboard.json: |- + [...] +``` + +## Sidecar for datasources + +If the parameter `sidecar.datasources.enabled` is set, an init container is deployed in the grafana +pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and +filters out the ones with a label as defined in `sidecar.datasources.label`. The files defined in +those secrets are written to a folder and accessed by grafana on startup. Using these yaml files, +the data sources in grafana can be imported. + +Should you aim for reloading datasources in Grafana each time the config is changed, set `sidecar.datasources.skipReload: false` and adjust `sidecar.datasources.reloadURL` to `http://..svc.cluster.local/api/admin/provisioning/datasources/reload`. + +Secrets are recommended over configmaps for this usecase because datasources usually contain private +data like usernames and passwords. Secrets are the more appropriate cluster resource to manage those. + +Example values to add a postgres datasource as a kubernetes secret: +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: grafana-datasources + labels: + grafana_datasource: 'true' # default value for: sidecar.datasources.label +stringData: + pg-db.yaml: |- + apiVersion: 1 + datasources: + - name: My pg db datasource + type: postgres + url: my-postgresql-db:5432 + user: db-readonly-user + secureJsonData: + password: 'SUperSEcretPa$$word' + jsonData: + database: my_datase + sslmode: 'disable' # disable/require/verify-ca/verify-full + maxOpenConns: 0 # Grafana v5.4+ + maxIdleConns: 2 # Grafana v5.4+ + connMaxLifetime: 14400 # Grafana v5.4+ + postgresVersion: 1000 # 903=9.3, 904=9.4, 905=9.5, 906=9.6, 1000=10 + timescaledb: false + # allow users to edit datasources from the UI. + editable: false +``` + +Example values to add a datasource adapted from [Grafana](http://docs.grafana.org/administration/provisioning/#example-datasource-config-file): + +```yaml +datasources: + datasources.yaml: + apiVersion: 1 + datasources: + # name of the datasource. Required + - name: Graphite + # datasource type. Required + type: graphite + # access mode. proxy or direct (Server or Browser in the UI). Required + access: proxy + # org id. will default to orgId 1 if not specified + orgId: 1 + # url + url: http://localhost:8080 + # database password, if used + password: + # database user, if used + user: + # database name, if used + database: + # enable/disable basic auth + basicAuth: + # basic auth username + basicAuthUser: + # basic auth password + basicAuthPassword: + # enable/disable with credentials headers + withCredentials: + # mark as default datasource. Max one per org + isDefault: + # fields that will be converted to json and stored in json_data + jsonData: + graphiteVersion: "1.1" + tlsAuth: true + tlsAuthWithCACert: true + # json object of data that will be encrypted. + secureJsonData: + tlsCACert: "..." + tlsClientCert: "..." + tlsClientKey: "..." + version: 1 + # allow users to edit datasources from the UI. + editable: false +``` + +## Sidecar for notifiers + +If the parameter `sidecar.notifiers.enabled` is set, an init container is deployed in the grafana +pod. This container lists all secrets (or configmaps, though not recommended) in the cluster and +filters out the ones with a label as defined in `sidecar.notifiers.label`. The files defined in +those secrets are written to a folder and accessed by grafana on startup. Using these yaml files, +the notification channels in grafana can be imported. The secrets must be created before +`helm install` so that the notifiers init container can list the secrets. + +Secrets are recommended over configmaps for this usecase because alert notification channels usually contain +private data like SMTP usernames and passwords. Secrets are the more appropriate cluster resource to manage those. + +Example datasource config adapted from [Grafana](https://grafana.com/docs/grafana/latest/administration/provisioning/#alert-notification-channels): + +```yaml +notifiers: + - name: notification-channel-1 + type: slack + uid: notifier1 + # either + org_id: 2 + # or + org_name: Main Org. + is_default: true + send_reminder: true + frequency: 1h + disable_resolve_message: false + # See `Supported Settings` section for settings supporter for each + # alert notification type. + settings: + recipient: 'XXX' + token: 'xoxb' + uploadImage: true + url: https://slack.com + +delete_notifiers: + - name: notification-channel-1 + uid: notifier1 + org_id: 2 + - name: notification-channel-2 + # default org_id: 1 +``` + +## Sidecar for alerting resources + +If the parameter `sidecar.alerts.enabled` is set, a sidecar container is deployed in the grafana +pod. This container watches all configmaps (or secrets) in the cluster (namespace defined by `sidecar.alerts.searchNamespace`) and filters out the ones with +a label as defined in `sidecar.alerts.label` (default is `grafana_alert`). The files defined in those configmaps are written +to a folder and accessed by grafana. Changes to the configmaps are monitored and the imported alerting resources are updated, however, deletions are a little more complicated (see below). + +This sidecar can be used to provision alert rules, contact points, notification policies, notification templates and mute timings as shown in [Grafana Documentation](https://grafana.com/docs/grafana/next/alerting/set-up/provision-alerting-resources/file-provisioning/). + +To fetch the alert config which will be provisioned, use the alert provisioning API ([Grafana Documentation](https://grafana.com/docs/grafana/next/developers/http_api/alerting_provisioning/)). +You can use either JSON or YAML format. + +Example config for an alert rule: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: sample-grafana-alert + labels: + grafana_alert: "1" +data: + k8s-alert.yml: |- + apiVersion: 1 + groups: + - orgId: 1 + name: k8s-alert + [...] +``` + +To delete provisioned alert rules is a two step process, you need to delete the configmap which defined the alert rule +and then create a configuration which deletes the alert rule. + +Example deletion configuration: +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: delete-sample-grafana-alert + namespace: monitoring + labels: + grafana_alert: "1" +data: + delete-k8s-alert.yml: |- + apiVersion: 1 + deleteRules: + - orgId: 1 + uid: 16624780-6564-45dc-825c-8bded4ad92d3 +``` + +## Statically provision alerting resources +If you don't need to change alerting resources (alert rules, contact points, notification policies and notification templates) regularly you could use the `alerting` config option instead of the sidecar option above. +This will grab the alerting config and apply it statically at build time for the helm file. + +There are two methods to statically provision alerting configuration in Grafana. Below are some examples and explanations as to how to use each method: + +```yaml +alerting: + team1-alert-rules.yaml: + file: alerting/team1/rules.yaml + team2-alert-rules.yaml: + file: alerting/team2/rules.yaml + team3-alert-rules.yaml: + file: alerting/team3/rules.yaml + notification-policies.yaml: + file: alerting/shared/notification-policies.yaml + notification-templates.yaml: + file: alerting/shared/notification-templates.yaml + contactpoints.yaml: + apiVersion: 1 + contactPoints: + - orgId: 1 + name: Slack channel + receivers: + - uid: default-receiver + type: slack + settings: + # Webhook URL to be filled in + url: "" + # We need to escape double curly braces for the tpl function. + text: '{{ `{{ template "default.message" . }}` }}' + title: '{{ `{{ template "default.title" . }}` }}' +``` + +The two possibilities for static alerting resource provisioning are: + +* Inlining the file contents as shown for contact points in the above example. +* Importing a file using a relative path starting from the chart root directory as shown for the alert rules in the above example. + +### Important notes on file provisioning + +* The format of the files is defined in the [Grafana documentation](https://grafana.com/docs/grafana/next/alerting/set-up/provision-alerting-resources/file-provisioning/) on file provisioning. +* The chart supports importing YAML and JSON files. +* The filename must be unique, otherwise one volume mount will overwrite the other. +* In case of inlining, double curly braces that arise from the Grafana configuration format and are not intended as templates for the chart must be escaped. +* The number of total files under `alerting:` is not limited. Each file will end up as a volume mount in the corresponding provisioning folder of the deployed Grafana instance. +* The file size for each import is limited by what the function `.Files.Get` can handle, which suffices for most cases. + +## How to serve Grafana with a path prefix (/grafana) + +In order to serve Grafana with a prefix (e.g., ), add the following to your values.yaml. + +```yaml +ingress: + enabled: true + annotations: + kubernetes.io/ingress.class: "nginx" + nginx.ingress.kubernetes.io/rewrite-target: /$1 + nginx.ingress.kubernetes.io/use-regex: "true" + + path: /grafana/?(.*) + hosts: + - k8s.example.dev + +grafana.ini: + server: + root_url: http://localhost:3000/grafana # this host can be localhost +``` + +## How to securely reference secrets in grafana.ini + +This example uses Grafana [file providers](https://grafana.com/docs/grafana/latest/administration/configuration/#file-provider) for secret values and the `extraSecretMounts` configuration flag (Additional grafana server secret mounts) to mount the secrets. + +In grafana.ini: + +```yaml +grafana.ini: + [auth.generic_oauth] + enabled = true + client_id = $__file{/etc/secrets/auth_generic_oauth/client_id} + client_secret = $__file{/etc/secrets/auth_generic_oauth/client_secret} +``` + +Existing secret, or created along with helm: + +```yaml +--- +apiVersion: v1 +kind: Secret +metadata: + name: auth-generic-oauth-secret +type: Opaque +stringData: + client_id: + client_secret: +``` + +Include in the `extraSecretMounts` configuration flag: + +```yaml +- extraSecretMounts: + - name: auth-generic-oauth-secret-mount + secretName: auth-generic-oauth-secret + defaultMode: 0440 + mountPath: /etc/secrets/auth_generic_oauth + readOnly: true +``` + +### extraSecretMounts using a Container Storage Interface (CSI) provider + +This example uses a CSI driver e.g. retrieving secrets using [Azure Key Vault Provider](https://github.com/Azure/secrets-store-csi-driver-provider-azure) + +```yaml +- extraSecretMounts: + - name: secrets-store-inline + mountPath: /run/secrets + readOnly: true + csi: + driver: secrets-store.csi.k8s.io + readOnly: true + volumeAttributes: + secretProviderClass: "my-provider" + nodePublishSecretRef: + name: akv-creds +``` + +## Image Renderer Plug-In + +This chart supports enabling [remote image rendering](https://github.com/grafana/grafana-image-renderer/blob/master/README.md#run-in-docker) + +```yaml +imageRenderer: + enabled: true +``` + +### Image Renderer NetworkPolicy + +By default the image-renderer pods will have a network policy which only allows ingress traffic from the created grafana instance + +### High Availability for unified alerting + +If you want to run Grafana in a high availability cluster you need to enable +the headless service by setting `headlessService: true` in your `values.yaml` +file. + +As next step you have to setup the `grafana.ini` in your `values.yaml` in a way +that it will make use of the headless service to obtain all the IPs of the +cluster. You should replace ``{{ Name }}`` with the name of your helm deployment. + +```yaml +grafana.ini: + ... + unified_alerting: + enabled: true + ha_peers: {{ Name }}-headless:9094 + ha_listen_address: ${POD_IP}:9094 + ha_advertise_address: ${POD_IP}:9094 + + alerting: + enabled: false +``` diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/dashboards/custom-dashboard.json b/charts/rancher-project-monitoring/0.4.2/charts/grafana/dashboards/custom-dashboard.json new file mode 100644 index 00000000..9e26dfee --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/dashboards/custom-dashboard.json @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/NOTES.txt b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/NOTES.txt new file mode 100644 index 00000000..d86419fe --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/NOTES.txt @@ -0,0 +1,55 @@ +1. Get your '{{ .Values.adminUser }}' user password by running: + + kubectl get secret --namespace {{ include "grafana.namespace" . }} {{ .Values.admin.existingSecret | default (include "grafana.fullname" .) }} -o jsonpath="{.data.{{ .Values.admin.passwordKey | default "admin-password" }}}" | base64 --decode ; echo + + +2. The Grafana server can be accessed via port {{ .Values.service.port }} on the following DNS name from within your cluster: + + {{ include "grafana.fullname" . }}.{{ include "grafana.namespace" . }}.svc.cluster.local +{{ if .Values.ingress.enabled }} + If you bind grafana to 80, please update values in values.yaml and reinstall: + ``` + securityContext: + runAsUser: 0 + runAsGroup: 0 + fsGroup: 0 + + command: + - "setcap" + - "'cap_net_bind_service=+ep'" + - "/usr/sbin/grafana-server &&" + - "sh" + - "/run.sh" + ``` + Details refer to https://grafana.com/docs/installation/configuration/#http-port. + Or grafana would always crash. + + From outside the cluster, the server URL(s) are: + {{- range .Values.ingress.hosts }} + http://{{ . }} + {{- end }} +{{- else }} + Get the Grafana URL to visit by running these commands in the same shell: + {{- if contains "NodePort" .Values.service.type }} + export NODE_PORT=$(kubectl get --namespace {{ include "grafana.namespace" . }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "grafana.fullname" . }}) + export NODE_IP=$(kubectl get nodes --namespace {{ include "grafana.namespace" . }} -o jsonpath="{.items[0].status.addresses[0].address}") + echo http://$NODE_IP:$NODE_PORT + {{- else if contains "LoadBalancer" .Values.service.type }} + NOTE: It may take a few minutes for the LoadBalancer IP to be available. + You can watch the status of by running 'kubectl get svc --namespace {{ include "grafana.namespace" . }} -w {{ include "grafana.fullname" . }}' + export SERVICE_IP=$(kubectl get svc --namespace {{ include "grafana.namespace" . }} {{ include "grafana.fullname" . }} -o jsonpath='{.status.loadBalancer.ingress[0].ip}') + http://$SERVICE_IP:{{ .Values.service.port -}} + {{- else if contains "ClusterIP" .Values.service.type }} + export POD_NAME=$(kubectl get pods --namespace {{ include "grafana.namespace" . }} -l "app.kubernetes.io/name={{ include "grafana.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}") + kubectl --namespace {{ include "grafana.namespace" . }} port-forward $POD_NAME 3000 + {{- end }} +{{- end }} + +3. Login with the password from step 1 and the username: {{ .Values.adminUser }} + +{{- if not .Values.persistence.enabled }} +################################################################################# +###### WARNING: Persistence is disabled!!! You will lose your data when ##### +###### the Grafana pod is terminated. ##### +################################################################################# +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_config.tpl b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_config.tpl new file mode 100644 index 00000000..19df19cd --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_config.tpl @@ -0,0 +1,171 @@ +{{/* + Generate config map data + */}} +{{- define "grafana.configData" -}} +{{ include "grafana.assertNoLeakedSecrets" . }} +{{- $files := .Files }} +{{- $root := . -}} +{{- with .Values.plugins }} +plugins: {{ join "," . }} +{{- end }} +grafana.ini: | +{{- range $elem, $elemVal := index .Values "grafana.ini" }} + {{- if not (kindIs "map" $elemVal) }} + {{- if kindIs "invalid" $elemVal }} + {{ $elem }} = + {{- else if kindIs "string" $elemVal }} + {{ $elem }} = {{ tpl $elemVal $ }} + {{- else }} + {{ $elem }} = {{ $elemVal }} + {{- end }} + {{- end }} +{{- end }} +{{- range $key, $value := index .Values "grafana.ini" }} + {{- if kindIs "map" $value }} + [{{ $key }}] + {{- range $elem, $elemVal := $value }} + {{- if kindIs "invalid" $elemVal }} + {{ $elem }} = + {{- else if kindIs "string" $elemVal }} + {{ $elem }} = {{ tpl $elemVal $ }} + {{- else }} + {{ $elem }} = {{ $elemVal }} + {{- end }} + {{- end }} + {{- end }} +{{- end }} + +{{- range $key, $value := .Values.datasources }} +{{- if not (hasKey $value "secret") }} +{{ $key }}: | + {{- tpl (toYaml $value | nindent 2) $root }} +{{- end }} +{{- end }} + +{{- range $key, $value := .Values.notifiers }} +{{- if not (hasKey $value "secret") }} +{{ $key }}: | + {{- toYaml $value | nindent 2 }} +{{- end }} +{{- end }} + +{{- range $key, $value := .Values.alerting }} +{{- if (hasKey $value "file") }} +{{ $key }}: +{{- toYaml ( $files.Get $value.file ) | nindent 2 }} +{{- else if (or (hasKey $value "secret") (hasKey $value "secretFile"))}} +{{/* will be stored inside secret generated by "configSecret.yaml"*/}} +{{- else }} +{{ $key }}: | + {{- tpl (toYaml $value | nindent 2) $root }} +{{- end }} +{{- end }} + +{{- range $key, $value := .Values.dashboardProviders }} +{{ $key }}: | + {{- toYaml $value | nindent 2 }} +{{- end }} + +{{- if .Values.dashboards }} +download_dashboards.sh: | + #!/usr/bin/env sh + set -euf + {{- if .Values.dashboardProviders }} + {{- range $key, $value := .Values.dashboardProviders }} + {{- range $value.providers }} + mkdir -p {{ .options.path }} + {{- end }} + {{- end }} + {{- end }} +{{ $dashboardProviders := .Values.dashboardProviders }} +{{- range $provider, $dashboards := .Values.dashboards }} + {{- range $key, $value := $dashboards }} + {{- if (or (hasKey $value "gnetId") (hasKey $value "url")) }} + curl -skf \ + --connect-timeout 60 \ + --max-time 60 \ + {{- if not $value.b64content }} + {{- if not $value.acceptHeader }} + -H "Accept: application/json" \ + {{- else }} + -H "Accept: {{ $value.acceptHeader }}" \ + {{- end }} + {{- if $value.token }} + -H "Authorization: token {{ $value.token }}" \ + {{- end }} + {{- if $value.bearerToken }} + -H "Authorization: Bearer {{ $value.bearerToken }}" \ + {{- end }} + {{- if $value.basic }} + -H "Authorization: Basic {{ $value.basic }}" \ + {{- end }} + {{- if $value.gitlabToken }} + -H "PRIVATE-TOKEN: {{ $value.gitlabToken }}" \ + {{- end }} + -H "Content-Type: application/json;charset=UTF-8" \ + {{- end }} + {{- $dpPath := "" -}} + {{- range $kd := (index $dashboardProviders "dashboardproviders.yaml").providers }} + {{- if eq $kd.name $provider }} + {{- $dpPath = $kd.options.path }} + {{- end }} + {{- end }} + {{- if $value.url }} + "{{ $value.url }}" \ + {{- else }} + "https://grafana.com/api/dashboards/{{ $value.gnetId }}/revisions/{{- if $value.revision -}}{{ $value.revision }}{{- else -}}1{{- end -}}/download" \ + {{- end }} + {{- if $value.datasource }} + {{- if kindIs "string" $value.datasource }} + | sed '/-- .* --/! s/"datasource":.*,/"datasource": "{{ $value.datasource }}",/g' \ + {{- end }} + {{- if kindIs "slice" $value.datasource }} + {{- range $value.datasource }} + | sed '/-- .* --/! s/${{"{"}}{{ .name }}}/{{ .value }}/g' \ + {{- end }} + {{- end }} + {{- end }} + {{- if $value.b64content }} + | base64 -d \ + {{- end }} + > "{{- if $dpPath -}}{{ $dpPath }}{{- else -}}/var/lib/grafana/dashboards/{{ $provider }}{{- end -}}/{{ $key }}.json" + {{ end }} + {{- end }} +{{- end }} +{{- end }} +{{- end -}} + +{{/* + Generate dashboard json config map data + */}} +{{- define "grafana.configDashboardProviderData" -}} +provider.yaml: |- + apiVersion: 1 + providers: + - name: '{{ .Values.sidecar.dashboards.provider.name }}' + orgId: {{ .Values.sidecar.dashboards.provider.orgid }} + {{- if not .Values.sidecar.dashboards.provider.foldersFromFilesStructure }} + folder: '{{ .Values.sidecar.dashboards.provider.folder }}' + {{- end }} + type: {{ .Values.sidecar.dashboards.provider.type }} + disableDeletion: {{ .Values.sidecar.dashboards.provider.disableDelete }} + allowUiUpdates: {{ .Values.sidecar.dashboards.provider.allowUiUpdates }} + updateIntervalSeconds: {{ .Values.sidecar.dashboards.provider.updateIntervalSeconds | default 30 }} + options: + foldersFromFilesStructure: {{ .Values.sidecar.dashboards.provider.foldersFromFilesStructure }} + path: {{ .Values.sidecar.dashboards.folder }}{{- with .Values.sidecar.dashboards.defaultFolderName }}/{{ . }}{{- end }} +{{- end -}} + +{{- define "grafana.secretsData" -}} +{{- if and (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) (not .Values.admin.existingSecret) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) }} +admin-user: {{ .Values.adminUser | b64enc | quote }} +{{- if .Values.adminPassword }} +admin-password: {{ .Values.adminPassword | b64enc | quote }} +{{- else }} +admin-password: {{ include "grafana.password" . }} +{{- end }} +{{- end }} +{{- if not .Values.ldap.existingSecret }} +ldap-toml: {{ tpl .Values.ldap.config $ | b64enc | quote }} +{{- end }} +{{- end -}} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_helpers.tpl b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_helpers.tpl new file mode 100644 index 00000000..68d2d815 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_helpers.tpl @@ -0,0 +1,305 @@ +# Rancher +{{- define "system_default_registry" -}} +{{- if .Values.global.cattle.systemDefaultRegistry -}} +{{- printf "%s/" .Values.global.cattle.systemDefaultRegistry -}} +{{- end -}} +{{- end -}} + +# Windows Support + +{{/* +Windows cluster will add default taint for linux nodes, +add below linux tolerations to workloads could be scheduled to those linux nodes +*/}} + +{{- define "linux-node-tolerations" -}} +- key: "cattle.io/os" + value: "linux" + effect: "NoSchedule" + operator: "Equal" +{{- end -}} + +{{- define "linux-node-selector" -}} +{{- if semverCompare "<1.14-0" .Capabilities.KubeVersion.GitVersion -}} +beta.kubernetes.io/os: linux +{{- else -}} +kubernetes.io/os: linux +{{- end -}} +{{- end -}} + +{{/* vim: set filetype=mustache: */}} +{{/* +Expand the name of the chart. +*/}} +{{- define "grafana.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Create a default fully qualified app name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +If release name contains chart name it will be used as a full name. +*/}} +{{- define "grafana.fullname" -}} +{{- if .Values.fullnameOverride }} +{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }} +{{- else }} +{{- $name := default .Chart.Name .Values.nameOverride }} +{{- if contains $name .Release.Name }} +{{- .Release.Name | trunc 63 | trimSuffix "-" }} +{{- else }} +{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }} +{{- end }} +{{- end }} +{{- end }} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "grafana.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Create the name of the service account +*/}} +{{- define "grafana.serviceAccountName" -}} +{{- if .Values.serviceAccount.create }} +{{- default (include "grafana.fullname" .) .Values.serviceAccount.name }} +{{- else }} +{{- default "default" .Values.serviceAccount.name }} +{{- end }} +{{- end }} + +{{- define "grafana.serviceAccountNameTest" -}} +{{- if .Values.serviceAccount.create }} +{{- default (print (include "grafana.fullname" .) "-test") .Values.serviceAccount.nameTest }} +{{- else }} +{{- default "default" .Values.serviceAccount.nameTest }} +{{- end }} +{{- end }} + +{{/* +Allow the release namespace to be overridden for multi-namespace deployments in combined charts +*/}} +{{- define "grafana.namespace" -}} +{{- if .Values.namespaceOverride }} +{{- .Values.namespaceOverride }} +{{- else }} +{{- .Release.Namespace }} +{{- end }} +{{- end }} + +{{/* +Common labels +*/}} +{{- define "grafana.labels" -}} +helm.sh/chart: {{ include "grafana.chart" . }} +{{ include "grafana.selectorLabels" . }} +{{- if or .Chart.AppVersion .Values.image.tag }} +app.kubernetes.io/version: {{ mustRegexReplaceAllLiteral "@sha.*" .Values.image.tag "" | default .Chart.AppVersion | quote }} +{{- end }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +{{- with .Values.extraLabels }} +{{ toYaml . }} +{{- end }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "grafana.selectorLabels" -}} +app.kubernetes.io/name: {{ include "grafana.name" . }} +app.kubernetes.io/instance: {{ .Release.Name }} +{{- end }} + +{{/* +Common labels +*/}} +{{- define "grafana.imageRenderer.labels" -}} +helm.sh/chart: {{ include "grafana.chart" . }} +{{ include "grafana.imageRenderer.selectorLabels" . }} +{{- if or .Chart.AppVersion .Values.image.tag }} +app.kubernetes.io/version: {{ mustRegexReplaceAllLiteral "@sha.*" .Values.image.tag "" | default .Chart.AppVersion | quote }} +{{- end }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +{{- end }} + +{{/* +Selector labels ImageRenderer +*/}} +{{- define "grafana.imageRenderer.selectorLabels" -}} +app.kubernetes.io/name: {{ include "grafana.name" . }}-image-renderer +app.kubernetes.io/instance: {{ .Release.Name }} +{{- end }} + +{{/* +Looks if there's an existing secret and reuse its password. If not it generates +new password and use it. +*/}} +{{- define "grafana.password" -}} +{{- $secret := (lookup "v1" "Secret" (include "grafana.namespace" .) (include "grafana.fullname" .) ) }} +{{- if $secret }} +{{- index $secret "data" "admin-password" }} +{{- else }} +{{- (randAlphaNum 40) | b64enc | quote }} +{{- end }} +{{- end }} + +{{/* +Return the appropriate apiVersion for rbac. +*/}} +{{- define "grafana.rbac.apiVersion" -}} +{{- if $.Capabilities.APIVersions.Has "rbac.authorization.k8s.io/v1" }} +{{- print "rbac.authorization.k8s.io/v1" }} +{{- else }} +{{- print "rbac.authorization.k8s.io/v1beta1" }} +{{- end }} +{{- end }} + +{{/* +Return the appropriate apiVersion for ingress. +*/}} +{{- define "grafana.ingress.apiVersion" -}} +{{- if and ($.Capabilities.APIVersions.Has "networking.k8s.io/v1") (semverCompare ">= 1.19-0" .Capabilities.KubeVersion.Version) }} +{{- print "networking.k8s.io/v1" }} +{{- else if $.Capabilities.APIVersions.Has "networking.k8s.io/v1beta1" }} +{{- print "networking.k8s.io/v1beta1" }} +{{- else }} +{{- print "extensions/v1beta1" }} +{{- end }} +{{- end }} + +{{/* +Return the appropriate apiVersion for Horizontal Pod Autoscaler. +*/}} +{{- define "grafana.hpa.apiVersion" -}} +{{- if .Capabilities.APIVersions.Has "autoscaling/v2" }} +{{- print "autoscaling/v2" }} +{{- else }} +{{- print "autoscaling/v2beta2" }} +{{- end }} +{{- end }} + +{{/* +Return the appropriate apiVersion for podDisruptionBudget. +*/}} +{{- define "grafana.podDisruptionBudget.apiVersion" -}} +{{- if $.Values.podDisruptionBudget.apiVersion }} +{{- print $.Values.podDisruptionBudget.apiVersion }} +{{- else if $.Capabilities.APIVersions.Has "policy/v1/PodDisruptionBudget" }} +{{- print "policy/v1" }} +{{- else }} +{{- print "policy/v1beta1" }} +{{- end }} +{{- end }} + +{{/* +Return if ingress is stable. +*/}} +{{- define "grafana.ingress.isStable" -}} +{{- eq (include "grafana.ingress.apiVersion" .) "networking.k8s.io/v1" }} +{{- end }} + +{{/* +Return if ingress supports ingressClassName. +*/}} +{{- define "grafana.ingress.supportsIngressClassName" -}} +{{- or (eq (include "grafana.ingress.isStable" .) "true") (and (eq (include "grafana.ingress.apiVersion" .) "networking.k8s.io/v1beta1") (semverCompare ">= 1.18-0" .Capabilities.KubeVersion.Version)) }} +{{- end }} + +{{/* +Return if ingress supports pathType. +*/}} +{{- define "grafana.ingress.supportsPathType" -}} +{{- or (eq (include "grafana.ingress.isStable" .) "true") (and (eq (include "grafana.ingress.apiVersion" .) "networking.k8s.io/v1beta1") (semverCompare ">= 1.18-0" .Capabilities.KubeVersion.Version)) }} +{{- end }} + +{{/* +Formats imagePullSecrets. Input is (dict "root" . "imagePullSecrets" .{specific imagePullSecrets}) +*/}} +{{- define "grafana.imagePullSecrets" -}} +{{- $root := .root }} +{{- range (concat .root.Values.global.imagePullSecrets .imagePullSecrets) }} +{{- if eq (typeOf .) "map[string]interface {}" }} +- {{ toYaml (dict "name" (tpl .name $root)) | trim }} +{{- else }} +- name: {{ tpl . $root }} +{{- end }} +{{- end }} +{{- end }} + + +{{/* + Checks whether or not the configSecret secret has to be created + */}} +{{- define "grafana.shouldCreateConfigSecret" -}} +{{- $secretFound := false -}} +{{- range $key, $value := .Values.datasources }} + {{- if hasKey $value "secret" }} + {{- $secretFound = true}} + {{- end }} +{{- end }} +{{- range $key, $value := .Values.notifiers }} + {{- if hasKey $value "secret" }} + {{- $secretFound = true}} + {{- end }} +{{- end }} +{{- range $key, $value := .Values.alerting }} + {{- if (or (hasKey $value "secret") (hasKey $value "secretFile")) }} + {{- $secretFound = true}} + {{- end }} +{{- end }} +{{- $secretFound}} +{{- end -}} + +{{/* + Checks whether the user is attempting to store secrets in plaintext + in the grafana.ini configmap +*/}} +{{/* grafana.assertNoLeakedSecrets checks for sensitive keys in values */}} +{{- define "grafana.assertNoLeakedSecrets" -}} + {{- $sensitiveKeysYaml := ` +sensitiveKeys: +- path: ["database", "password"] +- path: ["smtp", "password"] +- path: ["security", "secret_key"] +- path: ["security", "admin_password"] +- path: ["auth.basic", "password"] +- path: ["auth.ldap", "bind_password"] +- path: ["auth.google", "client_secret"] +- path: ["auth.github", "client_secret"] +- path: ["auth.gitlab", "client_secret"] +- path: ["auth.generic_oauth", "client_secret"] +- path: ["auth.okta", "client_secret"] +- path: ["auth.azuread", "client_secret"] +- path: ["auth.grafana_com", "client_secret"] +- path: ["auth.grafananet", "client_secret"] +- path: ["azure", "user_identity_client_secret"] +- path: ["unified_alerting", "ha_redis_password"] +- path: ["metrics", "basic_auth_password"] +- path: ["external_image_storage.s3", "secret_key"] +- path: ["external_image_storage.webdav", "password"] +- path: ["external_image_storage.azure_blob", "account_key"] +` | fromYaml -}} + {{- if $.Values.assertNoLeakedSecrets -}} + {{- $grafanaIni := index .Values "grafana.ini" -}} + {{- range $_, $secret := $sensitiveKeysYaml.sensitiveKeys -}} + {{- $currentMap := $grafanaIni -}} + {{- $shouldContinue := true -}} + {{- range $index, $elem := $secret.path -}} + {{- if and $shouldContinue (hasKey $currentMap $elem) -}} + {{- if eq (len $secret.path) (add1 $index) -}} + {{- if not (regexMatch "\\$(?:__(?:env|file|vault))?{[^}]+}" (index $currentMap $elem)) -}} + {{- fail (printf "Sensitive key '%s' should not be defined explicitly in values. Use variable expansion instead. You can disable this client-side validation by changing the value of assertNoLeakedSecrets." (join "." $secret.path)) -}} + {{- end -}} + {{- else -}} + {{- $currentMap = index $currentMap $elem -}} + {{- end -}} + {{- else -}} + {{- $shouldContinue = false -}} + {{- end -}} + {{- end -}} + {{- end -}} + {{- end -}} +{{- end -}} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_pod.tpl b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_pod.tpl new file mode 100644 index 00000000..3e72d4d0 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/_pod.tpl @@ -0,0 +1,1290 @@ +{{- define "grafana.pod" -}} +{{- $sts := list "sts" "StatefulSet" "statefulset" -}} +{{- $root := . -}} +{{- with .Values.schedulerName }} +schedulerName: "{{ . }}" +{{- end }} +serviceAccountName: {{ include "grafana.serviceAccountName" . }} +automountServiceAccountToken: {{ .Values.automountServiceAccountToken }} +{{- with .Values.securityContext }} +securityContext: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with .Values.hostAliases }} +hostAliases: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- if .Values.dnsPolicy }} +dnsPolicy: {{ .Values.dnsPolicy }} +{{- end }} +{{- with .Values.dnsConfig }} +dnsConfig: + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with .Values.priorityClassName }} +priorityClassName: {{ . }} +{{- end }} +{{- if ( or .Values.persistence.enabled .Values.dashboards .Values.extraInitContainers (and .Values.sidecar.alerts.enabled .Values.sidecar.alerts.initAlerts) (and .Values.sidecar.datasources.enabled .Values.sidecar.datasources.initDatasources) (and .Values.sidecar.notifiers.enabled .Values.sidecar.notifiers.initNotifiers)) }} +initContainers: +{{- end }} +{{- if ( and .Values.persistence.enabled .Values.initChownData.enabled ) }} + - name: init-chown-data + {{- $registry := include "system_default_registry" . | default .Values.initChownData.image.registry -}} + {{- if .Values.initChownData.image.sha }} + image: "{{ $registry }}{{ .Values.initChownData.image.repository }}:{{ .Values.initChownData.image.tag }}@sha256:{{ .Values.initChownData.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.initChownData.image.repository }}:{{ .Values.initChownData.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.initChownData.image.pullPolicy }} + {{- with .Values.initChownData.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + command: + - chown + - -R + - {{ .Values.securityContext.runAsUser }}:{{ .Values.securityContext.runAsGroup }} + - /var/lib/grafana + {{- with .Values.initChownData.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: storage + mountPath: "/var/lib/grafana" + {{- with .Values.persistence.subPath }} + subPath: {{ tpl . $root }} + {{- end }} +{{- end }} +{{- if .Values.dashboards }} + - name: download-dashboards + {{- $registry := include "system_default_registry" . | default .Values.downloadDashboardsImage.registry -}} + {{- if .Values.downloadDashboardsImage.sha }} + image: "{{ $registry }}{{ .Values.downloadDashboardsImage.repository }}:{{ .Values.downloadDashboardsImage.tag }}@sha256:{{ .Values.downloadDashboardsImage.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.downloadDashboardsImage.repository }}:{{ .Values.downloadDashboardsImage.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.downloadDashboardsImage.pullPolicy }} + command: ["/bin/sh"] + args: [ "-c", "mkdir -p /var/lib/grafana/dashboards/default && /bin/sh -x /etc/grafana/download_dashboards.sh" ] + {{- with .Values.downloadDashboards.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + env: + {{- range $key, $value := .Values.downloadDashboards.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- range $key, $value := .Values.downloadDashboards.envValueFrom }} + - name: {{ $key | quote }} + valueFrom: + {{- tpl (toYaml $value) $ | nindent 10 }} + {{- end }} + {{- with .Values.downloadDashboards.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.downloadDashboards.envFromSecret }} + envFrom: + - secretRef: + name: {{ tpl . $root }} + {{- end }} + volumeMounts: + - name: config + mountPath: "/etc/grafana/download_dashboards.sh" + subPath: download_dashboards.sh + - name: storage + mountPath: "/var/lib/grafana" + {{- with .Values.persistence.subPath }} + subPath: {{ tpl . $root }} + {{- end }} + {{- range .Values.extraSecretMounts }} + - name: {{ .name }} + mountPath: {{ .mountPath }} + readOnly: {{ .readOnly }} + {{- end }} +{{- end }} +{{- if and .Values.sidecar.alerts.enabled .Values.sidecar.alerts.initAlerts }} + - name: {{ include "grafana.name" . }}-init-sc-alerts + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.alerts.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.alerts.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: "LIST" + - name: LABEL + value: "{{ .Values.sidecar.alerts.label }}" + {{- with .Values.sidecar.alerts.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.alerts.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.alerts.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/alerting" + - name: RESOURCE + value: {{ quote .Values.sidecar.alerts.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.alerts.searchNamespace }} + - name: NAMESPACE + value: {{ . | join "," | quote }} + {{- end }} + {{- with .Values.sidecar.alerts.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: {{ quote . }} + {{- end }} + {{- with .Values.sidecar.alerts.script }} + - name: SCRIPT + value: {{ quote . }} + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-alerts-volume + mountPath: "/etc/grafana/provisioning/alerting" + {{- with .Values.sidecar.alerts.extraMounts }} + {{- toYaml . | trim | nindent 6 }} + {{- end }} +{{- end }} +{{- if and .Values.sidecar.datasources.enabled .Values.sidecar.datasources.initDatasources }} + - name: {{ include "grafana.name" . }}-init-sc-datasources + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.datasources.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.datasources.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: "LIST" + - name: LABEL + value: "{{ .Values.sidecar.datasources.label }}" + {{- with .Values.sidecar.datasources.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.datasources.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.datasources.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/datasources" + - name: RESOURCE + value: {{ quote .Values.sidecar.datasources.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + {{- if .Values.sidecar.datasources.searchNamespace }} + - name: NAMESPACE + value: "{{ tpl (.Values.sidecar.datasources.searchNamespace | join ",") . }}" + {{- end }} + {{- with .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-datasources-volume + mountPath: "/etc/grafana/provisioning/datasources" +{{- end }} +{{- if and .Values.sidecar.notifiers.enabled .Values.sidecar.notifiers.initNotifiers }} + - name: {{ include "grafana.name" . }}-init-sc-notifiers + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.notifiers.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.notifiers.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: LIST + - name: LABEL + value: "{{ .Values.sidecar.notifiers.label }}" + {{- with .Values.sidecar.notifiers.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.notifiers.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.notifiers.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/notifiers" + - name: RESOURCE + value: {{ quote .Values.sidecar.notifiers.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.notifiers.searchNamespace }} + - name: NAMESPACE + value: "{{ tpl (. | join ",") $root }}" + {{- end }} + {{- with .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-notifiers-volume + mountPath: "/etc/grafana/provisioning/notifiers" +{{- end}} +{{- with .Values.extraInitContainers }} + {{- tpl (toYaml .) $root | nindent 2 }} +{{- end }} +{{- if or .Values.image.pullSecrets .Values.global.imagePullSecrets }} +imagePullSecrets: + {{- include "grafana.imagePullSecrets" (dict "root" $root "imagePullSecrets" .Values.image.pullSecrets) | nindent 2 }} +{{- end }} +{{- if not .Values.enableKubeBackwardCompatibility }} +enableServiceLinks: {{ .Values.enableServiceLinks }} +{{- end }} +containers: +{{- if and .Values.sidecar.alerts.enabled (not .Values.sidecar.alerts.initAlerts) }} + - name: {{ include "grafana.name" . }}-sc-alerts + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.alerts.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.alerts.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: {{ .Values.sidecar.alerts.watchMethod }} + - name: LABEL + value: "{{ .Values.sidecar.alerts.label }}" + {{- with .Values.sidecar.alerts.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.alerts.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.alerts.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/alerting" + - name: RESOURCE + value: {{ quote .Values.sidecar.alerts.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.alerts.searchNamespace }} + - name: NAMESPACE + value: {{ . | join "," | quote }} + {{- end }} + {{- with .Values.sidecar.alerts.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: {{ quote . }} + {{- end }} + {{- with .Values.sidecar.alerts.script }} + - name: SCRIPT + value: {{ quote . }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_USERNAME + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + {{- if not .Values.sidecar.alerts.skipReload }} + - name: REQ_URL + value: {{ .Values.sidecar.alerts.reloadURL }} + - name: REQ_METHOD + value: POST + {{- end }} + {{- if .Values.sidecar.alerts.watchServerTimeout }} + {{- if ne .Values.sidecar.alerts.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.alerts.watchServerTimeout with .Values.sidecar.alerts.watchMethod %s" .Values.sidecar.alerts.watchMethod) }} + {{- end }} + - name: WATCH_SERVER_TIMEOUT + value: "{{ .Values.sidecar.alerts.watchServerTimeout }}" + {{- end }} + {{- if .Values.sidecar.alerts.watchClientTimeout }} + {{- if ne .Values.sidecar.alerts.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.alerts.watchClientTimeout with .Values.sidecar.alerts.watchMethod %s" .Values.sidecar.alerts.watchMethod) }} + {{- end }} + - name: WATCH_CLIENT_TIMEOUT + value: "{{ .Values.sidecar.alerts.watchClientTimeout }}" + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-alerts-volume + mountPath: "/etc/grafana/provisioning/alerting" + {{- with .Values.sidecar.alerts.extraMounts }} + {{- toYaml . | trim | nindent 6 }} + {{- end }} +{{- end}} +{{- if .Values.sidecar.dashboards.enabled }} + - name: {{ include "grafana.name" . }}-sc-dashboard + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.dashboards.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- range $key, $value := .Values.sidecar.datasources.envValueFrom }} + - name: {{ $key | quote }} + valueFrom: + {{- tpl (toYaml $value) $ | nindent 10 }} + {{- end }} + {{- if .Values.sidecar.dashboards.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: {{ .Values.sidecar.dashboards.watchMethod }} + - name: LABEL + value: "{{ .Values.sidecar.dashboards.label }}" + {{- with .Values.sidecar.dashboards.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.dashboards.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.dashboards.logLevel }} + {{- end }} + - name: FOLDER + value: "{{ .Values.sidecar.dashboards.folder }}{{- with .Values.sidecar.dashboards.defaultFolderName }}/{{ . }}{{- end }}" + - name: RESOURCE + value: {{ quote .Values.sidecar.dashboards.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + - name: NAMESPACE + value: "{{ template "project-prometheus-stack.projectNamespaceList" . }}" + {{- with .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.dashboards.folderAnnotation }} + - name: FOLDER_ANNOTATION + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.dashboards.script }} + - name: SCRIPT + value: "{{ . }}" + {{- end }} + {{- if not .Values.sidecar.dashboards.skipReload }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_USERNAME + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + - name: REQ_URL + value: {{ .Values.sidecar.dashboards.reloadURL }} + - name: REQ_METHOD + value: POST + {{- end }} + {{- if .Values.sidecar.dashboards.watchServerTimeout }} + {{- if ne .Values.sidecar.dashboards.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.dashboards.watchServerTimeout with .Values.sidecar.dashboards.watchMethod %s" .Values.sidecar.dashboards.watchMethod) }} + {{- end }} + - name: WATCH_SERVER_TIMEOUT + value: "{{ .Values.sidecar.dashboards.watchServerTimeout }}" + {{- end }} + {{- if .Values.sidecar.dashboards.watchClientTimeout }} + {{- if ne .Values.sidecar.dashboards.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.dashboards.watchClientTimeout with .Values.sidecar.dashboards.watchMethod %s" .Values.sidecar.dashboards.watchMethod) }} + {{- end }} + - name: WATCH_CLIENT_TIMEOUT + value: {{ .Values.sidecar.dashboards.watchClientTimeout | quote }} + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-dashboard-volume + mountPath: {{ .Values.sidecar.dashboards.folder | quote }} + {{- with .Values.sidecar.dashboards.extraMounts }} + {{- toYaml . | trim | nindent 6 }} + {{- end }} +{{- end}} +{{- if and .Values.sidecar.datasources.enabled (not .Values.sidecar.datasources.initDatasources) }} + - name: {{ include "grafana.name" . }}-sc-datasources + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.datasources.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.datasources.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: {{ .Values.sidecar.datasources.watchMethod }} + - name: LABEL + value: "{{ .Values.sidecar.datasources.label }}" + {{- with .Values.sidecar.datasources.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.datasources.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.datasources.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/datasources" + - name: RESOURCE + value: {{ quote .Values.sidecar.datasources.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + - name: NAMESPACE + value: "{{ template "project-prometheus-stack.projectNamespaceList" . }}" + {{- if .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ .Values.sidecar.skipTlsVerify }}" + {{- end }} + {{- if .Values.sidecar.datasources.script }} + - name: SCRIPT + value: "{{ .Values.sidecar.datasources.script }}" + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_USERNAME + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + {{- if not .Values.sidecar.datasources.skipReload }} + - name: REQ_URL + value: {{ .Values.sidecar.datasources.reloadURL }} + - name: REQ_METHOD + value: POST + {{- end }} + {{- if .Values.sidecar.datasources.watchServerTimeout }} + {{- if ne .Values.sidecar.datasources.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.datasources.watchServerTimeout with .Values.sidecar.datasources.watchMethod %s" .Values.sidecar.datasources.watchMethod) }} + {{- end }} + - name: WATCH_SERVER_TIMEOUT + value: "{{ .Values.sidecar.datasources.watchServerTimeout }}" + {{- end }} + {{- if .Values.sidecar.datasources.watchClientTimeout }} + {{- if ne .Values.sidecar.datasources.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.datasources.watchClientTimeout with .Values.sidecar.datasources.watchMethod %s" .Values.sidecar.datasources.watchMethod) }} + {{- end }} + - name: WATCH_CLIENT_TIMEOUT + value: "{{ .Values.sidecar.datasources.watchClientTimeout }}" + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-datasources-volume + mountPath: "/etc/grafana/provisioning/datasources" +{{- end}} +{{- if .Values.sidecar.notifiers.enabled }} + - name: {{ include "grafana.name" . }}-sc-notifiers + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.notifiers.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.notifiers.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: {{ .Values.sidecar.notifiers.watchMethod }} + - name: LABEL + value: "{{ .Values.sidecar.notifiers.label }}" + {{- with .Values.sidecar.notifiers.labelValue }} + - name: LABEL_VALUE + value: {{ quote . }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.notifiers.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.notifiers.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/notifiers" + - name: RESOURCE + value: {{ quote .Values.sidecar.notifiers.resource }} + {{- if .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ .Values.sidecar.enableUniqueFilenames }}" + {{- end }} + {{- with .Values.sidecar.notifiers.searchNamespace }} + - name: NAMESPACE + value: "{{ tpl (. | join ",") $root }}" + {{- end }} + {{- with .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ . }}" + {{- end }} + {{- if .Values.sidecar.notifiers.script }} + - name: SCRIPT + value: "{{ .Values.sidecar.notifiers.script }}" + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_USERNAME + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + {{- if not .Values.sidecar.notifiers.skipReload }} + - name: REQ_URL + value: {{ .Values.sidecar.notifiers.reloadURL }} + - name: REQ_METHOD + value: POST + {{- end }} + {{- if .Values.sidecar.notifiers.watchServerTimeout }} + {{- if ne .Values.sidecar.notifiers.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.notifiers.watchServerTimeout with .Values.sidecar.notifiers.watchMethod %s" .Values.sidecar.notifiers.watchMethod) }} + {{- end }} + - name: WATCH_SERVER_TIMEOUT + value: "{{ .Values.sidecar.notifiers.watchServerTimeout }}" + {{- end }} + {{- if .Values.sidecar.notifiers.watchClientTimeout }} + {{- if ne .Values.sidecar.notifiers.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.notifiers.watchClientTimeout with .Values.sidecar.notifiers.watchMethod %s" .Values.sidecar.notifiers.watchMethod) }} + {{- end }} + - name: WATCH_CLIENT_TIMEOUT + value: "{{ .Values.sidecar.notifiers.watchClientTimeout }}" + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-notifiers-volume + mountPath: "/etc/grafana/provisioning/notifiers" +{{- end}} +{{- if .Values.sidecar.plugins.enabled }} + - name: {{ include "grafana.name" . }}-sc-plugins + {{- $registry := include "system_default_registry" . | default .Values.sidecar.image.registry -}} + {{- if .Values.sidecar.image.sha }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}@sha256:{{ .Values.sidecar.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.sidecar.image.repository }}:{{ .Values.sidecar.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.sidecar.imagePullPolicy }} + env: + {{- range $key, $value := .Values.sidecar.plugins.env }} + - name: "{{ $key }}" + value: "{{ $value }}" + {{- end }} + {{- if .Values.sidecar.plugins.ignoreAlreadyProcessed }} + - name: IGNORE_ALREADY_PROCESSED + value: "true" + {{- end }} + - name: METHOD + value: {{ .Values.sidecar.plugins.watchMethod }} + - name: LABEL + value: "{{ .Values.sidecar.plugins.label }}" + {{- if .Values.sidecar.plugins.labelValue }} + - name: LABEL_VALUE + value: {{ quote .Values.sidecar.plugins.labelValue }} + {{- end }} + {{- if or .Values.sidecar.logLevel .Values.sidecar.plugins.logLevel }} + - name: LOG_LEVEL + value: {{ default .Values.sidecar.logLevel .Values.sidecar.plugins.logLevel }} + {{- end }} + - name: FOLDER + value: "/etc/grafana/provisioning/plugins" + - name: RESOURCE + value: {{ quote .Values.sidecar.plugins.resource }} + {{- with .Values.sidecar.enableUniqueFilenames }} + - name: UNIQUE_FILENAMES + value: "{{ . }}" + {{- end }} + - name: NAMESPACE + value: "{{ template "project-prometheus-stack.projectNamespaceList" . }}" + {{- with .Values.sidecar.plugins.script }} + - name: SCRIPT + value: "{{ . }}" + {{- end }} + {{- with .Values.sidecar.skipTlsVerify }} + - name: SKIP_TLS_VERIFY + value: "{{ . }}" + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_USERNAME + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: REQ_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + {{- if not .Values.sidecar.plugins.skipReload }} + - name: REQ_URL + value: {{ .Values.sidecar.plugins.reloadURL }} + - name: REQ_METHOD + value: POST + {{- end }} + {{- if .Values.sidecar.plugins.watchServerTimeout }} + {{- if ne .Values.sidecar.plugins.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.plugins.watchServerTimeout with .Values.sidecar.plugins.watchMethod %s" .Values.sidecar.plugins.watchMethod) }} + {{- end }} + - name: WATCH_SERVER_TIMEOUT + value: "{{ .Values.sidecar.plugins.watchServerTimeout }}" + {{- end }} + {{- if .Values.sidecar.plugins.watchClientTimeout }} + {{- if ne .Values.sidecar.plugins.watchMethod "WATCH" }} + {{- fail (printf "Cannot use .Values.sidecar.plugins.watchClientTimeout with .Values.sidecar.plugins.watchMethod %s" .Values.sidecar.plugins.watchMethod) }} + {{- end }} + - name: WATCH_CLIENT_TIMEOUT + value: "{{ .Values.sidecar.plugins.watchClientTimeout }}" + {{- end }} + {{- with .Values.sidecar.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.sidecar.securityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: sc-plugins-volume + mountPath: "/etc/grafana/provisioning/plugins" +{{- end}} + - name: {{ .Chart.Name }} + {{- $registry := include "system_default_registry" . | default .Values.image.registry -}} + {{- if .Values.image.sha }} + image: "{{ $registry }}{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}@sha256:{{ .Values.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}" + {{- end }} + imagePullPolicy: {{ .Values.image.pullPolicy }} + {{- if .Values.command }} + command: + {{- range .Values.command }} + - {{ . | quote }} + {{- end }} + {{- end }} + {{- if .Values.args }} + args: + {{- range .Values.args }} + - {{ . | quote }} + {{- end }} + {{- end }} + {{- with .Values.containerSecurityContext }} + securityContext: + {{- toYaml . | nindent 6 }} + {{- end }} + volumeMounts: + - name: config + mountPath: "/etc/grafana/grafana.ini" + subPath: grafana.ini + {{- if .Values.ldap.enabled }} + - name: ldap + mountPath: "/etc/grafana/ldap.toml" + subPath: ldap.toml + {{- end }} + {{- range .Values.extraConfigmapMounts }} + - name: {{ tpl .name $root }} + mountPath: {{ tpl .mountPath $root }} + subPath: {{ tpl (.subPath | default "") $root }} + readOnly: {{ .readOnly }} + {{- end }} + - name: storage + mountPath: "/var/lib/grafana" + {{- with .Values.persistence.subPath }} + subPath: {{ tpl . $root }} + {{- end }} + {{- with .Values.dashboards }} + {{- range $provider, $dashboards := . }} + {{- range $key, $value := $dashboards }} + {{- if (or (hasKey $value "json") (hasKey $value "file")) }} + - name: dashboards-{{ $provider }} + mountPath: "/var/lib/grafana/dashboards/{{ $provider }}/{{ $key }}.json" + subPath: "{{ $key }}.json" + {{- end }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Values.dashboardsConfigMaps }} + {{- range (keys . | sortAlpha) }} + - name: dashboards-{{ . }} + mountPath: "/var/lib/grafana/dashboards/{{ . }}" + {{- end }} + {{- end }} + {{- with .Values.datasources }} + {{- $datasources := . }} + {{- range (keys . | sortAlpha) }} + {{- if (or (hasKey (index $datasources .) "secret")) }} {{/*check if current datasource should be handeled as secret */}} + - name: config-secret + mountPath: "/etc/grafana/provisioning/datasources/{{ . }}" + subPath: {{ . | quote }} + {{- else }} + - name: config + mountPath: "/etc/grafana/provisioning/datasources/{{ . }}" + subPath: {{ . | quote }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Values.notifiers }} + {{- $notifiers := . }} + {{- range (keys . | sortAlpha) }} + {{- if (or (hasKey (index $notifiers .) "secret")) }} {{/*check if current notifier should be handeled as secret */}} + - name: config-secret + mountPath: "/etc/grafana/provisioning/notifiers/{{ . }}" + subPath: {{ . | quote }} + {{- else }} + - name: config + mountPath: "/etc/grafana/provisioning/notifiers/{{ . }}" + subPath: {{ . | quote }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Values.alerting }} + {{- $alertingmap := .}} + {{- range (keys . | sortAlpha) }} + {{- if (or (hasKey (index $.Values.alerting .) "secret") (hasKey (index $.Values.alerting .) "secretFile")) }} {{/*check if current alerting entry should be handeled as secret */}} + - name: config-secret + mountPath: "/etc/grafana/provisioning/alerting/{{ . }}" + subPath: {{ . | quote }} + {{- else }} + - name: config + mountPath: "/etc/grafana/provisioning/alerting/{{ . }}" + subPath: {{ . | quote }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Values.dashboardProviders }} + {{- range (keys . | sortAlpha) }} + - name: config + mountPath: "/etc/grafana/provisioning/dashboards/{{ . }}" + subPath: {{ . | quote }} + {{- end }} + {{- end }} + {{- with .Values.sidecar.alerts.enabled }} + - name: sc-alerts-volume + mountPath: "/etc/grafana/provisioning/alerting" + {{- end}} + {{- if .Values.sidecar.dashboards.enabled }} + - name: sc-dashboard-volume + mountPath: {{ .Values.sidecar.dashboards.folder | quote }} + {{- if .Values.sidecar.dashboards.SCProvider }} + - name: sc-dashboard-provider + mountPath: "/etc/grafana/provisioning/dashboards/sc-dashboardproviders.yaml" + subPath: provider.yaml + {{- end}} + {{- end}} + {{- if .Values.sidecar.datasources.enabled }} + - name: sc-datasources-volume + mountPath: "/etc/grafana/provisioning/datasources" + {{- end}} + {{- if .Values.sidecar.plugins.enabled }} + - name: sc-plugins-volume + mountPath: "/etc/grafana/provisioning/plugins" + {{- end}} + {{- if .Values.sidecar.notifiers.enabled }} + - name: sc-notifiers-volume + mountPath: "/etc/grafana/provisioning/notifiers" + {{- end}} + {{- range .Values.extraSecretMounts }} + - name: {{ .name }} + mountPath: {{ .mountPath }} + readOnly: {{ .readOnly }} + subPath: {{ .subPath | default "" }} + {{- end }} + {{- range .Values.extraVolumeMounts }} + - name: {{ .name }} + mountPath: {{ .mountPath }} + subPath: {{ .subPath | default "" }} + readOnly: {{ .readOnly }} + {{- end }} + {{- range .Values.extraEmptyDirMounts }} + - name: {{ .name }} + mountPath: {{ .mountPath }} + {{- end }} + ports: + - name: {{ .Values.podPortName }} + containerPort: {{ .Values.service.targetPort }} + protocol: TCP + - name: {{ .Values.gossipPortName }}-tcp + containerPort: 9094 + protocol: TCP + - name: {{ .Values.gossipPortName }}-udp + containerPort: 9094 + protocol: UDP + env: + - name: POD_IP + valueFrom: + fieldRef: + fieldPath: status.podIP + {{- if and (not .Values.env.GF_SECURITY_ADMIN_USER) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: GF_SECURITY_ADMIN_USER + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.userKey | default "admin-user" }} + {{- end }} + {{- if and (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + - name: GF_SECURITY_ADMIN_PASSWORD + valueFrom: + secretKeyRef: + name: {{ (tpl .Values.admin.existingSecret .) | default (include "grafana.fullname" .) }} + key: {{ .Values.admin.passwordKey | default "admin-password" }} + {{- end }} + {{- if .Values.plugins }} + - name: GF_INSTALL_PLUGINS + valueFrom: + configMapKeyRef: + name: {{ include "grafana.fullname" . }} + key: plugins + {{- end }} + {{- if .Values.smtp.existingSecret }} + - name: GF_SMTP_USER + valueFrom: + secretKeyRef: + name: {{ .Values.smtp.existingSecret }} + key: {{ .Values.smtp.userKey | default "user" }} + - name: GF_SMTP_PASSWORD + valueFrom: + secretKeyRef: + name: {{ .Values.smtp.existingSecret }} + key: {{ .Values.smtp.passwordKey | default "password" }} + {{- end }} + {{- if .Values.imageRenderer.enabled }} + - name: GF_RENDERING_SERVER_URL + value: http://{{ include "grafana.fullname" . }}-image-renderer.{{ include "grafana.namespace" . }}:{{ .Values.imageRenderer.service.port }}/render + - name: GF_RENDERING_CALLBACK_URL + value: {{ .Values.imageRenderer.grafanaProtocol }}://{{ include "grafana.fullname" . }}.{{ include "grafana.namespace" . }}:{{ .Values.service.port }}/{{ .Values.imageRenderer.grafanaSubPath }} + {{- end }} + - name: GF_PATHS_DATA + value: {{ (get .Values "grafana.ini").paths.data }} + - name: GF_PATHS_LOGS + value: {{ (get .Values "grafana.ini").paths.logs }} + - name: GF_PATHS_PLUGINS + value: {{ (get .Values "grafana.ini").paths.plugins }} + - name: GF_PATHS_PROVISIONING + value: {{ (get .Values "grafana.ini").paths.provisioning }} + {{- range $key, $value := .Values.envValueFrom }} + - name: {{ $key | quote }} + valueFrom: + {{- tpl (toYaml $value) $ | nindent 10 }} + {{- end }} + {{- range $key, $value := .Values.env }} + - name: "{{ tpl $key $ }}" + value: "{{ tpl (print $value) $ }}" + {{- end }} + {{- if or .Values.envFromSecret (or .Values.envRenderSecret .Values.envFromSecrets) .Values.envFromConfigMaps }} + envFrom: + {{- if .Values.envFromSecret }} + - secretRef: + name: {{ tpl .Values.envFromSecret . }} + {{- end }} + {{- if .Values.envRenderSecret }} + - secretRef: + name: {{ include "grafana.fullname" . }}-env + {{- end }} + {{- range .Values.envFromSecrets }} + - secretRef: + name: {{ tpl .name $ }} + optional: {{ .optional | default false }} + {{- if .prefix }} + prefix: {{ tpl .prefix $ }} + {{- end }} + {{- end }} + {{- range .Values.envFromConfigMaps }} + - configMapRef: + name: {{ tpl .name $ }} + optional: {{ .optional | default false }} + {{- if .prefix }} + prefix: {{ tpl .prefix $ }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Values.livenessProbe }} + livenessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.readinessProbe }} + readinessProbe: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.lifecycleHooks }} + lifecycle: + {{- tpl (toYaml .) $root | nindent 6 }} + {{- end }} + {{- with .Values.resources }} + resources: + {{- toYaml . | nindent 6 }} + {{- end }} +{{- with .Values.extraContainers }} + {{- tpl . $ | nindent 2 }} +{{- end }} +nodeSelector: {{ include "linux-node-selector" . | nindent 2 }} +{{- with .Values.nodeSelector }} + {{- toYaml . | nindent 2 }} +{{- end }} +{{- with .Values.affinity }} +affinity: + {{- tpl (toYaml .) $root | nindent 2 }} +{{- end }} +{{- with .Values.topologySpreadConstraints }} +topologySpreadConstraints: + {{- toYaml . | nindent 2 }} +{{- end }} +tolerations: {{ include "linux-node-tolerations" . | nindent 2 }} +{{- with .Values.tolerations }} + {{- toYaml . | nindent 2 }} +{{- end }} +volumes: + - name: config + configMap: + name: {{ include "grafana.fullname" . }} + {{- $createConfigSecret := eq (include "grafana.shouldCreateConfigSecret" .) "true" -}} + {{- if and .Values.createConfigmap $createConfigSecret }} + - name: config-secret + secret: + secretName: {{ include "grafana.fullname" . }}-config-secret + {{- end }} + {{- range .Values.extraConfigmapMounts }} + - name: {{ tpl .name $root }} + configMap: + name: {{ tpl .configMap $root }} + {{- with .items }} + items: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} + {{- if .Values.dashboards }} + {{- range (keys .Values.dashboards | sortAlpha) }} + - name: dashboards-{{ . }} + configMap: + name: {{ include "grafana.fullname" $ }}-dashboards-{{ . }} + {{- end }} + {{- end }} + {{- if .Values.dashboardsConfigMaps }} + {{- range $provider, $name := .Values.dashboardsConfigMaps }} + - name: dashboards-{{ $provider }} + configMap: + name: {{ tpl $name $root }} + {{- end }} + {{- end }} + {{- if .Values.ldap.enabled }} + - name: ldap + secret: + {{- if .Values.ldap.existingSecret }} + secretName: {{ .Values.ldap.existingSecret }} + {{- else }} + secretName: {{ include "grafana.fullname" . }} + {{- end }} + items: + - key: ldap-toml + path: ldap.toml + {{- end }} + {{- if and .Values.persistence.enabled (eq .Values.persistence.type "pvc") }} + - name: storage + persistentVolumeClaim: + claimName: {{ tpl (.Values.persistence.existingClaim | default (include "grafana.fullname" .)) . }} + {{- else if and .Values.persistence.enabled (has .Values.persistence.type $sts) }} + {{/* nothing */}} + {{- else }} + - name: storage + {{- if .Values.persistence.inMemory.enabled }} + emptyDir: + medium: Memory + {{- with .Values.persistence.inMemory.sizeLimit }} + sizeLimit: {{ . }} + {{- end }} + {{- else }} + emptyDir: {} + {{- end }} + {{- end }} + {{- if .Values.sidecar.alerts.enabled }} + - name: sc-alerts-volume + emptyDir: + {{- with .Values.sidecar.alerts.sizeLimit }} + sizeLimit: {{ . }} + {{- else }} + {} + {{- end }} + {{- end }} + {{- if .Values.sidecar.dashboards.enabled }} + - name: sc-dashboard-volume + emptyDir: + {{- with .Values.sidecar.dashboards.sizeLimit }} + sizeLimit: {{ . }} + {{- else }} + {} + {{- end }} + {{- if .Values.sidecar.dashboards.SCProvider }} + - name: sc-dashboard-provider + configMap: + name: {{ include "grafana.fullname" . }}-config-dashboards + {{- end }} + {{- end }} + {{- if .Values.sidecar.datasources.enabled }} + - name: sc-datasources-volume + emptyDir: + {{- with .Values.sidecar.datasources.sizeLimit }} + sizeLimit: {{ . }} + {{- else }} + {} + {{- end }} + {{- end }} + {{- if .Values.sidecar.plugins.enabled }} + - name: sc-plugins-volume + emptyDir: + {{- with .Values.sidecar.plugins.sizeLimit }} + sizeLimit: {{ . }} + {{- else }} + {} + {{- end }} + {{- end }} + {{- if .Values.sidecar.notifiers.enabled }} + - name: sc-notifiers-volume + emptyDir: + {{- with .Values.sidecar.notifiers.sizeLimit }} + sizeLimit: {{ . }} + {{- else }} + {} + {{- end }} + {{- end }} + {{- range .Values.extraSecretMounts }} + {{- if .secretName }} + - name: {{ .name }} + secret: + secretName: {{ .secretName }} + defaultMode: {{ .defaultMode }} + {{- with .items }} + items: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- else if .projected }} + - name: {{ .name }} + projected: + {{- toYaml .projected | nindent 6 }} + {{- else if .csi }} + - name: {{ .name }} + csi: + {{- toYaml .csi | nindent 6 }} + {{- end }} + {{- end }} + {{- range .Values.extraVolumes }} + - name: {{ .name }} + {{- if .existingClaim }} + persistentVolumeClaim: + claimName: {{ .existingClaim }} + {{- else if .hostPath }} + hostPath: + {{ toYaml .hostPath | nindent 6 }} + {{- else if .csi }} + csi: + {{- toYaml .csi | nindent 6 }} + {{- else if .configMap }} + configMap: + {{- toYaml .configMap | nindent 6 }} + {{- else if .emptyDir }} + emptyDir: + {{- toYaml .emptyDir | nindent 6 }} + {{- else }} + emptyDir: {} + {{- end }} + {{- end }} + {{- range .Values.extraEmptyDirMounts }} + - name: {{ .name }} + emptyDir: {} + {{- end }} + {{- with .Values.extraContainerVolumes }} + {{- tpl (toYaml .) $root | nindent 2 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrole.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrole.yaml new file mode 100644 index 00000000..3af4b62b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrole.yaml @@ -0,0 +1,25 @@ +{{- if and .Values.rbac.create (or (not .Values.rbac.namespaced) .Values.rbac.extraClusterRoleRules) (not .Values.rbac.useExistingClusterRole) }} +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} + name: {{ include "grafana.fullname" . }}-clusterrole +{{- if or .Values.sidecar.dashboards.enabled .Values.rbac.extraClusterRoleRules .Values.sidecar.datasources.enabled .Values.sidecar.plugins.enabled .Values.sidecar.alerts.enabled }} +rules: + {{- if or .Values.sidecar.dashboards.enabled .Values.sidecar.datasources.enabled .Values.sidecar.plugins.enabled .Values.sidecar.alerts.enabled }} + - apiGroups: [""] # "" indicates the core API group + resources: ["configmaps", "secrets"] + verbs: ["get", "watch", "list"] + {{- end}} + {{- with .Values.rbac.extraClusterRoleRules }} + {{- toYaml . | nindent 2 }} + {{- end}} +{{- else }} +rules: [] +{{- end}} +{{- end}} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrolebinding.yaml new file mode 100644 index 00000000..bda9431a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/clusterrolebinding.yaml @@ -0,0 +1,24 @@ +{{- if and .Values.rbac.create (or (not .Values.rbac.namespaced) .Values.rbac.extraClusterRoleRules) }} +kind: ClusterRoleBinding +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: {{ include "grafana.fullname" . }}-clusterrolebinding + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +subjects: + - kind: ServiceAccount + name: {{ include "grafana.serviceAccountName" . }} + namespace: {{ include "grafana.namespace" . }} +roleRef: + kind: ClusterRole + {{- if .Values.rbac.useExistingClusterRole }} + name: {{ .Values.rbac.useExistingClusterRole }} + {{- else }} + name: {{ include "grafana.fullname" . }}-clusterrole + {{- end }} + apiGroup: rbac.authorization.k8s.io +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configSecret.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configSecret.yaml new file mode 100644 index 00000000..55574b9b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configSecret.yaml @@ -0,0 +1,43 @@ +{{- $createConfigSecret := eq (include "grafana.shouldCreateConfigSecret" .) "true" -}} +{{- if and .Values.createConfigmap $createConfigSecret }} +{{- $files := .Files }} +{{- $root := . -}} +apiVersion: v1 +kind: Secret +metadata: + name: "{{ include "grafana.fullname" . }}-config-secret" + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +data: +{{- range $key, $value := .Values.alerting }} + {{- if (hasKey $value "secretFile") }} + {{- $key | nindent 2 }}: + {{- toYaml ( $files.Get $value.secretFile ) | b64enc | nindent 4}} + {{/* as of https://helm.sh/docs/chart_template_guide/accessing_files/ this will only work if you fork this chart and add files to it*/}} + {{- end }} +{{- end }} +stringData: +{{- range $key, $value := .Values.datasources }} +{{- if (hasKey $value "secret") }} +{{- $key | nindent 2 }}: | + {{- tpl (toYaml $value.secret | nindent 4) $root }} +{{- end }} +{{- end }} +{{- range $key, $value := .Values.notifiers }} +{{- if (hasKey $value "secret") }} +{{- $key | nindent 2 }}: | + {{- tpl (toYaml $value.secret | nindent 4) $root }} +{{- end }} +{{- end }} +{{- range $key, $value := .Values.alerting }} +{{ if (hasKey $value "secret") }} + {{- $key | nindent 2 }}: | + {{- tpl (toYaml $value.secret | nindent 4) $root }} + {{- end }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap-dashboard-provider.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap-dashboard-provider.yaml new file mode 100644 index 00000000..b412c4d1 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap-dashboard-provider.yaml @@ -0,0 +1,15 @@ +{{- if and .Values.sidecar.dashboards.enabled .Values.sidecar.dashboards.SCProvider }} +apiVersion: v1 +kind: ConfigMap +metadata: + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} + name: {{ include "grafana.fullname" . }}-config-dashboards + namespace: {{ include "grafana.namespace" . }} +data: + {{- include "grafana.configDashboardProviderData" . | nindent 2 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap.yaml new file mode 100644 index 00000000..7d7428be --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/configmap.yaml @@ -0,0 +1,15 @@ +{{- if .Values.createConfigmap }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +data: + {{- include "grafana.configData" . | nindent 2 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/dashboards-json-configmap.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/dashboards-json-configmap.yaml new file mode 100644 index 00000000..b96ce720 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/dashboards-json-configmap.yaml @@ -0,0 +1,38 @@ +{{- if .Values.dashboards }} +{{ $files := .Files }} +{{- range $provider, $dashboards := .Values.dashboards }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "grafana.fullname" $ }}-dashboards-{{ $provider }} + namespace: {{ include "grafana.namespace" $ }} + labels: + {{- include "grafana.labels" $ | nindent 4 }} + dashboard-provider: {{ $provider }} + {{- if $.Values.sidecar.dashboards.enabled }} + {{ $.Values.sidecar.dashboards.label }}: {{ $.Values.sidecar.dashboards.labelValue | quote }} + {{- end }} +{{- if $dashboards }} +data: +{{- $dashboardFound := false }} +{{- range $key, $value := $dashboards }} +{{- if (or (hasKey $value "json") (hasKey $value "file")) }} +{{- $dashboardFound = true }} + {{- print $key | nindent 2 }}.json: + {{- if hasKey $value "json" }} + |- + {{- $value.json | nindent 6 }} + {{- end }} + {{- if hasKey $value "file" }} + {{- toYaml ( $files.Get $value.file ) | nindent 4}} + {{- end }} +{{- end }} +{{- end }} +{{- if not $dashboardFound }} + {} +{{- end }} +{{- end }} +--- +{{- end }} + +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/deployment.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/deployment.yaml new file mode 100644 index 00000000..46c016fa --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/deployment.yaml @@ -0,0 +1,53 @@ +{{- if (and (not .Values.useStatefulSet) (or (not .Values.persistence.enabled) (eq .Values.persistence.type "pvc"))) }} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + {{- if and (not .Values.autoscaling.enabled) (.Values.replicas) }} + replicas: {{ .Values.replicas }} + {{- end }} + revisionHistoryLimit: {{ .Values.revisionHistoryLimit }} + selector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 6 }} + {{- with .Values.deploymentStrategy }} + strategy: + {{- toYaml . | trim | nindent 4 }} + {{- end }} + template: + metadata: + labels: + {{- include "grafana.selectorLabels" . | nindent 8 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + annotations: + checksum/config: {{ include "grafana.configData" . | sha256sum }} + {{- if .Values.dashboards }} + checksum/dashboards-json-config: {{ include (print $.Template.BasePath "/dashboards-json-configmap.yaml") . | sha256sum }} + {{- end }} + checksum/sc-dashboard-provider-config: {{ include "grafana.configDashboardProviderData" . | sha256sum }} + {{- if and (or (and (not .Values.admin.existingSecret) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD)) (and .Values.ldap.enabled (not .Values.ldap.existingSecret))) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + checksum/secret: {{ include "grafana.secretsData" . | sha256sum }} + {{- end }} + {{- if .Values.envRenderSecret }} + checksum/secret-env: {{ tpl (toYaml .Values.envRenderSecret) . | sha256sum }} + {{- end }} + kubectl.kubernetes.io/default-container: {{ .Chart.Name }} + {{- with .Values.podAnnotations }} + {{- toYaml . | nindent 8 }} + {{- end }} + spec: + {{- include "grafana.pod" . | nindent 6 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/extra-manifests.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/extra-manifests.yaml new file mode 100644 index 00000000..a9bb3b6b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/extra-manifests.yaml @@ -0,0 +1,4 @@ +{{ range .Values.extraObjects }} +--- +{{ tpl (toYaml .) $ }} +{{ end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/headless-service.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/headless-service.yaml new file mode 100644 index 00000000..3028589d --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/headless-service.yaml @@ -0,0 +1,22 @@ +{{- $sts := list "sts" "StatefulSet" "statefulset" -}} +{{- if or .Values.headlessService (and .Values.persistence.enabled (not .Values.persistence.existingClaim) (has .Values.persistence.type $sts)) }} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "grafana.fullname" . }}-headless + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + clusterIP: None + selector: + {{- include "grafana.selectorLabels" . | nindent 4 }} + type: ClusterIP + ports: + - name: {{ .Values.gossipPortName }}-tcp + port: 9094 +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/hpa.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/hpa.yaml new file mode 100644 index 00000000..46bbcb49 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/hpa.yaml @@ -0,0 +1,52 @@ +{{- $sts := list "sts" "StatefulSet" "statefulset" -}} +{{- if .Values.autoscaling.enabled }} +apiVersion: {{ include "grafana.hpa.apiVersion" . }} +kind: HorizontalPodAutoscaler +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + app.kubernetes.io/name: {{ include "grafana.name" . }} + helm.sh/chart: {{ include "grafana.chart" . }} + app.kubernetes.io/managed-by: {{ .Release.Service }} + app.kubernetes.io/instance: {{ .Release.Name }} +spec: + scaleTargetRef: + apiVersion: apps/v1 + {{- if has .Values.persistence.type $sts }} + kind: StatefulSet + {{- else }} + kind: Deployment + {{- end }} + name: {{ include "grafana.fullname" . }} + minReplicas: {{ .Values.autoscaling.minReplicas }} + maxReplicas: {{ .Values.autoscaling.maxReplicas }} + metrics: + {{- if .Values.autoscaling.targetMemory }} + - type: Resource + resource: + name: memory + {{- if eq (include "grafana.hpa.apiVersion" .) "autoscaling/v2beta1" }} + targetAverageUtilization: {{ .Values.autoscaling.targetMemory }} + {{- else }} + target: + type: Utilization + averageUtilization: {{ .Values.autoscaling.targetMemory }} + {{- end }} + {{- end }} + {{- if .Values.autoscaling.targetCPU }} + - type: Resource + resource: + name: cpu + {{- if eq (include "grafana.hpa.apiVersion" .) "autoscaling/v2beta1" }} + targetAverageUtilization: {{ .Values.autoscaling.targetCPU }} + {{- else }} + target: + type: Utilization + averageUtilization: {{ .Values.autoscaling.targetCPU }} + {{- end }} + {{- end }} + {{- if .Values.autoscaling.behavior }} + behavior: {{ toYaml .Values.autoscaling.behavior | nindent 4 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-deployment.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-deployment.yaml new file mode 100644 index 00000000..28231b80 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-deployment.yaml @@ -0,0 +1,131 @@ +{{ if .Values.imageRenderer.enabled }} +{{- $root := . -}} +apiVersion: apps/v1 +kind: Deployment +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.imageRenderer.labels" . | nindent 4 }} + {{- with .Values.imageRenderer.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.imageRenderer.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + {{- if and (not .Values.imageRenderer.autoscaling.enabled) (.Values.imageRenderer.replicas) }} + replicas: {{ .Values.imageRenderer.replicas }} + {{- end }} + revisionHistoryLimit: {{ .Values.imageRenderer.revisionHistoryLimit }} + selector: + matchLabels: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 6 }} + + {{- with .Values.imageRenderer.deploymentStrategy }} + strategy: + {{- toYaml . | trim | nindent 4 }} + {{- end }} + template: + metadata: + labels: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 8 }} + {{- with .Values.imageRenderer.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }} + {{- with .Values.imageRenderer.podAnnotations }} + {{- toYaml . | nindent 8 }} + {{- end }} + spec: + {{- with .Values.imageRenderer.schedulerName }} + schedulerName: "{{ . }}" + {{- end }} + {{- with .Values.imageRenderer.serviceAccountName }} + serviceAccountName: "{{ . }}" + {{- end }} + {{- with .Values.imageRenderer.securityContext }} + securityContext: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.imageRenderer.hostAliases }} + hostAliases: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.imageRenderer.priorityClassName }} + priorityClassName: {{ . }} + {{- end }} + {{- with .Values.imageRenderer.image.pullSecrets }} + imagePullSecrets: + {{- range . }} + - name: {{ tpl . $root }} + {{- end}} + {{- end }} + containers: + - name: {{ .Chart.Name }}-image-renderer + {{- $registry := include "system_default_registry" | default .Values.imageRenderer.image.registry -}} + {{- if .Values.imageRenderer.image.sha }} + image: "{{ $registry }}{{ .Values.imageRenderer.image.repository }}:{{ .Values.imageRenderer.image.tag }}@sha256:{{ .Values.imageRenderer.image.sha }}" + {{- else }} + image: "{{ $registry }}{{ .Values.imageRenderer.image.repository }}:{{ .Values.imageRenderer.image.tag }}" + {{- end }} + imagePullPolicy: {{ .Values.imageRenderer.image.pullPolicy }} + {{- if .Values.imageRenderer.command }} + command: + {{- range .Values.imageRenderer.command }} + - {{ . }} + {{- end }} + {{- end}} + ports: + - name: {{ .Values.imageRenderer.service.portName }} + containerPort: {{ .Values.imageRenderer.service.targetPort }} + protocol: TCP + livenessProbe: + httpGet: + path: / + port: {{ .Values.imageRenderer.service.portName }} + env: + - name: HTTP_PORT + value: {{ .Values.imageRenderer.service.targetPort | quote }} + {{- if .Values.imageRenderer.serviceMonitor.enabled }} + - name: ENABLE_METRICS + value: "true" + {{- end }} + {{- range $key, $value := .Values.imageRenderer.envValueFrom }} + - name: {{ $key | quote }} + valueFrom: + {{- tpl (toYaml $value) $ | nindent 16 }} + {{- end }} + {{- range $key, $value := .Values.imageRenderer.env }} + - name: {{ $key | quote }} + value: {{ $value | quote }} + {{- end }} + {{- with .Values.imageRenderer.containerSecurityContext }} + securityContext: + {{- toYaml . | nindent 12 }} + {{- end }} + volumeMounts: + - mountPath: /tmp + name: image-renderer-tmpfs + {{- with .Values.imageRenderer.resources }} + resources: + {{- toYaml . | nindent 12 }} + {{- end }} + {{- with .Values.imageRenderer.nodeSelector }} + nodeSelector: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.imageRenderer.affinity }} + affinity: + {{- tpl (toYaml .) $root | nindent 8 }} + {{- end }} + {{- with .Values.imageRenderer.tolerations }} + tolerations: + {{- toYaml . | nindent 8 }} + {{- end }} + volumes: + - name: image-renderer-tmpfs + emptyDir: {} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-hpa.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-hpa.yaml new file mode 100644 index 00000000..b0f0059b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-hpa.yaml @@ -0,0 +1,47 @@ +{{- if and .Values.imageRenderer.enabled .Values.imageRenderer.autoscaling.enabled }} +apiVersion: {{ include "grafana.hpa.apiVersion" . }} +kind: HorizontalPodAutoscaler +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer + namespace: {{ include "grafana.namespace" . }} + labels: + app.kubernetes.io/name: {{ include "grafana.name" . }}-image-renderer + helm.sh/chart: {{ include "grafana.chart" . }} + app.kubernetes.io/managed-by: {{ .Release.Service }} + app.kubernetes.io/instance: {{ .Release.Name }} +spec: + scaleTargetRef: + apiVersion: apps/v1 + kind: Deployment + name: {{ include "grafana.fullname" . }}-image-renderer + minReplicas: {{ .Values.imageRenderer.autoscaling.minReplicas }} + maxReplicas: {{ .Values.imageRenderer.autoscaling.maxReplicas }} + metrics: + {{- if .Values.imageRenderer.autoscaling.targetMemory }} + - type: Resource + resource: + name: memory + {{- if eq (include "grafana.hpa.apiVersion" .) "autoscaling/v2beta1" }} + targetAverageUtilization: {{ .Values.imageRenderer.autoscaling.targetMemory }} + {{- else }} + target: + type: Utilization + averageUtilization: {{ .Values.imageRenderer.autoscaling.targetMemory }} + {{- end }} + {{- end }} + {{- if .Values.imageRenderer.autoscaling.targetCPU }} + - type: Resource + resource: + name: cpu + {{- if eq (include "grafana.hpa.apiVersion" .) "autoscaling/v2beta1" }} + targetAverageUtilization: {{ .Values.imageRenderer.autoscaling.targetCPU }} + {{- else }} + target: + type: Utilization + averageUtilization: {{ .Values.imageRenderer.autoscaling.targetCPU }} + {{- end }} + {{- end }} + {{- if .Values.imageRenderer.autoscaling.behavior }} + behavior: {{ toYaml .Values.imageRenderer.autoscaling.behavior | nindent 4 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-network-policy.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-network-policy.yaml new file mode 100644 index 00000000..d1a0eb31 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-network-policy.yaml @@ -0,0 +1,79 @@ +{{- if and .Values.imageRenderer.enabled .Values.imageRenderer.networkPolicy.limitIngress }} +--- +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer-ingress + namespace: {{ include "grafana.namespace" . }} + annotations: + comment: Limit image-renderer ingress traffic from grafana +spec: + podSelector: + matchLabels: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 6 }} + {{- with .Values.imageRenderer.podLabels }} + {{- toYaml . | nindent 6 }} + {{- end }} + + policyTypes: + - Ingress + ingress: + - ports: + - port: {{ .Values.imageRenderer.service.targetPort }} + protocol: TCP + from: + - namespaceSelector: + matchLabels: + kubernetes.io/metadata.name: {{ include "grafana.namespace" . }} + podSelector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 14 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 14 }} + {{- end }} + {{- with .Values.imageRenderer.networkPolicy.extraIngressSelectors -}} + {{ toYaml . | nindent 8 }} + {{- end }} +{{- end }} + +{{- if and .Values.imageRenderer.enabled .Values.imageRenderer.networkPolicy.limitEgress }} +--- +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer-egress + namespace: {{ include "grafana.namespace" . }} + annotations: + comment: Limit image-renderer egress traffic to grafana +spec: + podSelector: + matchLabels: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 6 }} + {{- with .Values.imageRenderer.podLabels }} + {{- toYaml . | nindent 6 }} + {{- end }} + + policyTypes: + - Egress + egress: + # allow dns resolution + - ports: + - port: 53 + protocol: UDP + - port: 53 + protocol: TCP + # talk only to grafana + - ports: + - port: {{ .Values.service.targetPort }} + protocol: TCP + to: + - namespaceSelector: + matchLabels: + name: {{ include "grafana.namespace" . }} + podSelector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 14 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 14 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-service.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-service.yaml new file mode 100644 index 00000000..f8da127c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-service.yaml @@ -0,0 +1,31 @@ +{{- if and .Values.imageRenderer.enabled .Values.imageRenderer.service.enabled }} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.imageRenderer.labels" . | nindent 4 }} + {{- with .Values.imageRenderer.service.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.imageRenderer.service.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + type: ClusterIP + {{- with .Values.imageRenderer.service.clusterIP }} + clusterIP: {{ . }} + {{- end }} + ports: + - name: {{ .Values.imageRenderer.service.portName }} + port: {{ .Values.imageRenderer.service.port }} + protocol: TCP + targetPort: {{ .Values.imageRenderer.service.targetPort }} + {{- with .Values.imageRenderer.appProtocol }} + appProtocol: {{ . }} + {{- end }} + selector: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 4 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-servicemonitor.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-servicemonitor.yaml new file mode 100644 index 00000000..5d9f09d2 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/image-renderer-servicemonitor.yaml @@ -0,0 +1,48 @@ +{{- if .Values.imageRenderer.serviceMonitor.enabled }} +--- +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "grafana.fullname" . }}-image-renderer + {{- if .Values.imageRenderer.serviceMonitor.namespace }} + namespace: {{ tpl .Values.imageRenderer.serviceMonitor.namespace . }} + {{- else }} + namespace: {{ include "grafana.namespace" . }} + {{- end }} + labels: + {{- include "grafana.imageRenderer.labels" . | nindent 4 }} + {{- with .Values.imageRenderer.serviceMonitor.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + endpoints: + - port: {{ .Values.imageRenderer.service.portName }} + {{- with .Values.imageRenderer.serviceMonitor.interval }} + interval: {{ . }} + {{- end }} + {{- with .Values.imageRenderer.serviceMonitor.scrapeTimeout }} + scrapeTimeout: {{ . }} + {{- end }} + honorLabels: true + path: {{ .Values.imageRenderer.serviceMonitor.path }} + scheme: {{ .Values.imageRenderer.serviceMonitor.scheme }} + {{- with .Values.imageRenderer.serviceMonitor.tlsConfig }} + tlsConfig: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- with .Values.imageRenderer.serviceMonitor.relabelings }} + relabelings: + {{- toYaml . | nindent 6 }} + {{- end }} + jobLabel: "{{ .Release.Name }}-image-renderer" + selector: + matchLabels: + {{- include "grafana.imageRenderer.selectorLabels" . | nindent 6 }} + namespaceSelector: + matchNames: + - {{ include "grafana.namespace" . }} + {{- with .Values.imageRenderer.serviceMonitor.targetLabels }} + targetLabels: + {{- toYaml . | nindent 4 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/ingress.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/ingress.yaml new file mode 100644 index 00000000..b2ffd810 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/ingress.yaml @@ -0,0 +1,78 @@ +{{- if .Values.ingress.enabled -}} +{{- $ingressApiIsStable := eq (include "grafana.ingress.isStable" .) "true" -}} +{{- $ingressSupportsIngressClassName := eq (include "grafana.ingress.supportsIngressClassName" .) "true" -}} +{{- $ingressSupportsPathType := eq (include "grafana.ingress.supportsPathType" .) "true" -}} +{{- $fullName := include "grafana.fullname" . -}} +{{- $servicePort := .Values.service.port -}} +{{- $ingressPath := .Values.ingress.path -}} +{{- $ingressPathType := .Values.ingress.pathType -}} +{{- $extraPaths := .Values.ingress.extraPaths -}} +apiVersion: {{ include "grafana.ingress.apiVersion" . }} +kind: Ingress +metadata: + name: {{ $fullName }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.ingress.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.ingress.annotations }} + annotations: + {{- range $key, $value := . }} + {{ $key }}: {{ tpl $value $ | quote }} + {{- end }} + {{- end }} +spec: + {{- if and $ingressSupportsIngressClassName .Values.ingress.ingressClassName }} + ingressClassName: {{ .Values.ingress.ingressClassName }} + {{- end -}} + {{- with .Values.ingress.tls }} + tls: + {{- tpl (toYaml .) $ | nindent 4 }} + {{- end }} + rules: + {{- if .Values.ingress.hosts }} + {{- range .Values.ingress.hosts }} + - host: {{ tpl . $ | quote }} + http: + paths: + {{- with $extraPaths }} + {{- toYaml . | nindent 10 }} + {{- end }} + - path: {{ $ingressPath }} + {{- if $ingressSupportsPathType }} + pathType: {{ $ingressPathType }} + {{- end }} + backend: + {{- if $ingressApiIsStable }} + service: + name: {{ $fullName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $fullName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- end }} + {{- else }} + - http: + paths: + - backend: + {{- if $ingressApiIsStable }} + service: + name: {{ $fullName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $fullName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- with $ingressPath }} + path: {{ . }} + {{- end }} + {{- if $ingressSupportsPathType }} + pathType: {{ $ingressPathType }} + {{- end }} + {{- end -}} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/networkpolicy.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/networkpolicy.yaml new file mode 100644 index 00000000..4cd3ed69 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/networkpolicy.yaml @@ -0,0 +1,61 @@ +{{- if .Values.networkPolicy.enabled }} +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + policyTypes: + {{- if .Values.networkPolicy.ingress }} + - Ingress + {{- end }} + {{- if .Values.networkPolicy.egress.enabled }} + - Egress + {{- end }} + podSelector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 6 }} + + {{- if .Values.networkPolicy.egress.enabled }} + egress: + {{- if not .Values.networkPolicy.egress.blockDNSResolution }} + - ports: + - port: 53 + protocol: UDP + {{- end }} + - ports: + {{ .Values.networkPolicy.egress.ports | toJson }} + {{- with .Values.networkPolicy.egress.to }} + to: + {{- toYaml . | nindent 12 }} + {{- end }} + {{- end }} + {{- if .Values.networkPolicy.ingress }} + ingress: + - ports: + - port: {{ .Values.service.targetPort }} + {{- if not .Values.networkPolicy.allowExternal }} + from: + - podSelector: + matchLabels: + {{ include "grafana.fullname" . }}-client: "true" + {{- with .Values.networkPolicy.explicitNamespacesSelector }} + - namespaceSelector: + {{- toYaml . | nindent 12 }} + {{- end }} + - podSelector: + matchLabels: + {{- include "grafana.labels" . | nindent 14 }} + role: read + {{- end }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/nginx-config.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/nginx-config.yaml new file mode 100644 index 00000000..557471f6 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/nginx-config.yaml @@ -0,0 +1,94 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: grafana-nginx-proxy-config + namespace: {{ template "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} +data: + nginx.conf: |- + worker_processes auto; + error_log /dev/stdout warn; + pid /var/cache/nginx/nginx.pid; + + events { + worker_connections 1024; + } + + http { + include /etc/nginx/mime.types; + log_format main '[$time_local - $status] $remote_addr - $remote_user $request ($http_referer)'; + + proxy_connect_timeout 10; + proxy_read_timeout 180; + proxy_send_timeout 5; + proxy_buffering off; + proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=my_zone:100m inactive=1d max_size=10g; + + map $http_upgrade $connection_upgrade { + default upgrade; + '' close; + } + + server { + listen 8080; + access_log off; + + gzip on; + gzip_min_length 1k; + gzip_comp_level 2; + gzip_types text/plain application/javascript application/x-javascript text/css application/xml text/javascript image/jpeg image/gif image/png; + gzip_vary on; + gzip_disable "MSIE [1-6]\."; + + proxy_set_header Host $host; + + location /api/dashboards { + proxy_pass http://localhost:3000; + } + + location /api/search { + proxy_pass http://localhost:3000; + + sub_filter_types application/json; + sub_filter_once off; + } + + location /api/live/ { + proxy_http_version 1.1; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection $connection_upgrade; + proxy_set_header Host $http_host; + proxy_pass http://localhost:3000; + } + + location / { + proxy_cache my_zone; + proxy_cache_valid 200 302 1d; + proxy_cache_valid 301 30d; + proxy_cache_valid any 5m; + proxy_cache_bypass $http_cache_control; + add_header X-Proxy-Cache $upstream_cache_status; + add_header Cache-Control "public"; + + proxy_pass http://localhost:3000/; + + sub_filter_once off; + + {{- if eq .Values.global.cattle.clusterId "local" -}} + sub_filter '"appSubUrl":""' '"appSubUrl":"/api/v1/namespaces/{{ template "grafana.namespace" . }}/services/http:{{ template "grafana.fullname" . }}:{{ .Values.service.port }}/proxy"'; + {{- else -}} + sub_filter '"appSubUrl":""' '"appSubUrl":"/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "grafana.namespace" . }}/services/http:{{ template "grafana.fullname" . }}:{{ .Values.service.port }}/proxy"'; + {{- end -}} + + sub_filter ':"/avatar/' ':"avatar/'; + + if ($request_filename ~ .*\.(?:js|css|jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm)$) { + expires 90d; + } + + rewrite ^/k8s/clusters/.*/proxy(.*) /$1 break; + + } + } + } diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/poddisruptionbudget.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/poddisruptionbudget.yaml new file mode 100644 index 00000000..05251214 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/poddisruptionbudget.yaml @@ -0,0 +1,22 @@ +{{- if .Values.podDisruptionBudget }} +apiVersion: {{ include "grafana.podDisruptionBudget.apiVersion" . }} +kind: PodDisruptionBudget +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + {{- with .Values.podDisruptionBudget.minAvailable }} + minAvailable: {{ . }} + {{- end }} + {{- with .Values.podDisruptionBudget.maxUnavailable }} + maxUnavailable: {{ . }} + {{- end }} + selector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 6 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/podsecuritypolicy.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/podsecuritypolicy.yaml new file mode 100644 index 00000000..973caccd --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/podsecuritypolicy.yaml @@ -0,0 +1,45 @@ +{{- if and (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ include "grafana.fullname" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} +{{- if .Values.rbac.pspAnnotations }} + annotations: {{ toYaml .Values.rbac.pspAnnotations | nindent 4 }} +{{- end }} +spec: + privileged: false + allowPrivilegeEscalation: false + requiredDropCapabilities: + # Default set from Docker, with DAC_OVERRIDE and CHOWN + - ALL + volumes: + - 'configMap' + - 'emptyDir' + - 'projected' + - 'csi' + - 'secret' + - 'downwardAPI' + - 'persistentVolumeClaim' + hostNetwork: false + hostIPC: false + hostPID: false + runAsUser: + rule: 'RunAsAny' + seLinux: + rule: 'RunAsAny' + supplementalGroups: + rule: 'MustRunAs' + ranges: + # Forbid adding the root group. + - min: 1 + max: 65535 + fsGroup: + rule: 'MustRunAs' + ranges: + # Forbid adding the root group. + - min: 1 + max: 65535 + readOnlyRootFilesystem: false +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/pvc.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/pvc.yaml new file mode 100644 index 00000000..c9b23430 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/pvc.yaml @@ -0,0 +1,41 @@ +{{- if and .Values.persistence.enabled (not .Values.persistence.existingClaim) (eq .Values.persistence.type "pvc")}} +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.persistence.extraPvcLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.persistence.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.persistence.finalizers }} + finalizers: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + accessModes: +{{- $_ := required "Must provide at least one access mode for persistent volumes used by Grafana" .Values.persistence.accessModes }} +{{- $_ := required "Must provide at least one access mode for persistent volumes used by Grafana" (first .Values.persistence.accessModes) }} + {{- range .Values.persistence.accessModes }} + - {{ . | quote }} + {{- end }} + resources: + requests: + storage: {{ .Values.persistence.size | quote }} + {{- if (lookup "v1" "PersistentVolumeClaim" (include "grafana.namespace" .) (include "grafana.fullname" .)) }} + volumeName: {{ (lookup "v1" "PersistentVolumeClaim" (include "grafana.namespace" .) (include "grafana.fullname" .)).spec.volumeName }} + {{- end }} + {{- with .Values.persistence.storageClassName }} + storageClassName: {{ . }} + {{- end }} + {{- with .Values.persistence.selectorLabels }} + selector: + matchLabels: + {{- toYaml . | nindent 6 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/role.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/role.yaml new file mode 100644 index 00000000..469b6f4e --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/role.yaml @@ -0,0 +1,32 @@ +{{- if and .Values.rbac.create (not .Values.rbac.useExistingRole) -}} +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +{{- if or (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) (and .Values.rbac.namespaced (or .Values.sidecar.dashboards.enabled .Values.sidecar.datasources.enabled .Values.sidecar.plugins.enabled .Values.rbac.extraRoleRules)) }} +rules: + {{- if and (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") }} + - apiGroups: ['extensions'] + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: [{{ include "grafana.fullname" . }}] + {{- end }} + {{- if and .Values.rbac.namespaced (or .Values.sidecar.dashboards.enabled .Values.sidecar.datasources.enabled .Values.sidecar.plugins.enabled) }} + - apiGroups: [""] # "" indicates the core API group + resources: ["configmaps", "secrets"] + verbs: ["get", "watch", "list"] + {{- end }} + {{- with .Values.rbac.extraRoleRules }} + {{- toYaml . | nindent 2 }} + {{- end}} +{{- else }} +rules: [] +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/rolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/rolebinding.yaml new file mode 100644 index 00000000..58f77c6b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/rolebinding.yaml @@ -0,0 +1,25 @@ +{{- if .Values.rbac.create }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + {{- if .Values.rbac.useExistingRole }} + name: {{ .Values.rbac.useExistingRole }} + {{- else }} + name: {{ include "grafana.fullname" . }} + {{- end }} +subjects: +- kind: ServiceAccount + name: {{ include "grafana.serviceAccountName" . }} + namespace: {{ include "grafana.namespace" . }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret-env.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret-env.yaml new file mode 100644 index 00000000..eb14aac7 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret-env.yaml @@ -0,0 +1,14 @@ +{{- if .Values.envRenderSecret }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ include "grafana.fullname" . }}-env + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} +type: Opaque +data: +{{- range $key, $val := .Values.envRenderSecret }} + {{ $key }}: {{ tpl ($val | toString) $ | b64enc | quote }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret.yaml new file mode 100644 index 00000000..fd2ca50f --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/secret.yaml @@ -0,0 +1,16 @@ +{{- if or (and (not .Values.admin.existingSecret) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION)) (and .Values.ldap.enabled (not .Values.ldap.existingSecret)) }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +type: Opaque +data: + {{- include "grafana.secretsData" . | nindent 2 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/service.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/service.yaml new file mode 100644 index 00000000..e9396a15 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/service.yaml @@ -0,0 +1,61 @@ +{{- if .Values.service.enabled }} +{{- $root := . }} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.service.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.service.annotations }} + annotations: + {{- tpl (toYaml . | nindent 4) $root }} + {{- end }} +spec: + {{- if (or (eq .Values.service.type "ClusterIP") (empty .Values.service.type)) }} + type: ClusterIP + {{- with .Values.service.clusterIP }} + clusterIP: {{ . }} + {{- end }} + {{- else if eq .Values.service.type "LoadBalancer" }} + type: LoadBalancer + {{- with .Values.service.loadBalancerIP }} + loadBalancerIP: {{ . }} + {{- end }} + {{- with .Values.service.loadBalancerClass }} + loadBalancerClass: {{ . }} + {{- end }} + {{- with .Values.service.loadBalancerSourceRanges }} + loadBalancerSourceRanges: + {{- toYaml . | nindent 4 }} + {{- end }} + {{- else }} + type: {{ .Values.service.type }} + {{- end }} + {{- with .Values.service.externalIPs }} + externalIPs: + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.service.externalTrafficPolicy }} + externalTrafficPolicy: {{ . }} + {{- end }} + ports: + - name: {{ .Values.service.portName }} + port: {{ .Values.service.port }} + protocol: TCP + targetPort: {{ .Values.service.targetPort }} + {{- with .Values.service.appProtocol }} + appProtocol: {{ . }} + {{- end }} + {{- if (and (eq .Values.service.type "NodePort") (not (empty .Values.service.nodePort))) }} + nodePort: {{ .Values.service.nodePort }} + {{- end }} + {{- with .Values.extraExposePorts }} + {{- tpl (toYaml . | nindent 4) $root }} + {{- end }} + selector: + {{- include "grafana.selectorLabels" . | nindent 4 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/serviceaccount.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/serviceaccount.yaml new file mode 100644 index 00000000..ffca0717 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/serviceaccount.yaml @@ -0,0 +1,17 @@ +{{- if .Values.serviceAccount.create }} +apiVersion: v1 +kind: ServiceAccount +automountServiceAccountToken: {{ .Values.serviceAccount.autoMount | default .Values.serviceAccount.automountServiceAccountToken }} +metadata: + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.serviceAccount.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.serviceAccount.annotations }} + annotations: + {{- tpl (toYaml . | nindent 4) $ }} + {{- end }} + name: {{ include "grafana.serviceAccountName" . }} + namespace: {{ include "grafana.namespace" . }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/servicemonitor.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/servicemonitor.yaml new file mode 100644 index 00000000..b321b126 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/servicemonitor.yaml @@ -0,0 +1,68 @@ +{{- if .Values.serviceMonitor.enabled }} +--- +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "grafana.fullname" . }} + {{- if .Values.serviceMonitor.namespace }} + namespace: {{ tpl .Values.serviceMonitor.namespace . }} + {{- else }} + namespace: {{ include "grafana.namespace" . }} + {{- end }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.serviceMonitor.labels }} + {{- tpl (toYaml . | nindent 4) $ }} + {{- end }} +spec: + endpoints: + - port: {{ .Values.service.portName }} + {{- with .Values.serviceMonitor.interval }} + interval: {{ . }} + {{- end }} + {{- with .Values.serviceMonitor.scrapeTimeout }} + scrapeTimeout: {{ . }} + {{- end }} + honorLabels: true + path: {{ .Values.serviceMonitor.path }} + scheme: {{ .Values.serviceMonitor.scheme }} + {{- with .Values.serviceMonitor.tlsConfig }} + tlsConfig: + {{- toYaml . | nindent 6 }} + {{- end }} + metricRelabelings: + {{- if .Values.serviceMonitor.metricRelabelings }} + {{- toYaml .Values.serviceMonitor.metricRelabelings | nindent 6 }} + {{- end }} + {{ if .Values.global.cattle.clusterId }} + - sourceLabels: [__address__] + targetLabel: cluster_id + replacement: {{ .Values.global.cattle.clusterId }} + {{- end }} + {{ if .Values.global.cattle.clusterName }} + - sourceLabels: [__address__] + targetLabel: cluster_name + replacement: {{ .Values.global.cattle.clusterName }} + {{- end }} + {{- if .Values.serviceMonitor.relabelings }} + {{- with .Values.serviceMonitor.relabelings }} + relabelings: + {{- toYaml . | nindent 6 }} + {{- end }} + {{- end }} + {{- with .Values.serviceMonitor.metricRelabelings }} + metricRelabelings: + {{- toYaml . | nindent 6 }} + {{- end }} + jobLabel: "{{ .Release.Name }}" + selector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 6 }} + namespaceSelector: + matchNames: + - {{ include "grafana.namespace" . }} + {{- with .Values.serviceMonitor.targetLabels }} + targetLabels: + {{- toYaml . | nindent 4 }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/statefulset.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/statefulset.yaml new file mode 100644 index 00000000..49278083 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/statefulset.yaml @@ -0,0 +1,58 @@ +{{- $sts := list "sts" "StatefulSet" "statefulset" -}} +{{- if (or (.Values.useStatefulSet) (and .Values.persistence.enabled (not .Values.persistence.existingClaim) (has .Values.persistence.type $sts)))}} +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: {{ include "grafana.fullname" . }} + namespace: {{ include "grafana.namespace" . }} + labels: + {{- include "grafana.labels" . | nindent 4 }} + {{- with .Values.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + replicas: {{ .Values.replicas }} + selector: + matchLabels: + {{- include "grafana.selectorLabels" . | nindent 6 }} + serviceName: {{ include "grafana.fullname" . }}-headless + template: + metadata: + labels: + {{- include "grafana.selectorLabels" . | nindent 8 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + annotations: + checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }} + checksum/dashboards-json-config: {{ include (print $.Template.BasePath "/dashboards-json-configmap.yaml") . | sha256sum }} + checksum/sc-dashboard-provider-config: {{ include (print $.Template.BasePath "/configmap-dashboard-provider.yaml") . | sha256sum }} + {{- if and (or (and (not .Values.admin.existingSecret) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD__FILE) (not .Values.env.GF_SECURITY_ADMIN_PASSWORD)) (and .Values.ldap.enabled (not .Values.ldap.existingSecret))) (not .Values.env.GF_SECURITY_DISABLE_INITIAL_ADMIN_CREATION) }} + checksum/secret: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }} + {{- end }} + kubectl.kubernetes.io/default-container: {{ .Chart.Name }} + {{- with .Values.podAnnotations }} + {{- toYaml . | nindent 8 }} + {{- end }} + spec: + {{- include "grafana.pod" . | nindent 6 }} + {{- if .Values.persistence.enabled}} + volumeClaimTemplates: + - metadata: + name: storage + spec: +{{- $_ := required "Must provide at least one access mode for persistent volumes used by Grafana" .Values.persistence.accessModes }} +{{- $_ := required "Must provide at least one access mode for persistent volumes used by Grafana" (first .Values.persistence.accessModes) }} + accessModes: {{ .Values.persistence.accessModes }} + storageClassName: {{ .Values.persistence.storageClassName }} + resources: + requests: + storage: {{ required "Must provide size for persistent volumes used by Grafana" .Values.persistence.size }} + {{- with .Values.persistence.selectorLabels }} + selector: + matchLabels: + {{- toYaml . | nindent 10 }} + {{- end }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-configmap.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-configmap.yaml new file mode 100644 index 00000000..4eec1b6a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-configmap.yaml @@ -0,0 +1,20 @@ +{{- if .Values.testFramework.enabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ include "grafana.fullname" . }}-test + namespace: {{ include "grafana.namespace" . }} + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" + labels: + {{- include "grafana.labels" . | nindent 4 }} +data: + run.sh: |- + @test "Test Health" { + url="http://{{ include "grafana.fullname" . }}/api/health" + + code=$(wget --server-response --spider --timeout 10 --tries 1 ${url} 2>&1 | awk '/^ HTTP/{print $2}') + [ "$code" == "200" ] + } +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-podsecuritypolicy.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-podsecuritypolicy.yaml new file mode 100644 index 00000000..70a0a884 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-podsecuritypolicy.yaml @@ -0,0 +1,32 @@ +{{- if and (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") .Values.testFramework.enabled (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ include "grafana.fullname" . }}-test + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" + labels: + {{- include "grafana.labels" . | nindent 4 }} +spec: + allowPrivilegeEscalation: true + privileged: false + hostNetwork: false + hostIPC: false + hostPID: false + fsGroup: + rule: RunAsAny + seLinux: + rule: RunAsAny + supplementalGroups: + rule: RunAsAny + runAsUser: + rule: RunAsAny + volumes: + - configMap + - downwardAPI + - emptyDir + - projected + - csi + - secret +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-role.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-role.yaml new file mode 100644 index 00000000..976418b1 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-role.yaml @@ -0,0 +1,17 @@ +{{- if and (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") .Values.testFramework.enabled (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ include "grafana.fullname" . }}-test + namespace: {{ include "grafana.namespace" . }} + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" + labels: + {{- include "grafana.labels" . | nindent 4 }} +rules: + - apiGroups: ['policy'] + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: [{{ include "grafana.fullname" . }}-test] +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-rolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-rolebinding.yaml new file mode 100644 index 00000000..509566ec --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-rolebinding.yaml @@ -0,0 +1,20 @@ +{{- if and (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") .Values.testFramework.enabled (or .Values.global.cattle.psp.enabled .Values.rbac.pspEnabled) }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: {{ include "grafana.fullname" . }}-test + namespace: {{ include "grafana.namespace" . }} + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" + labels: + {{- include "grafana.labels" . | nindent 4 }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: {{ include "grafana.fullname" . }}-test +subjects: + - kind: ServiceAccount + name: {{ include "grafana.serviceAccountNameTest" . }} + namespace: {{ include "grafana.namespace" . }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-serviceaccount.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-serviceaccount.yaml new file mode 100644 index 00000000..38fba359 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test-serviceaccount.yaml @@ -0,0 +1,12 @@ +{{- if and .Values.testFramework.enabled .Values.serviceAccount.create }} +apiVersion: v1 +kind: ServiceAccount +metadata: + labels: + {{- include "grafana.labels" . | nindent 4 }} + name: {{ include "grafana.serviceAccountNameTest" . }} + namespace: {{ include "grafana.namespace" . }} + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test.yaml new file mode 100644 index 00000000..83aaa185 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/templates/tests/test.yaml @@ -0,0 +1,53 @@ +{{- if .Values.testFramework.enabled }} +{{- $root := . }} +apiVersion: v1 +kind: Pod +metadata: + name: {{ include "grafana.fullname" . }}-test + labels: + {{- include "grafana.labels" . | nindent 4 }} + annotations: + "helm.sh/hook": test-success + "helm.sh/hook-delete-policy": "before-hook-creation,hook-succeeded" + namespace: {{ include "grafana.namespace" . }} +spec: + serviceAccountName: {{ include "grafana.serviceAccountNameTest" . }} + {{- with .Values.testFramework.securityContext }} + securityContext: + {{- toYaml . | nindent 4 }} + {{- end }} + {{- if or .Values.image.pullSecrets .Values.global.imagePullSecrets }} + imagePullSecrets: + {{- include "grafana.imagePullSecrets" (dict "root" $root "imagePullSecrets" .Values.image.pullSecrets) | nindent 4 }} + {{- end }} + {{- with .Values.nodeSelector }} + nodeSelector: + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with .Values.affinity }} + affinity: + {{- tpl (toYaml .) $root | nindent 4 }} + {{- end }} + {{- with .Values.tolerations }} + tolerations: + {{- toYaml . | nindent 4 }} + {{- end }} + containers: + - name: {{ .Release.Name }}-test + image: "{{ template "system_default_registry" . | default .Values.testFramework.image.registry }}/{{ .Values.testFramework.image.repository }}:{{ .Values.testFramework.image.tag }}" + imagePullPolicy: "{{ .Values.testFramework.imagePullPolicy}}" + command: ["/opt/bats/bin/bats", "-t", "/tests/run.sh"] + volumeMounts: + - mountPath: /tests + name: tests + readOnly: true + {{- with .Values.testFramework.resources }} + resources: + {{- toYaml . | nindent 8 }} + {{- end }} + volumes: + - name: tests + configMap: + name: {{ include "grafana.fullname" . }}-test + restartPolicy: Never +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/charts/grafana/values.yaml b/charts/rancher-project-monitoring/0.4.2/charts/grafana/values.yaml new file mode 100644 index 00000000..6bb9e4f5 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/charts/grafana/values.yaml @@ -0,0 +1,1263 @@ +global: + cattle: + psp: + enabled: false + systemDefaultRegistry: "" + projectNamespaces: [] + +rbac: + create: true + ## Use an existing ClusterRole/Role (depending on rbac.namespaced false/true) + # useExistingRole: name-of-some-role + # useExistingClusterRole: name-of-some-clusterRole + pspEnabled: false + pspUseAppArmor: false + namespaced: false + extraRoleRules: [] + # - apiGroups: [] + # resources: [] + # verbs: [] + extraClusterRoleRules: [] + # - apiGroups: [] + # resources: [] + # verbs: [] +serviceAccount: + create: true + name: + nameTest: + ## ServiceAccount labels. + labels: {} + ## Service account annotations. Can be templated. + # annotations: + # eks.amazonaws.com/role-arn: arn:aws:iam::123456789000:role/iam-role-name-here + + ## autoMount is deprecated in favor of automountServiceAccountToken + # autoMount: false + automountServiceAccountToken: true + +replicas: 1 + +## Create a headless service for the deployment +headlessService: false + +## Should the service account be auto mounted on the pod +automountServiceAccountToken: true + +## Create HorizontalPodAutoscaler object for deployment type +# +autoscaling: + enabled: false +# minReplicas: 1 +# maxReplicas: 10 +# metrics: +# - type: Resource +# resource: +# name: cpu +# targetAverageUtilization: 60 +# - type: Resource +# resource: +# name: memory +# targetAverageUtilization: 60 + +## See `kubectl explain poddisruptionbudget.spec` for more +## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/ +podDisruptionBudget: {} +# apiVersion: "" +# minAvailable: 1 +# maxUnavailable: 1 + +## See `kubectl explain deployment.spec.strategy` for more +## ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy +deploymentStrategy: + type: RollingUpdate + +readinessProbe: + httpGet: + path: /api/health + port: 3000 + +livenessProbe: + httpGet: + path: /api/health + port: 3000 + initialDelaySeconds: 60 + timeoutSeconds: 30 + failureThreshold: 10 + +## Use an alternate scheduler, e.g. "stork". +## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/ +## +# schedulerName: "default-scheduler" + +image: + repository: rancher/mirrored-grafana-grafana + # Overrides the Grafana image tag whose default is the chart appVersion + tag: 10.3.3 + sha: "" + pullPolicy: IfNotPresent + + ## Optionally specify an array of imagePullSecrets. + ## Secrets must be manually created in the namespace. + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ + ## Can be templated. + ## + # pullSecrets: + # - myRegistrKeySecretName + +testFramework: + enabled: false + image: "rancher/mirrored-bats-bats" + tag: "v1.4.1" + imagePullPolicy: IfNotPresent + securityContext: + runAsNonRoot: true + runAsUser: 1000 + +securityContext: + runAsNonRoot: true + runAsUser: 472 + runAsGroup: 472 + fsGroup: 472 + +containerSecurityContext: {} + +# Enable creating the grafana configmap +createConfigmap: true + +# Extra configmaps to mount in grafana pods +# Values are templated. +extraConfigmapMounts: [] + # - name: certs-configmap + # mountPath: /etc/grafana/ssl/ + # subPath: certificates.crt # (optional) + # configMap: certs-configmap + # readOnly: true + + +extraEmptyDirMounts: [] + # - name: provisioning-notifiers + # mountPath: /etc/grafana/provisioning/notifiers + + +# Apply extra labels to common labels. +extraLabels: {} + +## Assign a PriorityClassName to pods if set +# priorityClassName: + +downloadDashboardsImage: + repository: rancher/mirrored-curlimages-curl + tag: 7.85.0 + sha: "" + pullPolicy: IfNotPresent + +downloadDashboards: + env: {} + envFromSecret: "" + resources: {} + securityContext: {} + +## Pod Annotations +# podAnnotations: {} + +## Pod Labels +# podLabels: {} + +podPortName: grafana +gossipPortName: gossip +## Deployment annotations +# annotations: {} + +## Expose the grafana service to be accessed from outside the cluster (LoadBalancer service). +## or access it from within the cluster (ClusterIP service). Set the service type and the port to serve it. +## ref: http://kubernetes.io/docs/user-guide/services/ +## +service: + enabled: true + type: ClusterIP + loadBalancerIP: "" + loadBalancerClass: "" + loadBalancerSourceRanges: [] + port: 80 + targetPort: 3000 + # targetPort: 4181 To be used with a proxy extraContainer + ## Service annotations. Can be templated. + annotations: {} + labels: {} + portName: service + # Adds the appProtocol field to the service. This allows to work with istio protocol selection. Ex: "http" or "tcp" + appProtocol: "" + +serviceMonitor: + ## If true, a ServiceMonitor CRD is created for a prometheus operator + ## https://github.com/coreos/prometheus-operator + ## + enabled: false + path: /metrics + # namespace: monitoring (defaults to use the namespace this chart is deployed to) + labels: {} + interval: 30s + scheme: http + tlsConfig: {} + scrapeTimeout: 30s + relabelings: [] + metricRelabelings: [] + targetLabels: [] + +extraExposePorts: [] + # - name: keycloak + # port: 8080 + # targetPort: 8080 + +# overrides pod.spec.hostAliases in the grafana deployment's pods +hostAliases: [] + # - ip: "1.2.3.4" + # hostnames: + # - "my.host.com" + +ingress: + enabled: false + # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName + # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress + # ingressClassName: nginx + # Values can be templated + annotations: {} + # kubernetes.io/ingress.class: nginx + # kubernetes.io/tls-acme: "true" + labels: {} + path: / + + # pathType is only for k8s >= 1.1= + pathType: Prefix + + hosts: + - chart-example.local + ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services. + extraPaths: [] + # - path: /* + # backend: + # serviceName: ssl-redirect + # servicePort: use-annotation + ## Or for k8s > 1.19 + # - path: /* + # pathType: Prefix + # backend: + # service: + # name: ssl-redirect + # port: + # name: use-annotation + + + tls: [] + # - secretName: chart-example-tls + # hosts: + # - chart-example.local + +resources: {} +# limits: +# cpu: 100m +# memory: 128Mi +# requests: +# cpu: 100m +# memory: 128Mi + +## Node labels for pod assignment +## ref: https://kubernetes.io/docs/user-guide/node-selection/ +# +nodeSelector: {} + +## Tolerations for pod assignment +## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ +## +tolerations: [] + +## Affinity for pod assignment (evaluated as template) +## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity +## +affinity: {} + +## Topology Spread Constraints +## ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ +## +topologySpreadConstraints: [] + +## Additional init containers (evaluated as template) +## ref: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ +## +extraInitContainers: [] + +## Enable an Specify container in extraContainers. This is meant to allow adding an authentication proxy to a grafana pod +extraContainers: "" +# extraContainers: | +# - name: proxy +# image: quay.io/gambol99/keycloak-proxy:latest +# args: +# - -provider=github +# - -client-id= +# - -client-secret= +# - -github-org= +# - -email-domain=* +# - -cookie-secret= +# - -http-address=http://0.0.0.0:4181 +# - -upstream-url=http://127.0.0.1:3000 +# ports: +# - name: proxy-web +# containerPort: 4181 + +## Volumes that can be used in init containers that will not be mounted to deployment pods +extraContainerVolumes: [] +# - name: volume-from-secret +# secret: +# secretName: secret-to-mount +# - name: empty-dir-volume +# emptyDir: {} + +## Enable persistence using Persistent Volume Claims +## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/ +## +persistence: + type: pvc + enabled: false + # storageClassName: default + accessModes: + - ReadWriteOnce + size: 10Gi + # annotations: {} + finalizers: + - kubernetes.io/pvc-protection + # selectorLabels: {} + ## Sub-directory of the PV to mount. Can be templated. + # subPath: "" + ## Name of an existing PVC. Can be templated. + # existingClaim: + ## Extra labels to apply to a PVC. + extraPvcLabels: {} + + ## If persistence is not enabled, this allows to mount the + ## local storage in-memory to improve performance + ## + inMemory: + enabled: false + ## The maximum usage on memory medium EmptyDir would be + ## the minimum value between the SizeLimit specified + ## here and the sum of memory limits of all containers in a pod + ## + # sizeLimit: 300Mi + +initChownData: + ## If false, data ownership will not be reset at startup + ## This allows the grafana-server to be run with an arbitrary user + ## + enabled: true + + ## initChownData container image + ## + image: + repository: rancher/mirrored-library-busybox + tag: "1.31.1" + sha: "" + pullPolicy: IfNotPresent + + ## initChownData resource requests and limits + ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/ + ## + resources: {} + # limits: + # cpu: 100m + # memory: 128Mi + # requests: + # cpu: 100m + # memory: 128Mi + +# Administrator credentials when not using an existing secret (see below) +adminUser: admin +# adminPassword: strongpassword + +# Use an existing secret for the admin user. +admin: + ## Name of the secret. Can be templated. + existingSecret: "" + userKey: admin-user + passwordKey: admin-password + +## Define command to be executed at startup by grafana container +## Needed if using `vault-env` to manage secrets (ref: https://banzaicloud.com/blog/inject-secrets-into-pods-vault/) +## Default is "run.sh" as defined in grafana's Dockerfile +# command: +# - "sh" +# - "/run.sh" + +## Optionally define args if command is used +## Needed if using `hashicorp/envconsul` to manage secrets +## By default no arguments are set +# args: +# - "-secret" +# - "secret/grafana" +# - "./grafana" + +## Extra environment variables that will be pass onto deployment pods +## +## to provide grafana with access to CloudWatch on AWS EKS: +## 1. create an iam role of type "Web identity" with provider oidc.eks.* (note the provider for later) +## 2. edit the "Trust relationships" of the role, add a line inside the StringEquals clause using the +## same oidc eks provider as noted before (same as the existing line) +## also, replace NAMESPACE and prometheus-operator-grafana with the service account namespace and name +## +## "oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:sub": "system:serviceaccount:NAMESPACE:prometheus-operator-grafana", +## +## 3. attach a policy to the role, you can use a built in policy called CloudWatchReadOnlyAccess +## 4. use the following env: (replace 123456789000 and iam-role-name-here with your aws account number and role name) +## +## env: +## AWS_ROLE_ARN: arn:aws:iam::123456789000:role/iam-role-name-here +## AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token +## AWS_REGION: us-east-1 +## +## 5. uncomment the EKS section in extraSecretMounts: below +## 6. uncomment the annotation section in the serviceAccount: above +## make sure to replace arn:aws:iam::123456789000:role/iam-role-name-here with your role arn + +env: {} + +## "valueFrom" environment variable references that will be added to deployment pods. Name is templated. +## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#envvarsource-v1-core +## Renders in container spec as: +## env: +## ... +## - name: +## valueFrom: +## +envValueFrom: {} + # ENV_NAME: + # configMapKeyRef: + # name: configmap-name + # key: value_key + +## The name of a secret in the same kubernetes namespace which contain values to be added to the environment +## This can be useful for auth tokens, etc. Value is templated. +envFromSecret: "" + +## Sensible environment variables that will be rendered as new secret object +## This can be useful for auth tokens, etc. +## If the secret values contains "{{", they'll need to be properly escaped so that they are not interpreted by Helm +## ref: https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function +envRenderSecret: {} + +## The names of secrets in the same kubernetes namespace which contain values to be added to the environment +## Each entry should contain a name key, and can optionally specify whether the secret must be defined with an optional key. +## Name is templated. +envFromSecrets: [] +## - name: secret-name +## prefix: prefix +## optional: true + +## The names of conifgmaps in the same kubernetes namespace which contain values to be added to the environment +## Each entry should contain a name key, and can optionally specify whether the configmap must be defined with an optional key. +## Name is templated. +## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#configmapenvsource-v1-core +envFromConfigMaps: [] +## - name: configmap-name +## prefix: prefix +## optional: true + +# Inject Kubernetes services as environment variables. +# See https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#environment-variables +enableServiceLinks: true + +## Additional grafana server secret mounts +# Defines additional mounts with secrets. Secrets must be manually created in the namespace. +extraSecretMounts: [] + # - name: secret-files + # mountPath: /etc/secrets + # secretName: grafana-secret-files + # readOnly: true + # subPath: "" + # + # for AWS EKS (cloudwatch) use the following (see also instruction in env: above) + # - name: aws-iam-token + # mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount + # readOnly: true + # projected: + # defaultMode: 420 + # sources: + # - serviceAccountToken: + # audience: sts.amazonaws.com + # expirationSeconds: 86400 + # path: token + # + # for CSI e.g. Azure Key Vault use the following + # - name: secrets-store-inline + # mountPath: /run/secrets + # readOnly: true + # csi: + # driver: secrets-store.csi.k8s.io + # readOnly: true + # volumeAttributes: + # secretProviderClass: "akv-grafana-spc" + # nodePublishSecretRef: # Only required when using service principal mode + # name: grafana-akv-creds # Only required when using service principal mode + +## Additional grafana server volume mounts +# Defines additional volume mounts. +extraVolumeMounts: [] + # - name: extra-volume-0 + # mountPath: /mnt/volume0 + # readOnly: true + # - name: extra-volume-1 + # mountPath: /mnt/volume1 + # readOnly: true + # - name: grafana-secrets + # mountPath: /mnt/volume2 + +## Additional Grafana server volumes +extraVolumes: [] + # - name: extra-volume-0 + # existingClaim: volume-claim + # - name: extra-volume-1 + # hostPath: + # path: /usr/shared/ + # type: "" + # - name: grafana-secrets + # csi: + # driver: secrets-store.csi.k8s.io + # readOnly: true + # volumeAttributes: + # secretProviderClass: "grafana-env-spc" + +## Container Lifecycle Hooks. Execute a specific bash command or make an HTTP request +lifecycleHooks: {} + # postStart: + # exec: + # command: [] + +## Pass the plugins you want installed as a list. +## +plugins: [] + # - digrich-bubblechart-panel + # - grafana-clock-panel + ## You can also use other plugin download URL, as long as they are valid zip files, + ## and specify the name of the plugin after the semicolon. Like this: + # - https://grafana.com/api/plugins/marcusolsson-json-datasource/versions/1.3.2/download;marcusolsson-json-datasource + +## Configure grafana datasources +## ref: http://docs.grafana.org/administration/provisioning/#datasources +## +datasources: {} +# datasources.yaml: +# apiVersion: 1 +# datasources: +# - name: Prometheus +# type: prometheus +# url: http://prometheus-prometheus-server +# access: proxy +# isDefault: true +# - name: CloudWatch +# type: cloudwatch +# access: proxy +# uid: cloudwatch +# editable: false +# jsonData: +# authType: default +# defaultRegion: us-east-1 +# deleteDatasources: [] +# - name: Prometheus + +## Configure grafana alerting (can be templated) +## ref: http://docs.grafana.org/administration/provisioning/#alerting +## +alerting: {} + # rules.yaml: + # apiVersion: 1 + # groups: + # - orgId: 1 + # name: '{{ .Chart.Name }}_my_rule_group' + # folder: my_first_folder + # interval: 60s + # rules: + # - uid: my_id_1 + # title: my_first_rule + # condition: A + # data: + # - refId: A + # datasourceUid: '-100' + # model: + # conditions: + # - evaluator: + # params: + # - 3 + # type: gt + # operator: + # type: and + # query: + # params: + # - A + # reducer: + # type: last + # type: query + # datasource: + # type: __expr__ + # uid: '-100' + # expression: 1==0 + # intervalMs: 1000 + # maxDataPoints: 43200 + # refId: A + # type: math + # dashboardUid: my_dashboard + # panelId: 123 + # noDataState: Alerting + # for: 60s + # annotations: + # some_key: some_value + # labels: + # team: sre_team_1 + # contactpoints.yaml: + # secret: + # apiVersion: 1 + # contactPoints: + # - orgId: 1 + # name: cp_1 + # receivers: + # - uid: first_uid + # type: pagerduty + # settings: + # integrationKey: XXX + # severity: critical + # class: ping failure + # component: Grafana + # group: app-stack + # summary: | + # {{ `{{ include "default.message" . }}` }} + +## Configure notifiers +## ref: http://docs.grafana.org/administration/provisioning/#alert-notification-channels +## +notifiers: {} +# notifiers.yaml: +# notifiers: +# - name: email-notifier +# type: email +# uid: email1 +# # either: +# org_id: 1 +# # or +# org_name: Main Org. +# is_default: true +# settings: +# addresses: an_email_address@example.com +# delete_notifiers: + +## Configure grafana dashboard providers +## ref: http://docs.grafana.org/administration/provisioning/#dashboards +## +## `path` must be /var/lib/grafana/dashboards/ +## +dashboardProviders: {} +# dashboardproviders.yaml: +# apiVersion: 1 +# providers: +# - name: 'default' +# orgId: 1 +# folder: '' +# type: file +# disableDeletion: false +# editable: true +# options: +# path: /var/lib/grafana/dashboards/default + +## Configure grafana dashboard to import +## NOTE: To use dashboards you must also enable/configure dashboardProviders +## ref: https://grafana.com/dashboards +## +## dashboards per provider, use provider name as key. +## +dashboards: {} + # default: + # some-dashboard: + # json: | + # $RAW_JSON + # custom-dashboard: + # file: dashboards/custom-dashboard.json + # prometheus-stats: + # gnetId: 2 + # revision: 2 + # datasource: Prometheus + # local-dashboard: + # url: https://example.com/repository/test.json + # token: '' + # local-dashboard-base64: + # url: https://example.com/repository/test-b64.json + # token: '' + # b64content: true + # local-dashboard-gitlab: + # url: https://example.com/repository/test-gitlab.json + # gitlabToken: '' + # local-dashboard-bitbucket: + # url: https://example.com/repository/test-bitbucket.json + # bearerToken: '' + # local-dashboard-azure: + # url: https://example.com/repository/test-azure.json + # basic: '' + # acceptHeader: '*/*' + +## Reference to external ConfigMap per provider. Use provider name as key and ConfigMap name as value. +## A provider dashboards must be defined either by external ConfigMaps or in values.yaml, not in both. +## ConfigMap data example: +## +## data: +## example-dashboard.json: | +## RAW_JSON +## +dashboardsConfigMaps: {} +# default: "" + +## Grafana's primary configuration +## NOTE: values in map will be converted to ini format +## ref: http://docs.grafana.org/installation/configuration/ +## +grafana.ini: + paths: + data: /var/lib/grafana/ + logs: /var/log/grafana + plugins: /var/lib/grafana/plugins + provisioning: /etc/grafana/provisioning + analytics: + check_for_updates: true + log: + mode: console + grafana_net: + url: https://grafana.net + server: + domain: "{{ if (and .Values.ingress.enabled .Values.ingress.hosts) }}{{ .Values.ingress.hosts | first }}{{ else }}''{{ end }}" +## grafana Authentication can be enabled with the following values on grafana.ini + # server: + # The full public facing url you use in browser, used for redirects and emails + # root_url: + # https://grafana.com/docs/grafana/latest/auth/github/#enable-github-in-grafana + # auth.github: + # enabled: false + # allow_sign_up: false + # scopes: user:email,read:org + # auth_url: https://github.com/login/oauth/authorize + # token_url: https://github.com/login/oauth/access_token + # api_url: https://api.github.com/user + # team_ids: + # allowed_organizations: + # client_id: + # client_secret: +## LDAP Authentication can be enabled with the following values on grafana.ini +## NOTE: Grafana will fail to start if the value for ldap.toml is invalid + # auth.ldap: + # enabled: true + # allow_sign_up: true + # config_file: /etc/grafana/ldap.toml + +## Grafana's LDAP configuration +## Templated by the template in _helpers.tpl +## NOTE: To enable the grafana.ini must be configured with auth.ldap.enabled +## ref: http://docs.grafana.org/installation/configuration/#auth-ldap +## ref: http://docs.grafana.org/installation/ldap/#configuration +ldap: + enabled: false + # `existingSecret` is a reference to an existing secret containing the ldap configuration + # for Grafana in a key `ldap-toml`. + existingSecret: "" + # `config` is the content of `ldap.toml` that will be stored in the created secret + config: "" + # config: |- + # verbose_logging = true + + # [[servers]] + # host = "my-ldap-server" + # port = 636 + # use_ssl = true + # start_tls = false + # ssl_skip_verify = false + # bind_dn = "uid=%s,ou=users,dc=myorg,dc=com" + +## Grafana's SMTP configuration +## NOTE: To enable, grafana.ini must be configured with smtp.enabled +## ref: http://docs.grafana.org/installation/configuration/#smtp +smtp: + # `existingSecret` is a reference to an existing secret containing the smtp configuration + # for Grafana. + existingSecret: "" + userKey: "user" + passwordKey: "password" + +## Sidecars that collect the configmaps with specified label and stores the included files them into the respective folders +## Requires at least Grafana 5 to work and can't be used together with parameters dashboardProviders, datasources and dashboards +sidecar: + image: + repository: rancher/mirrored-kiwigrid-k8s-sidecar + tag: 1.26.1 + sha: "" + imagePullPolicy: IfNotPresent + resources: {} +# limits: +# cpu: 100m +# memory: 100Mi +# requests: +# cpu: 50m +# memory: 50Mi + securityContext: {} + # skipTlsVerify Set to true to skip tls verification for kube api calls + # skipTlsVerify: true + enableUniqueFilenames: false + readinessProbe: {} + livenessProbe: {} + # Log level default for all sidecars. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. Defaults to INFO + # logLevel: INFO + alerts: + enabled: false + # Additional environment variables for the alerts sidecar + env: {} + # Do not reprocess already processed unchanged resources on k8s API reconnect. + # ignoreAlreadyProcessed: true + # label that the configmaps with alert are marked with + label: grafana_alert + # value of label that the configmaps with alert are set to + labelValue: "" + # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. + # logLevel: INFO + # If specified, the sidecar will search for alert config-maps inside this namespace. + # Otherwise the namespace in which the sidecar is running will be used. + # It's also possible to specify ALL to search in all namespaces + searchNamespace: null + # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. + watchMethod: WATCH + # search in configmap, secret or both + resource: both + # watchServerTimeout: request to the server, asking it to cleanly close the connection after that. + # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S + # watchServerTimeout: 3600 + # + # watchClientTimeout: is a client-side timeout, configuring your local socket. + # If you have a network outage dropping all packets with no RST/FIN, + # this is how long your client waits before realizing & dropping the connection. + # defaults to 66sec (sic!) + # watchClientTimeout: 60 + # + # Endpoint to send request to reload alerts + reloadURL: "http://localhost:3000/api/admin/provisioning/alerting/reload" + # Absolute path to shell script to execute after a alert got reloaded + script: null + skipReload: false + # This is needed if skipReload is true, to load any alerts defined at startup time. + # Deploy the alert sidecar as an initContainer. + initAlerts: false + # Additional alert sidecar volume mounts + extraMounts: [] + # Sets the size limit of the alert sidecar emptyDir volume + sizeLimit: {} + dashboards: + enabled: false + # Additional environment variables for the dashboards sidecar + env: {} + # Do not reprocess already processed unchanged resources on k8s API reconnect. + # ignoreAlreadyProcessed: true + SCProvider: true + # label that the configmaps with dashboards are marked with + label: grafana_dashboard + # value of label that the configmaps with dashboards are set to + labelValue: "" + # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. + # logLevel: INFO + # folder in the pod that should hold the collected dashboards (unless `defaultFolderName` is set) + folder: /tmp/dashboards + # The default folder name, it will create a subfolder under the `folder` and put dashboards in there instead + defaultFolderName: null + # Namespaces list. If specified, the sidecar will search for config-maps/secrets inside these namespaces. + # Otherwise the namespace in which the sidecar is running will be used. + # It's also possible to specify ALL to search in all namespaces. + searchNamespace: null + # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. + watchMethod: WATCH + # search in configmap, secret or both + resource: both + # If specified, the sidecar will look for annotation with this name to create folder and put graph here. + # You can use this parameter together with `provider.foldersFromFilesStructure`to annotate configmaps and create folder structure. + folderAnnotation: null + # Endpoint to send request to reload alerts + reloadURL: "http://localhost:3000/api/admin/provisioning/dashboards/reload" + # Absolute path to shell script to execute after a configmap got reloaded + script: null + skipReload: false + # watchServerTimeout: request to the server, asking it to cleanly close the connection after that. + # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S + # watchServerTimeout: 3600 + # + # watchClientTimeout: is a client-side timeout, configuring your local socket. + # If you have a network outage dropping all packets with no RST/FIN, + # this is how long your client waits before realizing & dropping the connection. + # defaults to 66sec (sic!) + # watchClientTimeout: 60 + # + # provider configuration that lets grafana manage the dashboards + provider: + # name of the provider, should be unique + name: sidecarProvider + # orgid as configured in grafana + orgid: 1 + # folder in which the dashboards should be imported in grafana + folder: '' + # type of the provider + type: file + # disableDelete to activate a import-only behaviour + disableDelete: false + # allow updating provisioned dashboards from the UI + allowUiUpdates: false + # allow Grafana to replicate dashboard structure from filesystem + foldersFromFilesStructure: false + # Additional dashboard sidecar volume mounts + extraMounts: [] + # Sets the size limit of the dashboard sidecar emptyDir volume + sizeLimit: {} + datasources: + enabled: false + # Additional environment variables for the datasourcessidecar + env: {} + envValueFrom: {} + # Do not reprocess already processed unchanged resources on k8s API reconnect. + # ignoreAlreadyProcessed: true + # label that the configmaps with datasources are marked with + label: grafana_datasource + # value of label that the configmaps with datasources are set to + labelValue: "" + # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. + # logLevel: INFO + # If specified, the sidecar will search for datasource config-maps inside this namespace. + # Otherwise the namespace in which the sidecar is running will be used. + # It's also possible to specify ALL to search in all namespaces + searchNamespace: null + # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. + watchMethod: WATCH + # search in configmap, secret or both + resource: both + # watchServerTimeout: request to the server, asking it to cleanly close the connection after that. + # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S + # watchServerTimeout: 3600 + # + # watchClientTimeout: is a client-side timeout, configuring your local socket. + # If you have a network outage dropping all packets with no RST/FIN, + # this is how long your client waits before realizing & dropping the connection. + # defaults to 66sec (sic!) + # watchClientTimeout: 60 + # + # Endpoint to send request to reload datasources + reloadURL: "http://localhost:3000/api/admin/provisioning/datasources/reload" + # Absolute path to shell script to execute after a datasource got reloaded + script: null + skipReload: true + # This is needed if skipReload is true, to load any datasources defined at startup time. + # Deploy the datasources sidecar as an initContainer. + initDatasources: true + # Sets the size limit of the datasource sidecar emptyDir volume + sizeLimit: {} + plugins: + enabled: false + # Additional environment variables for the plugins sidecar + env: {} + # Do not reprocess already processed unchanged resources on k8s API reconnect. + # ignoreAlreadyProcessed: true + # label that the configmaps with plugins are marked with + label: grafana_plugin + # value of label that the configmaps with plugins are set to + labelValue: "" + # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. + # logLevel: INFO + # If specified, the sidecar will search for plugin config-maps inside this namespace. + # Otherwise the namespace in which the sidecar is running will be used. + # It's also possible to specify ALL to search in all namespaces + searchNamespace: null + # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. + watchMethod: WATCH + # search in configmap, secret or both + resource: both + # watchServerTimeout: request to the server, asking it to cleanly close the connection after that. + # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S + # watchServerTimeout: 3600 + # + # watchClientTimeout: is a client-side timeout, configuring your local socket. + # If you have a network outage dropping all packets with no RST/FIN, + # this is how long your client waits before realizing & dropping the connection. + # defaults to 66sec (sic!) + # watchClientTimeout: 60 + # + # Endpoint to send request to reload plugins + reloadURL: "http://localhost:3000/api/admin/provisioning/plugins/reload" + # Absolute path to shell script to execute after a plugin got reloaded + script: null + skipReload: false + # Deploy the datasource sidecar as an initContainer in addition to a container. + # This is needed if skipReload is true, to load any plugins defined at startup time. + initPlugins: false + # Sets the size limit of the plugin sidecar emptyDir volume + sizeLimit: {} + notifiers: + enabled: false + # Additional environment variables for the notifierssidecar + env: {} + # Do not reprocess already processed unchanged resources on k8s API reconnect. + # ignoreAlreadyProcessed: true + # label that the configmaps with notifiers are marked with + label: grafana_notifier + # value of label that the configmaps with notifiers are set to + labelValue: "" + # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. + # logLevel: INFO + # If specified, the sidecar will search for notifier config-maps inside this namespace. + # Otherwise the namespace in which the sidecar is running will be used. + # It's also possible to specify ALL to search in all namespaces + searchNamespace: null + # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds. + watchMethod: WATCH + # search in configmap, secret or both + resource: both + # watchServerTimeout: request to the server, asking it to cleanly close the connection after that. + # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S + # watchServerTimeout: 3600 + # + # watchClientTimeout: is a client-side timeout, configuring your local socket. + # If you have a network outage dropping all packets with no RST/FIN, + # this is how long your client waits before realizing & dropping the connection. + # defaults to 66sec (sic!) + # watchClientTimeout: 60 + # + # Endpoint to send request to reload notifiers + reloadURL: "http://localhost:3000/api/admin/provisioning/notifications/reload" + # Absolute path to shell script to execute after a notifier got reloaded + script: null + skipReload: false + # Deploy the notifier sidecar as an initContainer in addition to a container. + # This is needed if skipReload is true, to load any notifiers defined at startup time. + initNotifiers: false + # Sets the size limit of the notifier sidecar emptyDir volume + sizeLimit: {} + +## Override the deployment namespace +## +namespaceOverride: "" + +## Number of old ReplicaSets to retain +## +revisionHistoryLimit: 10 + +## Add a seperate remote image renderer deployment/service +imageRenderer: + deploymentStrategy: {} + # Enable the image-renderer deployment & service + enabled: false + replicas: 1 + autoscaling: + enabled: false + minReplicas: 1 + maxReplicas: 5 + targetCPU: "60" + targetMemory: "" + behavior: {} + image: + # image-renderer Image repository + repository: rancher/mirrored-grafana-grafana-image-renderer + # image-renderer Image tag + tag: 3.10.1 + # image-renderer Image sha (optional) + sha: "" + # image-renderer ImagePullPolicy + pullPolicy: Always + # extra environment variables + env: + HTTP_HOST: "0.0.0.0" + # RENDERING_ARGS: --no-sandbox,--disable-gpu,--window-size=1280x758 + # RENDERING_MODE: clustered + # IGNORE_HTTPS_ERRORS: true + + ## "valueFrom" environment variable references that will be added to deployment pods. Name is templated. + ## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#envvarsource-v1-core + ## Renders in container spec as: + ## env: + ## ... + ## - name: + ## valueFrom: + ## + envValueFrom: {} + # ENV_NAME: + # configMapKeyRef: + # name: configmap-name + # key: value_key + + # image-renderer deployment serviceAccount + serviceAccountName: "" + # image-renderer deployment securityContext + securityContext: {} + # image-renderer deployment container securityContext + containerSecurityContext: + seccompProfile: + type: RuntimeDefault + capabilities: + drop: ['ALL'] + allowPrivilegeEscalation: false + readOnlyRootFilesystem: true + ## image-renderer pod annotation + podAnnotations: {} + # image-renderer deployment Host Aliases + hostAliases: [] + # image-renderer deployment priority class + priorityClassName: '' + service: + # Enable the image-renderer service + enabled: true + # image-renderer service port name + portName: 'http' + # image-renderer service port used by both service and deployment + port: 8081 + targetPort: 8081 + # Adds the appProtocol field to the image-renderer service. This allows to work with istio protocol selection. Ex: "http" or "tcp" + appProtocol: "" + serviceMonitor: + ## If true, a ServiceMonitor CRD is created for a prometheus operator + ## https://github.com/coreos/prometheus-operator + ## + enabled: false + path: /metrics + # namespace: monitoring (defaults to use the namespace this chart is deployed to) + labels: {} + interval: 1m + scheme: http + tlsConfig: {} + scrapeTimeout: 30s + relabelings: [] + # See: https://doc.crds.dev/github.com/prometheus-operator/kube-prometheus/monitoring.coreos.com/ServiceMonitor/v1@v0.11.0#spec-targetLabels + targetLabels: [] + # - targetLabel1 + # - targetLabel2 + # If https is enabled in Grafana, this needs to be set as 'https' to correctly configure the callback used in Grafana + grafanaProtocol: http + # In case a sub_path is used this needs to be added to the image renderer callback + grafanaSubPath: "" + # name of the image-renderer port on the pod + podPortName: http + # number of image-renderer replica sets to keep + revisionHistoryLimit: 10 + networkPolicy: + # Enable a NetworkPolicy to limit inbound traffic to only the created grafana pods + limitIngress: true + # Enable a NetworkPolicy to limit outbound traffic to only the created grafana pods + limitEgress: false + # Allow additional services to access image-renderer (eg. Prometheus operator when ServiceMonitor is enabled) + extraIngressSelectors: [] + resources: {} +# limits: +# cpu: 100m +# memory: 100Mi +# requests: +# cpu: 50m +# memory: 50Mi + ## Node labels for pod assignment + ## ref: https://kubernetes.io/docs/user-guide/node-selection/ + # + nodeSelector: {} + + ## Tolerations for pod assignment + ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ + ## + tolerations: [] + + ## Affinity for pod assignment (evaluated as template) + ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity + ## + affinity: {} + + ## Use an alternate scheduler, e.g. "stork". + ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/ + ## + # schedulerName: "default-scheduler" + +networkPolicy: + ## @param networkPolicy.enabled Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now. + ## + enabled: false + ## @param networkPolicy.allowExternal Don't require client label for connections + ## The Policy model to apply. When set to false, only pods with the correct + ## client label will have network access to grafana port defined. + ## When true, grafana will accept connections from any source + ## (with the correct destination port). + ## + ingress: true + ## @param networkPolicy.ingress When true enables the creation + ## an ingress network policy + ## + allowExternal: true + ## @param networkPolicy.explicitNamespacesSelector A Kubernetes LabelSelector to explicitly select namespaces from which traffic could be allowed + ## If explicitNamespacesSelector is missing or set to {}, only client Pods that are in the networkPolicy's namespace + ## and that match other criteria, the ones that have the good label, can reach the grafana. + ## But sometimes, we want the grafana to be accessible to clients from other namespaces, in this case, we can use this + ## LabelSelector to select these namespaces, note that the networkPolicy's namespace should also be explicitly added. + ## + ## Example: + ## explicitNamespacesSelector: + ## matchLabels: + ## role: frontend + ## matchExpressions: + ## - {key: role, operator: In, values: [frontend]} + ## + explicitNamespacesSelector: {} + ## + ## + ## + ## + ## + ## + egress: + ## @param networkPolicy.egress.enabled When enabled, an egress network policy will be + ## created allowing grafana to connect to external data sources from kubernetes cluster. + enabled: false + ## + ## @param networkPolicy.egress.blockDNSResolution When enabled, DNS resolution will be blocked + ## for all pods in the grafana namespace. + blockDNSResolution: false + ## + ## @param networkPolicy.egress.ports Add individual ports to be allowed by the egress + ports: [] + ## Add ports to the egress by specifying - port: + ## E.X. + ## - port: 80 + ## - port: 443 + ## + ## @param networkPolicy.egress.to Allow egress traffic to specific destinations + to: [] + ## Add destinations to the egress by specifying - ipBlock: + ## E.X. + ## to: + ## - namespaceSelector: + ## matchExpressions: + ## - {key: role, operator: In, values: [grafana]} + ## + ## + ## + ## + ## + +# Enable backward compatibility of kubernetes where version below 1.13 doesn't have the enableServiceLinks option +enableKubeBackwardCompatibility: false +useStatefulSet: false +# Create a dynamic manifests via values: +extraObjects: [] + # - apiVersion: "kubernetes-client.io/v1" + # kind: ExternalSecret + # metadata: + # name: grafana-secrets + # spec: + # backendType: gcpSecretsManager + # data: + # - key: grafana-admin-password + # name: adminPassword + +# assertNoLeakedSecrets is a helper function defined in _helpers.tpl that checks if secret +# values are not exposed in the rendered grafana.ini configmap. It is enabled by default. +# +# To pass values into grafana.ini without exposing them in a configmap, use variable expansion: +# https://grafana.com/docs/grafana/latest/setup-grafana/configure-grafana/#variable-expansion +# +# Alternatively, if you wish to allow secret values to be exposed in the rendered grafana.ini configmap, +# you can disable this check by setting assertNoLeakedSecrets to false. +assertNoLeakedSecrets: true diff --git a/charts/rancher-project-monitoring/0.4.2/files/federate/federate-scrape-config.yaml b/charts/rancher-project-monitoring/0.4.2/files/federate/federate-scrape-config.yaml new file mode 100644 index 00000000..b5ceb698 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/federate/federate-scrape-config.yaml @@ -0,0 +1,13 @@ +- job_name: 'federate' + scrape_interval: {{ .Values.federate.interval }} + honor_labels: true + metrics_path: '/federate' + + params: + 'match[]': + - '{namespace=~"{{ include "project-prometheus-stack.projectNamespaceList" . | replace "," "|" }}", job=~"kube-state-metrics|kubelet|k3s-server"}' + - '{namespace=~"{{ include "project-prometheus-stack.projectNamespaceList" . | replace "," "|" }}", job=""}' + + static_configs: + - targets: {{ .Values.federate.targets | toYaml | nindent 6 }} + diff --git a/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/nginx.json b/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/nginx.json new file mode 100644 index 00000000..56535223 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/nginx.json @@ -0,0 +1,1445 @@ +{ + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + }, + { + "datasource": "$datasource", + "enable": true, + "expr": "sum(changes(nginx_ingress_controller_config_last_reload_successful_timestamp_seconds{instance!=\"unknown\",controller_class=~\"$controller_class\",namespace=~\"$namespace\"}[30s])) by (controller_class)", + "hide": false, + "iconColor": "rgba(255, 96, 96, 1)", + "limit": 100, + "name": "Config Reloads", + "showIn": 0, + "step": "30s", + "tagKeys": "controller_class", + "tags": [], + "titleFormat": "Config Reloaded", + "type": "tags" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "iteration": 1534359654832, + "links": [], + "panels": [ + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "datasource": "$datasource", + "format": "ops", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 6, + "x": 0, + "y": 0 + }, + "id": 20, + "interval": null, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": true, + "lineColor": "rgb(31, 120, 193)", + "show": true + }, + "tableColumn": "", + "targets": [ + { + "expr": "round(sum(irate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\"}[2m])), 0.001)", + "format": "time_series", + "intervalFactor": 1, + "refId": "A", + "step": 4 + } + ], + "thresholds": "", + "title": "Controller Request Volume", + "transparent": false, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "avg" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "datasource": "$datasource", + "format": "none", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 6, + "x": 6, + "y": 0 + }, + "id": 82, + "interval": null, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": true, + "lineColor": "rgb(31, 120, 193)", + "show": true + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(avg_over_time(nginx_ingress_controller_nginx_process_connections{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",state=\"active\"}[2m]))", + "format": "time_series", + "instant": false, + "intervalFactor": 1, + "refId": "A", + "step": 4 + } + ], + "thresholds": "", + "title": "Controller Connections", + "transparent": false, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "avg" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "datasource": "$datasource", + "format": "percentunit", + "gauge": { + "maxValue": 100, + "minValue": 80, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": false + }, + "gridPos": { + "h": 3, + "w": 6, + "x": 12, + "y": 0 + }, + "id": 21, + "interval": null, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": true, + "lineColor": "rgb(31, 120, 193)", + "show": true + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",status!~\"[4-5].*\"}[2m])) / sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\"}[2m]))", + "format": "time_series", + "intervalFactor": 1, + "refId": "A", + "step": 4 + } + ], + "thresholds": "95, 99, 99.5", + "title": "Controller Success Rate (non-4|5xx responses)", + "transparent": false, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "avg" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "none", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 3, + "x": 18, + "y": 0 + }, + "id": 81, + "interval": null, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": true, + "lineColor": "rgb(31, 120, 193)", + "show": true + }, + "tableColumn": "", + "targets": [ + { + "expr": "avg(irate(nginx_ingress_controller_success{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[1m])) * 60", + "format": "time_series", + "instant": false, + "intervalFactor": 1, + "refId": "A", + "step": 4 + } + ], + "thresholds": "", + "title": "Config Reloads", + "transparent": false, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "total" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "none", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 3, + "x": 21, + "y": 0 + }, + "id": 83, + "interval": null, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": true, + "lineColor": "rgb(31, 120, 193)", + "show": true + }, + "tableColumn": "", + "targets": [ + { + "expr": "count(nginx_ingress_controller_config_last_reload_successful{controller_pod=~\"$controller\",controller_namespace=~\"$namespace\"} == 0)", + "format": "time_series", + "instant": true, + "intervalFactor": 1, + "refId": "A", + "step": 4 + } + ], + "thresholds": "", + "title": "Last Config Failed", + "transparent": false, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "avg" + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 2, + "editable": true, + "error": false, + "fill": 1, + "grid": {}, + "gridPos": { + "h": 7, + "w": 12, + "x": 0, + "y": 3 + }, + "height": "200px", + "id": 86, + "isNew": true, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": false, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": 300, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "repeatDirection": "h", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "round(sum(irate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress), 0.001)", + "format": "time_series", + "hide": false, + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "metric": "network", + "refId": "A", + "step": 10 + } + ], + "thresholds": [], + "timeFrom": null, + "timeShift": null, + "title": "Ingress Request Volume", + "tooltip": { + "msResolution": false, + "shared": true, + "sort": 2, + "value_type": "cumulative" + }, + "transparent": false, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "reqps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "max - istio-proxy": "#890f02", + "max - master": "#bf1b00", + "max - prometheus": "#bf1b00" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 2, + "editable": false, + "error": false, + "fill": 0, + "grid": {}, + "gridPos": { + "h": 7, + "w": 12, + "x": 12, + "y": 3 + }, + "id": 87, + "isNew": true, + "legend": { + "alignAsTable": true, + "avg": true, + "current": false, + "hideEmpty": true, + "hideZero": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": 300, + "sort": "avg", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",ingress=~\"$ingress\",status!~\"[4-5].*\"}[2m])) by (ingress) / sum(rate(nginx_ingress_controller_requests{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress)", + "format": "time_series", + "instant": false, + "interval": "10s", + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "metric": "container_memory_usage:sort_desc", + "refId": "A", + "step": 10 + } + ], + "thresholds": [], + "timeFrom": null, + "timeShift": null, + "title": "Ingress Success Rate (non-4|5xx responses)", + "tooltip": { + "msResolution": false, + "shared": true, + "sort": 1, + "value_type": "cumulative" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "percentunit", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 2, + "editable": true, + "error": false, + "fill": 1, + "grid": {}, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 10 + }, + "height": "200px", + "id": 32, + "isNew": true, + "legend": { + "alignAsTable": false, + "avg": true, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": 200, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum (irate (nginx_ingress_controller_request_size_sum{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", + "format": "time_series", + "instant": false, + "interval": "10s", + "intervalFactor": 1, + "legendFormat": "Received", + "metric": "network", + "refId": "A", + "step": 10 + }, + { + "expr": "- sum (irate (nginx_ingress_controller_response_size_sum{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m]))", + "format": "time_series", + "hide": false, + "interval": "10s", + "intervalFactor": 1, + "legendFormat": "Sent", + "metric": "network", + "refId": "B", + "step": 10 + } + ], + "thresholds": [], + "timeFrom": null, + "timeShift": null, + "title": "Network I/O pressure", + "tooltip": { + "msResolution": false, + "shared": true, + "sort": 0, + "value_type": "cumulative" + }, + "transparent": false, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "max - istio-proxy": "#890f02", + "max - master": "#bf1b00", + "max - prometheus": "#bf1b00" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 2, + "editable": false, + "error": false, + "fill": 0, + "grid": {}, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 10 + }, + "id": 77, + "isNew": true, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": 200, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "avg(nginx_ingress_controller_nginx_process_resident_memory_bytes{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}) ", + "format": "time_series", + "instant": false, + "interval": "10s", + "intervalFactor": 1, + "legendFormat": "nginx", + "metric": "container_memory_usage:sort_desc", + "refId": "A", + "step": 10 + } + ], + "thresholds": [], + "timeFrom": null, + "timeShift": null, + "title": "Average Memory Usage", + "tooltip": { + "msResolution": false, + "shared": true, + "sort": 2, + "value_type": "cumulative" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "max - istio-proxy": "#890f02", + "max - master": "#bf1b00" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 3, + "editable": false, + "error": false, + "fill": 0, + "grid": {}, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 10 + }, + "height": "", + "id": 79, + "isNew": true, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sort": null, + "sortDesc": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "avg (rate (nginx_ingress_controller_nginx_process_cpu_seconds_total{controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\"}[2m])) ", + "format": "time_series", + "interval": "10s", + "intervalFactor": 1, + "legendFormat": "nginx", + "metric": "container_cpu", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + { + "colorMode": "critical", + "fill": true, + "line": true, + "op": "gt" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Average CPU Usage", + "tooltip": { + "msResolution": true, + "shared": true, + "sort": 2, + "value_type": "cumulative" + }, + "transparent": false, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "none", + "label": "cores", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "columns": [], + "datasource": "$datasource", + "fontSize": "100%", + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 16 + }, + "hideTimeOverride": false, + "id": 75, + "links": [], + "pageSize": 7, + "repeat": null, + "repeatDirection": "h", + "scroll": true, + "showHeader": true, + "sort": { + "col": 1, + "desc": true + }, + "styles": [ + { + "alias": "Ingress", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "ingress", + "preserveFormat": false, + "sanitize": false, + "thresholds": [], + "type": "string", + "unit": "short" + }, + { + "alias": "Requests", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "Value #A", + "thresholds": [ + "" + ], + "type": "number", + "unit": "ops" + }, + { + "alias": "Errors", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "Value #B", + "thresholds": [], + "type": "number", + "unit": "ops" + }, + { + "alias": "P50 Latency", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": false, + "pattern": "Value #C", + "thresholds": [], + "type": "number", + "unit": "dtdurations" + }, + { + "alias": "P90 Latency", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "pattern": "Value #D", + "thresholds": [], + "type": "number", + "unit": "dtdurations" + }, + { + "alias": "P99 Latency", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "pattern": "Value #E", + "thresholds": [], + "type": "number", + "unit": "dtdurations" + }, + { + "alias": "IN", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "Value #F", + "thresholds": [ + "" + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "Time", + "thresholds": [], + "type": "hidden", + "unit": "short" + }, + { + "alias": "OUT", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "mappingType": 1, + "pattern": "Value #G", + "thresholds": [], + "type": "number", + "unit": "Bps" + } + ], + "targets": [ + { + "expr": "histogram_quantile(0.50, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", + "format": "table", + "hide": false, + "instant": true, + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.90, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", + "format": "table", + "hide": false, + "instant": true, + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "refId": "D" + }, + { + "expr": "histogram_quantile(0.99, sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (le, ingress))", + "format": "table", + "hide": false, + "instant": true, + "intervalFactor": 1, + "legendFormat": "{{ destination_service }}", + "refId": "E" + }, + { + "expr": "sum(irate(nginx_ingress_controller_request_size_sum{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress)", + "format": "table", + "hide": false, + "instant": true, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "refId": "F" + }, + { + "expr": "sum(irate(nginx_ingress_controller_response_size_sum{ingress!=\"\",controller_pod=~\"$controller\",controller_class=~\"$controller_class\",controller_namespace=~\"$namespace\",ingress=~\"$ingress\"}[2m])) by (ingress)", + "format": "table", + "instant": true, + "intervalFactor": 1, + "legendFormat": "{{ ingress }}", + "refId": "G" + } + ], + "timeFrom": null, + "title": "Ingress Percentile Response Times and Transfer Rates", + "transform": "table", + "transparent": false, + "type": "table" + }, + { + "columns": [ + { + "text": "Current", + "value": "current" + } + ], + "datasource": "$datasource", + "fontSize": "100%", + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 24 + }, + "height": "1024", + "id": 85, + "links": [], + "pageSize": 7, + "scroll": true, + "showHeader": true, + "sort": { + "col": 1, + "desc": false + }, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "date" + }, + { + "alias": "TTL", + "colorMode": "cell", + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "pattern": "Current", + "thresholds": [ + "0", + "691200" + ], + "type": "number", + "unit": "s" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + "rgba(245, 54, 54, 0.9)", + "rgba(237, 129, 40, 0.89)", + "rgba(50, 172, 45, 0.97)" + ], + "decimals": 2, + "pattern": "/.*/", + "thresholds": [], + "type": "number", + "unit": "short" + } + ], + "targets": [ + { + "expr": "avg(nginx_ingress_controller_ssl_expire_time_seconds{kubernetes_pod_name=~\"$controller\",namespace=~\"$namespace\",ingress=~\"$ingress\"}) by (host) - time()", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{ host }}", + "metric": "gke_letsencrypt_cert_expiration", + "refId": "A", + "step": 1 + } + ], + "title": "Ingress Certificate Expiry", + "transform": "timeseries_aggregations", + "type": "table" + } + ], + "refresh": "5s", + "schemaVersion": 16, + "style": "dark", + "tags": [ + "nginx" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".*", + "current": { + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "Namespace", + "multi": false, + "name": "namespace", + "options": [], + "query": "label_values(nginx_ingress_controller_config_hash, controller_namespace)", + "refresh": 1, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".*", + "current": { + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "Controller Class", + "multi": false, + "name": "controller_class", + "options": [], + "query": "label_values(nginx_ingress_controller_config_hash{namespace=~\"$namespace\"}, controller_class) ", + "refresh": 1, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".*", + "current": { + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "Controller", + "multi": false, + "name": "controller", + "options": [], + "query": "label_values(nginx_ingress_controller_config_hash{namespace=~\"$namespace\",controller_class=~\"$controller_class\"}, controller_pod) ", + "refresh": 1, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".*", + "current": { + "tags": [], + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "Ingress", + "multi": false, + "name": "ingress", + "options": [], + "query": "label_values(nginx_ingress_controller_requests{namespace=~\"$namespace\",controller_class=~\"$controller_class\",controller_pod=~\"$controller\"}, ingress) ", + "refresh": 1, + "regex": "", + "sort": 2, + "tagValuesQuery": "", + "tags": [], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "2m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "browser", + "title": "NGINX / Ingress Controller", + "uid": "nginx", + "version": 1 +} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/request-handling-performance.json b/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/request-handling-performance.json new file mode 100644 index 00000000..156e3312 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/ingress-nginx/request-handling-performance.json @@ -0,0 +1,963 @@ +{ + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "description": "", + "editable": true, + "gnetId": 9614, + "graphTooltip": 1, + "id": null, + "iteration": 1582146566338, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "Total time taken for nginx and upstream servers to process a request and send a response", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 91, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(\n 0.5,\n sum by (le)(\n rate(\n nginx_ingress_controller_request_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "legendFormat": ".5", + "refId": "D" + }, + { + "expr": "histogram_quantile(\n 0.95,\n sum by (le)(\n rate(\n nginx_ingress_controller_request_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "legendFormat": ".95", + "refId": "B" + }, + { + "expr": "histogram_quantile(\n 0.99,\n sum by (le)(\n rate(\n nginx_ingress_controller_request_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "legendFormat": ".99", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Total request handling time", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "The time spent on receiving the response from the upstream server", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 0 + }, + "hiddenSeries": false, + "id": 94, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(\n 0.5,\n sum by (le)(\n rate(\n nginx_ingress_controller_response_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": ".5", + "refId": "D" + }, + { + "expr": "histogram_quantile(\n 0.95,\n sum by (le)(\n rate(\n nginx_ingress_controller_response_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "legendFormat": ".95", + "refId": "B" + }, + { + "expr": "histogram_quantile(\n 0.99,\n sum by (le)(\n rate(\n nginx_ingress_controller_response_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "legendFormat": ".99", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Upstream response time", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 8 + }, + "hiddenSeries": false, + "id": 93, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": " sum by (path)(\n rate(\n nginx_ingress_controller_request_duration_seconds_count{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Request volume by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "reqps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "For each path observed, its median upstream response time", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 8 + }, + "hiddenSeries": false, + "id": 98, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(\n .5,\n sum by (le, path)(\n rate(\n nginx_ingress_controller_response_duration_seconds_bucket{\n ingress =~ \"$ingress\"\n }[1m]\n )\n )\n)", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Median upstream response time by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "Percentage of 4xx and 5xx responses among all responses.", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 16 + }, + "hiddenSeries": false, + "id": 100, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null as zero", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum by (path) (rate(nginx_ingress_controller_request_duration_seconds_count{\n ingress =~ \"$ingress\",\n status =~ \"[4-5].*\"\n}[1m])) / sum by (path) (rate(nginx_ingress_controller_request_duration_seconds_count{\n ingress =~ \"$ingress\",\n}[1m]))", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Response error rate by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "percentunit", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "For each path observed, the sum of upstream request time", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 16 + }, + "hiddenSeries": false, + "id": 102, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum by (path) (rate(nginx_ingress_controller_response_duration_seconds_sum{ingress =~ \"$ingress\"}[1m]))", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Upstream time consumed by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 24 + }, + "hiddenSeries": false, + "id": 101, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": " sum (\n rate(\n nginx_ingress_controller_request_duration_seconds_count{\n ingress =~ \"$ingress\",\n status =~\"[4-5].*\",\n }[1m]\n )\n ) by(path, status)\n", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }} {{ status }}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Response error volume by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "reqps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 24 + }, + "hiddenSeries": false, + "id": 99, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum (\n rate (\n nginx_ingress_controller_response_size_sum {\n ingress =~ \"$ingress\",\n }[1m]\n )\n) by (path) / sum (\n rate(\n nginx_ingress_controller_response_size_count {\n ingress =~ \"$ingress\",\n }[1m]\n )\n) by (path)\n", + "hide": false, + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ path }}", + "refId": "D" + }, + { + "expr": " sum (rate(nginx_ingress_controller_response_size_bucket{\n ingress =~ \"$ingress\",\n }[1m])) by (le)\n", + "hide": true, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Average response size by Path", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "decbytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 32 + }, + "hiddenSeries": false, + "id": 96, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "dataLinks": [] + }, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum (\n rate(\n nginx_ingress_controller_ingress_upstream_latency_seconds_sum {\n ingress =~ \"$ingress\",\n }[1m]\n)) / sum (\n rate(\n nginx_ingress_controller_ingress_upstream_latency_seconds_count {\n ingress =~ \"$ingress\",\n }[1m]\n )\n)\n", + "hide": false, + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "average", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Upstream service latency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": "30s", + "schemaVersion": 22, + "style": "dark", + "tags": [ + "nginx" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".*", + "current": {}, + "datasource": "$datasource", + "definition": "label_values(nginx_ingress_controller_requests, ingress) ", + "hide": 0, + "includeAll": true, + "label": "Service Ingress", + "multi": false, + "name": "ingress", + "options": [], + "query": "label_values(nginx_ingress_controller_requests, ingress) ", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "sort": 2, + "tagValuesQuery": "", + "tags": [], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-15m", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "2m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "browser", + "title": "NGINX / Request Handling Performance", + "uid": "4GFbkOsZk", + "version": 1 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster-nodes.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster-nodes.json new file mode 100644 index 00000000..1d494350 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster-nodes.json @@ -0,0 +1,793 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "links": [], + "panels": [ + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\",mode=\"idle\"}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "{{instance}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m] ({{instance}})" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_load5 OR avg_over_time(windows_system_processor_queue_length[5m])) by (instance)", + "interval": "", + "legendFormat": "Load[5m] ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(node_load1 OR avg_over_time(windows_system_processor_queue_length[1m])) by (instance)", + "interval": "", + "legendFormat": "Load[1m] ({{instance}})", + "refId": "B" + }, + { + "expr": "sum(node_load15 OR avg_over_time(windows_system_processor_queue_length[15m])) by (instance)", + "interval": "", + "legendFormat": "Load[15m] ({{instance}})", + "refId": "C" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Load Average", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - sum(node_memory_MemAvailable_bytes OR windows_os_physical_memory_free_bytes) by (instance) / sum(node_memory_MemTotal_bytes OR windows_cs_physical_memory_bytes) by (instance) ", + "interval": "", + "legendFormat": "{{instance}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"} OR windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) by (instance) / sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"} OR windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) by (instance))", + "interval": "", + "legendFormat": "{{instance}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_disk_read_bytes_total[$__rate_interval]) OR rate(windows_logical_disk_read_bytes_total[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Read ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_disk_written_bytes_total[$__rate_interval]) OR rate(windows_logical_disk_write_bytes_total[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Write ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_receive_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_received_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Receive Errors ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Receive Total ({{instance}})", + "refId": "B" + }, + { + "expr": "sum(rate(node_network_transmit_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_outbound_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Transmit Errors ({{instance}})", + "refId": "C" + }, + { + "expr": "sum(rate(node_network_receive_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_received_discarded_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Receive Dropped ({{instance}})", + "refId": "D" + }, + { + "expr": "sum(rate(node_network_transmit_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_outbound_discarded{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Transmit Dropped ({{instance}})", + "refId": "E" + }, + { + "expr": "sum(rate(node_network_transmit_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) by (instance) OR sum(rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Transmit Total ({{instance}})", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_transmit_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval]) OR rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Transmit Total ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval]) OR rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Receive Total ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Cluster (Nodes)", + "uid": "rancher-cluster-nodes-1", + "version": 3 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster.json new file mode 100644 index 00000000..24385a23 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/cluster/rancher-cluster.json @@ -0,0 +1,776 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\",mode=\"idle\"}[$__rate_interval]))", + "legendFormat": "Total", + "interval": "", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_load5 OR avg_over_time(windows_system_processor_queue_length[5m]))", + "interval": "", + "legendFormat": "Load[5m]", + "refId": "A" + }, + { + "expr": "sum(node_load1 OR avg_over_time(windows_system_processor_queue_length[1m]))", + "interval": "", + "legendFormat": "Load[1m]", + "refId": "B" + }, + { + "expr": "sum(node_load15 OR avg_over_time(windows_system_processor_queue_length[15m]))", + "interval": "", + "legendFormat": "Load[15m]", + "refId": "C" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Load Average", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - sum(node_memory_MemAvailable_bytes OR windows_os_physical_memory_free_bytes) / sum(node_memory_MemTotal_bytes OR windows_cs_physical_memory_bytes)", + "legendFormat": "Total", + "interval": "", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"} OR windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) / sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"} OR windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}))", + "legendFormat": "Total", + "interval": "", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_disk_read_bytes_total[$__rate_interval]) OR rate(windows_logical_disk_read_bytes_total[$__rate_interval]))", + "interval": "", + "legendFormat": "Read", + "refId": "A" + }, + { + "expr": "sum(rate(node_disk_written_bytes_total[$__rate_interval]) OR rate(windows_logical_disk_write_bytes_total[$__rate_interval]))", + "interval": "", + "legendFormat": "Write", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(rate(node_network_receive_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Errors", + "refId": "A" + }, + { + "expr": "(sum(rate(node_network_receive_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "B" + }, + { + "expr": "(sum(rate(node_network_transmit_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_outbound_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Errors", + "refId": "C" + }, + { + "expr": "(sum(rate(node_network_receive_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_discarded_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Dropped", + "refId": "D" + }, + { + "expr": "(sum(rate(node_network_transmit_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_outbound_discarded{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Dropped", + "refId": "E" + }, + { + "expr": "(sum(rate(node_network_transmit_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_transmit_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval]) OR rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\"}[$__rate_interval]) OR rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*'}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Cluster", + "uid": "rancher-cluster-1", + "version": 3 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundle.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundle.json new file mode 100644 index 00000000..698f48ae --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundle.json @@ -0,0 +1,246 @@ +{ + "description": "Bundle", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Bundles", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 0 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_not_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_out_of_sync{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Out of Sync" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_err_applied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Err Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_pending{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Pending" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_wait_applied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Bundles", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_not_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_out_of_sync{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Out of Sync" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_err_applied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Err Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_pending{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Pending" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundle_wait_applied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Bundles", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(fleet_bundle_desired_ready, exported_namespace)", + "refresh": 2, + "type": "query" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "includeAll": true, + "name": "name", + "query": "label_values(fleet_bundle_desired_ready{exported_namespace=~\"$namespace\"}, name)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / Bundle", + "uid": "fleet-bundle" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundledeployment.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundledeployment.json new file mode 100644 index 00000000..c81f7a62 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/bundledeployment.json @@ -0,0 +1,219 @@ +{ + "description": "BundleDeployment", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Ready\"}) / sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\"})" + } + ], + "title": "Ready BundleDeployments", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 0 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Ready\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"NotReady\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"WaitApplied\"})", + "legendFormat": "Wait Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"ErrApplied\"})", + "legendFormat": "Err Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"OutOfSync\"})", + "legendFormat": "OutOfSync" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Pending\"})", + "legendFormat": "Pending" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Modified\"})", + "legendFormat": "Modified" + } + ], + "title": "BundleDeployments", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Ready\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"NotReady\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"WaitApplied\"})", + "legendFormat": "Wait Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"ErrApplied\"})", + "legendFormat": "Err Applied" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"OutOfSync\"})", + "legendFormat": "OutOfSync" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Pending\"})", + "legendFormat": "Pending" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_bundledeployment_state{cluster_namespace=~\"$namespace\",state=\"Modified\"})", + "legendFormat": "Modified" + } + ], + "title": "BundleDeployments", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(fleet_bundledeployment_state, cluster_namespace)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / BundleDeployment", + "uid": "fleet-bundledeployment" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/cluster.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/cluster.json new file mode 100644 index 00000000..73bdea48 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/cluster.json @@ -0,0 +1,484 @@ +{ + "description": "Cluster", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_cluster_desired_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Git Repos", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 0 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_desired_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Git Repos", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_desired_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_ready_git_repos{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Git Repos", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 13 + }, + "id": 4, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_cluster_resources_count_desiredready{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 13 + }, + "id": 5, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_desiredready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_notready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_orphaned{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Orphaned" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_waitapplied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 6, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_desiredready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_notready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_orphaned{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Orphaned" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_resources_count_waitapplied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Resources", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 26 + }, + "id": 7, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"Ready\"}) / sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 26 + }, + "id": 8, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"Ready\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"NotReady\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"WaitCheckIn\"})", + "legendFormat": "Wait Check In" + } + ], + "title": "Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 34 + }, + "id": 9, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"Ready\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"NotReady\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_state{exported_namespace=\"$namespace\",name=~\"$name\",state=\"WaitCheckIn\"})", + "legendFormat": "Wait Check In" + } + ], + "title": "Clusters", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(fleet_cluster_desired_ready_git_repos, exported_namespace)", + "refresh": 2, + "type": "query" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "includeAll": true, + "name": "name", + "query": "label_values(fleet_cluster_desired_ready_git_repos{exported_namespace=~\"$namespace\"}, name)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / Cluster", + "uid": "fleet-cluster" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/clustergroup.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/clustergroup.json new file mode 100644 index 00000000..ce3df87b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/clustergroup.json @@ -0,0 +1,468 @@ +{ + "description": "ClusterGroup", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_cluster_group_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Bundles", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 0 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Bundles", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_bundle_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_bundle_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Bundles", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 13 + }, + "id": 4, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "(sum(fleet_cluster_group_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"}) - sum(fleet_cluster_group_non_ready_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})) / sum(fleet_cluster_group_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 13 + }, + "id": 5, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Total" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_non_ready_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Non Ready" + } + ], + "title": "Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 6, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Total" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_non_ready_cluster_count{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Non Ready" + } + ], + "title": "Clusters", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 26 + }, + "id": 7, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_cluster_group_resource_count_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 26 + }, + "id": 8, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_notready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_orphaned{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Orphaned" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_waitapplied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 34 + }, + "id": 9, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_notready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_orphaned{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Orphaned" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_cluster_group_resource_count_waitapplied{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Wait Applied" + } + ], + "title": "Resources", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(fleet_cluster_group_bundle_desired_ready, exported_namespace)", + "refresh": 2, + "type": "query" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "includeAll": true, + "name": "name", + "query": "label_values(fleet_cluster_group_bundle_desired_ready{exported_namespace=~\"$namespace\"}, name)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / ClusterGroup", + "uid": "fleet-cluster-group" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/controller-runtime.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/controller-runtime.json new file mode 100644 index 00000000..23a81f2a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/controller-runtime.json @@ -0,0 +1,454 @@ +{ + "description": "Controller Runtime", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "controller_runtime_active_workers{job=\"$job\", namespace=\"$namespace\"}", + "legendFormat": "{{controller}} {{instance}}" + } + ], + "title": "Number of Workers in Use", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(rate(controller_runtime_reconcile_errors_total{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, pod)", + "legendFormat": "{{instance}} {{pod}}" + } + ], + "title": "Reconciliation Error Count per Controller", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 16 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(rate(controller_runtime_reconcile_total{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, pod)", + "legendFormat": "{{instance}} {{pod}}" + } + ], + "title": "Total Reconciliation Count per Controller", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 24 + }, + "id": 4, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "workqueue_depth{job=\"$job\", namespace=\"$namespace\"}", + "legendFormat": "{{instance}} {{pod}}" + } + ], + "title": "WorkQueue Depth", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 32 + }, + "id": 5, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.50, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P50 {{name}}" + } + ], + "title": "Seconds for Items Stay in Queue (before being requested) P50", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 40 + }, + "id": 6, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.90, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P90 {{name}}" + } + ], + "title": "Seconds for Items Stay in Queue (before being requested) P90", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 48 + }, + "id": 7, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.99, sum(rate(workqueue_queue_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P99 {{name}}" + } + ], + "title": "Seconds for Items Stay in Queue (before being requested) P99", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 56 + }, + "id": 8, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(rate(workqueue_adds_total{job=\"$job\", namespace=\"$namespace\"}[2m])) by (instance, name)", + "legendFormat": "{{name}} {{instance}}" + } + ], + "title": "Work Queue Add Rate", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 64 + }, + "id": 9, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "rate(workqueue_unfinished_work_seconds{job=\"$job\", namespace=\"$namespace\"}[5m])", + "legendFormat": "{{name}} {{instance}}" + } + ], + "title": "Unfinished Seconds", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 72 + }, + "id": 10, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.50, sum(rate(workqueue_work_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P50 {{name}}" + } + ], + "title": "Seconds Processing Items from WorkQueue - 50th Percentile", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 80 + }, + "id": 11, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.90, sum(rate(workqueue_work_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P90 {{name}}" + } + ], + "title": "Seconds Processing Items from WorkQueue - 90th Percentile", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 88 + }, + "id": 12, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "histogram_quantile(0.99, sum(rate(workqueue_work_duration_seconds_bucket{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name, le))", + "legendFormat": "P99 {{name}}" + } + ], + "title": "Seconds Processing Items from WorkQueue - 99th Percentile", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": null, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 96 + }, + "id": 13, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(rate(workqueue_retries_total{job=\"$job\", namespace=\"$namespace\"}[5m])) by (instance, name)", + "legendFormat": "{{name}} {{instance}}" + } + ], + "title": "Work Queue Retries Rate", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(controller_runtime_reconcile_total, namespace)", + "refresh": 2, + "type": "query" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "job", + "query": "label_values(controller_runtime_reconcile_total{namespace=~\"$namespace\"}, job)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / Controller-Runtime", + "uid": "fleet-controller-runtime" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/gitrepo.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/gitrepo.json new file mode 100644 index 00000000..1a50c293 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/fleet/gitrepo.json @@ -0,0 +1,325 @@ +{ + "description": "GitRepo", + "graphTooltip": 1, + "panels": [ + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 0 + }, + "id": 1, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_gitrepo_desired_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 0 + }, + "id": 2, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_desired_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Clusters", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 8 + }, + "id": 3, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_desired_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_ready_clusters{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + } + ], + "title": "Clusters", + "type": "timeseries" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": "percentunit" + } + }, + "gridPos": { + "h": 5, + "w": 7, + "x": 0, + "y": 13 + }, + "id": 4, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_ready{exported_namespace=\"$namespace\",name=~\"$name\"}) / sum(fleet_gitrepo_resources_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})" + } + ], + "title": "Ready Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 5, + "w": 17, + "x": 7, + "y": 13 + }, + "id": 5, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_not_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + } + ], + "title": "Resources", + "type": "stat" + }, + { + "datasource": { + "type": "datasource", + "uid": "-- Mixed --" + }, + "fieldConfig": { + "defaults": { + "decimals": 0, + "unit": null + } + }, + "gridPos": { + "h": 8, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 6, + "pluginVersion": "v11.0.0", + "targets": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_desired_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Desired Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_not_ready{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Not Ready" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_missing{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Missing" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_modified{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Modified" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "expr": "sum(fleet_gitrepo_resources_unknown{exported_namespace=\"$namespace\",name=~\"$name\"})", + "legendFormat": "Unknown" + } + ], + "title": "Resources", + "type": "timeseries" + } + ], + "schemaVersion": 39, + "templating": { + "list": [ + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "name": "namespace", + "query": "label_values(fleet_gitrepo_desired_ready_clusters, exported_namespace)", + "refresh": 2, + "type": "query" + }, + { + "datasource": { + "type": "prometheus", + "uid": "prometheus" + }, + "includeAll": true, + "name": "name", + "query": "label_values(fleet_gitrepo_desired_ready_clusters{exported_namespace=~\"$namespace\"}, name)", + "refresh": 2, + "type": "query" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timezone": "utc", + "title": "Fleet / GitRepo", + "uid": "fleet-gitrepo" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/home/rancher-default-home.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/home/rancher-default-home.json new file mode 100644 index 00000000..3fce2075 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/home/rancher-default-home.json @@ -0,0 +1,1290 @@ +{ + "annotations": { + "list": [] + }, + "editable": false, + "gnetId": null, + "graphTooltip": 0, + "id": null, + "links": [], + "panels": [ + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 3, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 1, + "title": "", + "type": "welcome" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": true, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "percent", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": true, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 5, + "w": 8, + "x": 0, + "y": 4 + }, + "height": "180px", + "id": 6, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "(1 - (avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\",mode=\"idle\"}[5m])))) * 100", + "format": "time_series", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "65, 90", + "title": "CPU Utilization", + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": true, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "percent", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": true, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 5, + "w": 8, + "x": 8, + "y": 4 + }, + "height": "180px", + "id": 4, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "(1 - sum({__name__=~\"node_memory_MemAvailable_bytes|windows_os_physical_memory_free_bytes\"}) / sum({__name__=~\"node_memory_MemTotal_bytes|windows_cs_physical_memory_bytes\"})) * 100", + "format": "time_series", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "65, 90", + "title": "Memory Utilization", + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": true, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "percent", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": true, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 5, + "w": 8, + "x": 16, + "y": 4 + }, + "height": "180px", + "id": 7, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "(1 - (((sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) OR on() vector(0))) / ((sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) OR on() vector(0))))) * 100", + "format": "time_series", + "interval": "10s", + "intervalFactor": 1, + "metric": "", + "refId": "A", + "step": 10 + } + ], + "thresholds": "65, 90", + "title": "Disk Utilization", + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "none", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 0, + "y": 9 + }, + "height": "1px", + "id": 11, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": " cores", + "postfixFontSize": "30%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\",mode!=\"idle\"}[5m]))", + "format": "time_series", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "CPU Used", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "none", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 4, + "y": 9 + }, + "height": "1px", + "id": 12, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": " cores", + "postfixFontSize": "30%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(kube_node_status_allocatable_cpu_cores{}) OR sum(kube_node_status_allocatable{resource=\"cpu\",unit=\"core\"})", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "CPU Total", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "bytes", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 8, + "y": 9 + }, + "height": "1px", + "id": 9, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "20%", + "prefix": "", + "prefixFontSize": "20%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum({__name__=~\"node_memory_MemTotal_bytes|windows_cs_physical_memory_bytes\"}) - sum({__name__=~\"node_memory_MemAvailable_bytes|windows_os_physical_memory_free_bytes\"})", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "Memory Used", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "bytes", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 12, + "y": 9 + }, + "height": "1px", + "id": 10, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(kube_node_status_allocatable_memory_bytes{}) OR sum(kube_node_status_allocatable{resource=\"memory\", unit=\"byte\"})", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "Memory Total", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "bytes", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 16, + "y": 9 + }, + "height": "1px", + "id": 13, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "(sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"}) - sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) - sum(windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) OR on() vector(0))", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "Disk Used", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "Prometheus", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "format": "bytes", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 3, + "w": 4, + "x": 20, + "y": 9 + }, + "height": "1px", + "id": 14, + "interval": null, + "isNew": true, + "links": [], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "(sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) OR on() vector(0))", + "interval": "10s", + "intervalFactor": 1, + "refId": "A", + "step": 10 + } + ], + "thresholds": "", + "title": "Disk Total", + "type": "singlestat", + "valueFontSize": "50%", + "valueMaps": [ + { + "op": "=", + "text": "0", + "value": "null" + } + ], + "valueName": "current" + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "Prometheus", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 0, + "y": 12 + }, + "hiddenSeries": false, + "id": 2051, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\",mode=\"idle\"}[$__rate_interval])))", + "format": "time_series", + "hide": false, + "instant": false, + "intervalFactor": 1, + "legendFormat": "Cluster", + "refId": "A" + }, + { + "expr": "1 - avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\", mode=\"idle\"}[$__rate_interval])) by (instance)", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{ instance }}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "percentunit", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "Prometheus", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 8, + "y": 12 + }, + "hiddenSeries": false, + "id": 2052, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "100 * (1 - sum({__name__=~\"node_memory_MemAvailable_bytes|windows_os_physical_memory_free_bytes\"}) / sum({__name__=~\"node_memory_MemTotal_bytes|windows_cs_physical_memory_bytes\"}))", + "format": "time_series", + "hide": false, + "instant": false, + "intervalFactor": 1, + "legendFormat": "Cluster", + "refId": "A" + }, + { + "expr": "100 * (1- sum({__name__=~\"node_memory_MemAvailable_bytes|windows_os_physical_memory_free_bytes\"}) by (instance) / sum({__name__=~\"node_memory_MemTotal_bytes|windows_cs_physical_memory_bytes\"}) by (instance))", + "format": "time_series", + "hide": false, + "intervalFactor": 1, + "legendFormat": "{{ instance }}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "percent", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "Prometheus", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 6, + "w": 8, + "x": 16, + "y": 12 + }, + "hiddenSeries": false, + "id": 2053, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(1 - ((sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"} OR on() vector(0)))) / ((sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"}) OR on() vector(0)) + (sum(windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) OR on() vector(0)))) * 100", + "legendFormat": "Cluster", + "refId": "A" + }, + { + "expr": "(1 - (sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\"}) by (instance)) / sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\"}) by (instance)) * 100", + "hide": false, + "legendFormat": "{{ instance }}", + "refId": "B" + }, + { + "expr": "(1 - (sum(windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) by (instance)) / sum(windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\"}) by (instance)) * 100", + "hide": false, + "legendFormat": "{{ instance }}", + "refId": "C" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Usage", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "percent", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "folderId": 0, + "gridPos": { + "h": 15, + "w": 12, + "x": 0, + "y": 18 + }, + "headings": true, + "id": 3, + "limit": 30, + "links": [], + "query": "", + "recent": true, + "search": true, + "starred": false, + "tags": [], + "title": "Dashboards", + "type": "dashlist" + }, + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 18 + }, + "id": 2055, + "options": { + "content": "## About Rancher Monitoring\n\nRancher Monitoring is a Helm chart developed by Rancher that is powered by [Prometheus Operator](https://github.com/prometheus-operator/prometheus-operator). It is based on the upstream [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) Helm chart maintained by the Prometheus community.\n\nBy default, the chart deploys Grafana alongside a set of Grafana dashboards curated by the [kube-prometheus](https://github.com/prometheus-operator/kube-prometheus) project.\n\nFor more information on how Rancher Monitoring differs from [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack), please view the CHANGELOG.md of the rancher-monitoring chart located in the [rancher/charts](https://github.com/rancher/charts) repository.\n\nFor more information about how to configure Rancher Monitoring, please view the [Rancher docs](https://rancher.com/docs/rancher/v2.x/en/).\n\n", + "mode": "markdown" + }, + "pluginVersion": "7.1.0", + "timeFrom": null, + "timeShift": null, + "title": "", + "type": "text" + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "hidden": true, + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ], + "type": "timepicker" + }, + "timezone": "browser", + "title": "Home", + "uid": "rancher-home-1", + "version": 5 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd-nodes.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd-nodes.json new file mode 100644 index 00000000..8af4b81c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd-nodes.json @@ -0,0 +1,687 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 32, + "links": [], + "panels": [ + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "Prometheus", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(etcd_network_client_grpc_received_bytes_total{job=\"kube-etcd\"}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Client Traffic In ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(etcd_network_client_grpc_sent_bytes_total{job=\"kube-etcd\"}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Client Traffic Out ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "GRPC Client Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]({{instance}})" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(etcd_mvcc_db_total_size_in_bytes) by (instance)", + "interval": "", + "legendFormat": "DB Size ({{instance}})", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "DB Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"}) by (instance) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"}) by (instance)", + "interval": "", + "legendFormat": "Watch Streams ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"}) by (instance) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"}) by (instance)", + "interval": "", + "legendFormat": "Lease Watch Stream ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Active Streams", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(etcd_server_proposals_committed_total[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Proposal Committed ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(etcd_server_proposals_applied_total[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Proposal Applied ({{instance}})", + "refId": "B" + }, + { + "expr": "sum(rate(etcd_server_proposals_failed_total[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "Proposal Failed ({{instance}})", + "refId": "C" + }, + { + "expr": "sum(etcd_server_proposals_pending) by (instance)", + "interval": "", + "legendFormat": "Proposal Pending ({{instance}})", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Raft Proposals", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(grpc_server_started_total{grpc_type=\"unary\"}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "RPC Rate ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"}[$__rate_interval])) by (instance)", + "interval": "", + "legendFormat": "RPC Failure Rate ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "RPC Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 0, + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "decimals": null, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[$__rate_interval])) by (instance, le))", + "interval": "", + "legendFormat": "WAL fsync ({{instance}})", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[$__rate_interval])) by (instance, le))", + "interval": "", + "legendFormat": "DB fsync ({{instance}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Sync Duration", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 2, + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / etcd (Nodes)", + "uid": "rancher-etcd-nodes-1", + "version": 5 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd.json new file mode 100644 index 00000000..0c058caf --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-etcd.json @@ -0,0 +1,669 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 33, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "Prometheus", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(etcd_network_client_grpc_received_bytes_total{job=\"kube-etcd\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Client Traffic In", + "refId": "A" + }, + { + "expr": "sum(rate(etcd_network_client_grpc_sent_bytes_total{job=\"kube-etcd\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Client Traffic Out", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "GRPC Client Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(etcd_mvcc_db_total_size_in_bytes)", + "interval": "", + "legendFormat": "DB Size", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "DB Size", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Watch\",grpc_type=\"bidi_stream\"})", + "interval": "", + "legendFormat": "Watch Streams", + "refId": "A" + }, + { + "expr": "sum(grpc_server_started_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"}) - sum(grpc_server_handled_total{grpc_service=\"etcdserverpb.Lease\",grpc_type=\"bidi_stream\"})", + "interval": "", + "legendFormat": "Lease Watch Stream", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Active Streams", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(etcd_server_proposals_committed_total[$__rate_interval]))", + "interval": "", + "legendFormat": "Proposal Committed", + "refId": "A" + }, + { + "expr": "sum(rate(etcd_server_proposals_applied_total[$__rate_interval]))", + "interval": "", + "legendFormat": "Proposal Applied", + "refId": "B" + }, + { + "expr": "sum(rate(etcd_server_proposals_failed_total[$__rate_interval]))", + "interval": "", + "legendFormat": "Proposal Failed", + "refId": "C" + }, + { + "expr": "sum(etcd_server_proposals_pending)", + "interval": "", + "legendFormat": "Proposal Pending", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Raft Proposals", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(grpc_server_started_total{grpc_type=\"unary\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "RPC Rate", + "refId": "A" + }, + { + "expr": "sum(rate(grpc_server_handled_total{grpc_type=\"unary\",grpc_code!=\"OK\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "RPC Failure Rate", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "RPC Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 0, + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "decimals": null, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(0.99, sum(rate(etcd_disk_wal_fsync_duration_seconds_bucket[$__rate_interval])) by (instance, le))", + "interval": "", + "legendFormat": "WAL fsync", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.99, sum(rate(etcd_disk_backend_commit_duration_seconds_bucket[$__rate_interval])) by (instance, le))", + "interval": "", + "legendFormat": "DB fsync", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Sync Duration", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 2, + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / etcd", + "uid": "rancher-etcd-1", + "version": 4 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components-nodes.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components-nodes.json new file mode 100644 index 00000000..b31358ea --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components-nodes.json @@ -0,0 +1,527 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 30, + "links": [], + "panels": [ + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(apiserver_request_total[$__rate_interval])) by (instance, code)", + "interval": "", + "legendFormat": "{{code}}({{instance}})", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "API Server Request Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 0, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]({{instance}})" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"deployment\"}) by (instance, name)", + "interval": "", + "legendFormat": "Deployment Depth ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"volumes\"}) by (instance, name)", + "interval": "", + "legendFormat": "Volumes Depth ({{instance}})", + "refId": "B" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"replicaset\"}) by (instance, name)", + "interval": "", + "legendFormat": "ReplicaSet Depth ({{instance}})", + "refId": "C" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"service\"}) by (instance, name)", + "interval": "", + "legendFormat": "Service Depth ({{instance}})", + "refId": "D" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"serviceaccount\"}) by (instance, name)", + "interval": "", + "legendFormat": "ServiceAccount Depth ({{instance}})", + "refId": "E" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"endpoint\"}) by (instance, name)", + "interval": "", + "legendFormat": "Endpoint Depth ({{instance}})", + "refId": "F" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"daemonset\"}) by (instance, name)", + "interval": "", + "legendFormat": "DaemonSet Depth ({{instance}})", + "refId": "G" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"statefulset\"}) by (instance, name)", + "interval": "", + "legendFormat": "StatefulSet Depth ({{instance}})", + "refId": "H" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"replicationmanager\"}) by (instance, name)", + "interval": "", + "legendFormat": "ReplicationManager Depth ({{instance}})", + "refId": "I" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Controller Manager Queue Depth", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(kube_pod_status_scheduled{condition=\"false\"})", + "interval": "", + "legendFormat": "Failed To Schedule", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Pod Scheduling Status", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{instance}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"reading\"}) by (instance)", + "interval": "", + "legendFormat": "Reading ({{instance}})", + "refId": "A" + }, + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"waiting\"}) by (instance)", + "interval": "", + "legendFormat": "Waiting ({{instance}})", + "refId": "B" + }, + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"writing\"}) by (instance)", + "interval": "", + "legendFormat": "Writing ({{instance}})", + "refId": "C" + }, + { + "expr": "sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state=\"accepted\"}[$__rate_interval]))) by (instance)", + "interval": "", + "legendFormat": "Accepted ({{instance}})", + "refId": "D" + }, + { + "expr": "sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state=\"handled\"}[$__rate_interval]))) by (instance)", + "interval": "", + "legendFormat": "Handled ({{instance}})", + "refId": "E" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Ingress Controller Connections", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Kubernetes Components (Nodes)", + "uid": "rancher-k8s-components-nodes-1", + "version": 5 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components.json new file mode 100644 index 00000000..44cf97f9 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/k8s/rancher-k8s-components.json @@ -0,0 +1,519 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 31, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(apiserver_request_total[$__rate_interval])) by (code)", + "interval": "", + "legendFormat": "{{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "API Server Request Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 0, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]({{instance}})" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"deployment\"}) by (name)", + "interval": "", + "legendFormat": "Deployment Depth", + "refId": "A" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"volumes\"}) by (name)", + "interval": "", + "legendFormat": "Volumes Depth", + "refId": "B" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"replicaset\"}) by (name)", + "interval": "", + "legendFormat": "Replicaset Depth", + "refId": "C" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"service\"}) by (name)", + "interval": "", + "legendFormat": "Service Depth", + "refId": "D" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"serviceaccount\"}) by (name)", + "interval": "", + "legendFormat": "ServiceAccount Depth", + "refId": "E" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"endpoint\"}) by (name)", + "interval": "", + "legendFormat": "Endpoint Depth", + "refId": "F" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"daemonset\"}) by (name)", + "interval": "", + "legendFormat": "DaemonSet Depth", + "refId": "G" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"statefulset\"}) by (name)", + "interval": "", + "legendFormat": "StatefulSet Depth", + "refId": "H" + }, + { + "expr": "sum(workqueue_depth{component=\"kube-controller-manager\", name=\"replicationmanager\"}) by (name)", + "interval": "", + "legendFormat": "ReplicationManager Depth", + "refId": "I" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Controller Manager Queue Depth", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(kube_pod_status_scheduled{condition=\"false\"})", + "interval": "", + "legendFormat": "Failed To Schedule", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Pod Scheduling Status", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"reading\"})", + "interval": "", + "legendFormat": "Reading", + "refId": "A" + }, + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"waiting\"})", + "interval": "", + "legendFormat": "Waiting", + "refId": "B" + }, + { + "expr": "sum(nginx_ingress_controller_nginx_process_connections{state=\"writing\"})", + "interval": "", + "legendFormat": "Writing", + "refId": "C" + }, + { + "expr": "sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state=\"accepted\"}[$__rate_interval])))", + "interval": "", + "legendFormat": "Accepted", + "refId": "D" + }, + { + "expr": "sum(ceil(increase(nginx_ingress_controller_nginx_process_connections_total{state=\"handled\"}[$__rate_interval])))", + "interval": "", + "legendFormat": "Handled", + "refId": "E" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Ingress Controller Connections", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Kubernetes Components", + "uid": "rancher-k8s-components-1", + "version": 5 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node-detail.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node-detail.json new file mode 100644 index 00000000..920fb94c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node-detail.json @@ -0,0 +1,805 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "links": [], + "panels": [ + { + "aliasColors": { + "{{mode}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\", instance=\"$instance\"}[$__rate_interval])) by (mode)", + "interval": "", + "legendFormat": "{{mode}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_load5{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[5m]))", + "interval": "", + "legendFormat": "Load[5m]", + "refId": "A" + }, + { + "expr": "sum(node_load1{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[1m]))", + "interval": "", + "legendFormat": "Load[1m]", + "refId": "B" + }, + { + "expr": "sum(node_load15{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[15m]))", + "interval": "", + "legendFormat": "Load[15m]", + "refId": "C" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Load Average", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (node_memory_MemAvailable_bytes{instance=~\"$instance\"} OR windows_os_physical_memory_free_bytes{instance=~\"$instance\"}) / (node_memory_MemTotal_bytes{instance=~\"$instance\"} OR windows_cs_physical_memory_bytes{instance=~\"$instance\"})", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{device}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\", instance=~\"$instance\"} OR windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\", instance=~\"$instance\"}) by (device) / sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\", instance=~\"$instance\"} OR windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\", instance=~\"$instance\"}) by (device))", + "interval": "", + "legendFormat": "{{device}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{device}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_disk_read_bytes_total{instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_logical_disk_read_bytes_total{instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Read ({{device}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_disk_written_bytes_total{instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_logical_disk_write_bytes_total{instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Write ({{device}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{device}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_receive_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_received_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Receive Errors ({{device}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Receive Total ({{device}})", + "refId": "B" + }, + { + "expr": "sum(rate(node_network_transmit_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_outbound_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Transmit Errors ({{device}})", + "refId": "C" + }, + { + "expr": "sum(rate(node_network_receive_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_received_discarded_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Receive Dropped ({{device}})", + "refId": "D" + }, + { + "expr": "sum(rate(node_network_transmit_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_outbound_discarded{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Transmit Dropped ({{device}})", + "refId": "E" + }, + { + "expr": "sum(rate(node_network_transmit_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) by (device) OR sum(rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Transmit Total ({{device}})", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{device}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_transmit_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Transmit Total ({{device}})", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) by (device)", + "interval": "", + "legendFormat": "Receive Total ({{device}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "instance", + "query": "label_values({__name__=~\"node_exporter_build_info|windows_exporter_build_info\"}, instance)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Node (Detail)", + "uid": "rancher-node-detail-1", + "version": 3 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node.json new file mode 100644 index 00000000..367df3cc --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/nodes/rancher-node.json @@ -0,0 +1,792 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - avg(irate({__name__=~\"node_cpu_seconds_total|windows_cpu_time_total\", instance=\"$instance\", mode=\"idle\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [ + { + "matcher": { + "id": "byName", + "options": "Load[5m]" + }, + "properties": [] + } + ] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_load5{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[5m]))", + "interval": "", + "legendFormat": "Load[5m]", + "refId": "A" + }, + { + "expr": "sum(node_load1{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[1m]))", + "interval": "", + "legendFormat": "Load[1m]", + "refId": "B" + }, + { + "expr": "sum(node_load15{instance=~\"$instance\"} OR avg_over_time(windows_system_processor_queue_length{instance=~\"$instance\"}[15m]))", + "interval": "", + "legendFormat": "Load[15m]", + "refId": "C" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Load Average", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - sum(node_memory_MemAvailable_bytes{instance=~\"$instance\"} OR windows_os_physical_memory_free_bytes{instance=~\"$instance\"}) / sum(node_memory_MemTotal_bytes{instance=~\"$instance\"} OR windows_cs_physical_memory_bytes{instance=~\"$instance\"})", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "1 - (sum(node_filesystem_free_bytes{device!~\"rootfs|HarddiskVolume.+\", instance=~\"$instance\"} OR windows_logical_disk_free_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\", instance=~\"$instance\"}) / sum(node_filesystem_size_bytes{device!~\"rootfs|HarddiskVolume.+\", instance=~\"$instance\"} OR windows_logical_disk_size_bytes{volume!~\"(HarddiskVolume.+|[A-Z]:.+)\", instance=~\"$instance\"}))", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "percentunit", + "label": null, + "logBase": 1, + "max": "1", + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_disk_read_bytes_total{instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_logical_disk_read_bytes_total{instance=~\"$instance\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Read", + "refId": "A" + }, + { + "expr": "sum(rate(node_disk_written_bytes_total{instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_logical_disk_write_bytes_total{instance=~\"$instance\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Write", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 7 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(rate(node_network_receive_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Errors", + "refId": "A" + }, + { + "expr": "(sum(rate(node_network_receive_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "B" + }, + { + "expr": "(sum(rate(node_network_transmit_errs_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_outbound_errors_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Errors", + "refId": "C" + }, + { + "expr": "(sum(rate(node_network_receive_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_received_discarded_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Receive Dropped", + "refId": "D" + }, + { + "expr": "(sum(rate(node_network_transmit_drop_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_outbound_discarded{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Dropped", + "refId": "E" + }, + { + "expr": "(sum(rate(node_network_transmit_packets_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0)) + (sum(rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval])) OR on() vector(0))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 14 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(node_network_transmit_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_net_packets_sent_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "A" + }, + { + "expr": "sum(rate(node_network_receive_bytes_total{device!~\"lo|veth.*|docker.*|flannel.*|cali.*|cbr.*\", instance=~\"$instance\"}[$__rate_interval]) OR rate(windows_net_packets_received_total_total{nic!~'.*isatap.*|.*VPN.*|.*Pseudo.*|.*tunneling.*', instance=~\"$instance\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "instance", + "query": "label_values({__name__=~\"node_exporter_build_info|windows_exporter_build_info\"}, instance)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Node", + "uid": "rancher-node-1", + "version": 3 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/performance/performance-debugging.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/performance/performance-debugging.json new file mode 100644 index 00000000..b0de93c6 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/performance/performance-debugging.json @@ -0,0 +1,1649 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": { + "type": "datasource", + "uid": "grafana" + }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "target": { + "limit": 100, + "matchAny": false, + "tags": [], + "type": "dashboard" + }, + "type": "dashboard" + } + ] + }, + "editable": true, + "fiscalYearStartMonth": 0, + "graphTooltip": 0, + "links": [], + "liveNow": false, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 22, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": true, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (handler_name) (rate(lasso_controller_reconcile_time_seconds_sum[5m]))\n/\nsum by (handler_name) (rate(lasso_controller_reconcile_time_seconds_count[5m])))", + "interval": "", + "legendFormat": "{{handler_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Handler Average Execution Times Over Last 5 Minutes (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1390", + "format": "short", + "label": "Execution Time in Seconds", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1391", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, "datasource": { + "uid": "$datasource" + }, + "description": "", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 8 + }, + "hiddenSeries": false, + "id": 28, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (resource, method, code) (rate(steve_api_request_time_sum{resource!=\"subscribe\"}[5m]))\n/\nsum by (resource, method, code) (rate(steve_api_request_time_count{resource!=\"subscribe\"}[5m])))", + "interval": "", + "legendFormat": "{{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Rancher API Average Request Times Over Last 5 Minutes (Top 20) (Subscribes Omitted)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:178", + "format": "ms", + "label": "", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:179", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 16 + }, + "hiddenSeries": false, + "id": 30, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "rate(steve_api_request_time_sum{resource=\"subscribe\"}[5m])\n/\nrate(steve_api_request_time_count{resource=\"subscribe\"}[5m])", + "interval": "", + "legendFormat": "{{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Subscribe Average Request Times Over Last 5 Minutes", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:368", + "format": "ms", + "label": "", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:369", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 24 + }, + "hiddenSeries": false, + "id": 14, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,workqueue_depth)", + "interval": "", + "legendFormat": "{{name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Lasso Controller Work Queue Depth (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1553", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1554", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 13, + "w": 16, + "x": 0, + "y": 32 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": false, + "hideZero": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (id, resource, method, code) (steve_api_total_requests))", + "instant": false, + "interval": "", + "legendFormat": "{{id}} {{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Number of Rancher Requests (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:290", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:291", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 16, + "x": 0, + "y": 45 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (id, resource, method) (steve_api_total_requests{code!=\"200\",code!=\"201\"}))", + "interval": "", + "legendFormat": "{{id}} {{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Number of Failed Rancher API Requests (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:428", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:429", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 54 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (resource, method, code) (rate(k8s_proxy_store_request_time_sum[5m]))\n/\nsum by (resource, method, code) (rate(k8s_proxy_store_request_time_count[5m])))", + "interval": "", + "legendFormat": "{{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "K8s Proxy Store Average Request Times Over Last 5 Minutes (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:662", + "format": "ms", + "label": "", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:663", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 62 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": true, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (resource, method, code) (rate(k8s_proxy_client_request_time_sum[5m]))\n/\nsum by (resource, method, code) (rate(k8s_proxy_client_request_time_count[5m])))", + "interval": "", + "legendFormat": "{{resource}} {{method}} {{code}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "K8s Proxy Client Average Request Times Over Last 5 Minutes (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1710", + "format": "ms", + "label": "", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1711", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 70 + }, + "hiddenSeries": false, + "id": 10, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,lasso_controller_total_cached_object)", + "interval": "", + "legendFormat": "{{kind}} {{version}} {{group}} {{pod}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Cached Objects by GroupVersionKind (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:744", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:745", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, "datasource": { + "uid": "$datasource" + }, + "description": "", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 78 + }, + "hiddenSeries": false, + "id": 12, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (handler_name) (\nlasso_controller_total_handler_execution\n))", + "interval": "", + "legendFormat": "{{handler_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Lasso Handler Executions (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:824", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:825", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 86 + }, + "hiddenSeries": false, + "id": 32, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20, sum by (handler_name,controller_name) (\nincrease(lasso_controller_total_handler_execution[2m])\n))", + "interval": "", + "legendFormat": "{{controller_name}}.{{handler_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Handler Executions Over Last 2 Minutes (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "logBase": 1, + "show": true + }, + { + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 94 + }, + "hiddenSeries": false, + "id": 20, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (handler_name) (\nlasso_controller_total_handler_execution{has_error=\"true\"}\n))", + "interval": "", + "legendFormat": "{{handler_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Total Handler Executions with Error (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1230", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1231", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 102 + }, + "hiddenSeries": false, + "id": 34, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,sum by (handler_name,controller_name) (\nincrease(lasso_controller_total_handler_execution{has_error=\"true\"}[2m])\n))", + "interval": "", + "legendFormat": "{{controller_name}}.{{handler_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Handler Executions Over Last 2 Minutes (Top 20)", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "logBase": 1, + "show": true + }, + { + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 110 + }, + "hiddenSeries": false, + "id": 16, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "topk(20,session_server_total_transmit_bytes)", + "interval": "", + "legendFormat": "{{clientkey}} {{pod}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Data Transmitted by Remote Dialer Sessions (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:1953", + "format": "decbytes", + "label": "", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:1954", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, "datasource": { + "uid": "$datasource" + }, + "description": "", + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 118 + }, + "hiddenSeries": false, + "id": 18, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "exemplar": true, + "expr": "session_server_total_transmit_error_bytes", + "interval": "", + "legendFormat": "{{clientkey}} {{pod}}", + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Errors for Remote Dialer Sessions (Top 20)", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2045", + "format": "ms", + "label": "Error Data", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2046", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 126 + }, + "hiddenSeries": false, + "id": 26, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "editorMode": "code", + "exemplar": true, + "expr": "session_server_total_add_websocket_session - (session_server_total_remove_websocket_session or (0 * session_server_total_add_websocket_session))", + "interval": "", + "legendFormat": "{{clientkey}} {{pod}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Remote Dialer Active Connections (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2199", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2200", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 134 + }, + "hiddenSeries": false, + "id": 35, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "editorMode": "code", + "exemplar": true, + "expr": "rate(session_server_total_remove_connections[$__rate_interval])", + "interval": "", + "legendFormat": "{{clientkey}} {{pod}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Remote Dialer Removed Connections Rate (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2199", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2200", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": { + "type": "prometheus", + "uid": "$datasource" + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 16, + "x": 0, + "y": 142 + }, + "hiddenSeries": false, + "id": 24, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "9.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "datasource": { + "uid": "$datasource" + }, + "editorMode": "code", + "exemplar": true, + "expr": "rate(session_server_total_add_connections[$__rate_interval])", + "interval": "", + "legendFormat": "{{clientkey}} {{pod}}", + "range": true, + "refId": "A" + } + ], + "thresholds": [], + "timeRegions": [], + "title": "Remote Dialer Added Connections Rate (Top 20)", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "mode": "time", + "show": true, + "values": [] + }, + "yaxes": [ + { + "$$hashKey": "object:2117", + "format": "short", + "logBase": 1, + "show": true + }, + { + "$$hashKey": "object:2118", + "format": "short", + "logBase": 1, + "show": true + } + ], + "yaxis": { + "align": false + } + } + ], + "schemaVersion": 37, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "selected": false, + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "includeAll": false, + "label": "Data Source", + "multi": false, + "name": "datasource", + "options": [], + "query": "prometheus", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + } + ] + }, + "time": { + "from": "now-15m", + "to": "now" + }, + "timepicker": {}, + "timezone": "", + "title": "Rancher Performance Debugging", + "uid": "tfrfU0a7k", + "version": 1, + "weekStart": "" +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod-containers.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod-containers.json new file mode 100644 index 00000000..9e53081a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod-containers.json @@ -0,0 +1,636 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "iteration": 1618265214337, + "links": [], + "panels": [ + { + "aliasColors": { + "{{container}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(container_cpu_cfs_throttled_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "CFS throttled ({{container}})", + "refId": "A" + }, + { + "expr": "sum(rate(container_cpu_system_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container) OR sum(rate(windows_container_cpu_usage_seconds_kernelmode{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "System ({{container}})", + "refId": "B" + }, + { + "expr": "sum(rate(container_cpu_usage_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container) OR sum(rate(windows_container_cpu_usage_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Total ({{container}})", + "refId": "C" + }, + { + "expr": "sum(rate(container_cpu_user_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container) OR sum(rate(windows_container_cpu_usage_seconds_usermode{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "User ({{container}})", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "cpu", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{container}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"} OR windows_container_memory_usage_commit_bytes{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"}) by (container)", + "interval": "", + "legendFormat": "({{container}})", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{container}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_receive_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Receive Total ({{container}})", + "refId": "A" + }, + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_transmit_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Transmit Total ({{container}})", + "refId": "B" + }, + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_receive_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Receive Dropped ({{container}})", + "refId": "C" + }, + { + "expr": "sum(irate(container_network_receive_errors_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Receive Errors ({{container}})", + "refId": "D" + }, + { + "expr": "sum(irate(container_network_transmit_errors_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Transmit Errors ({{container}})", + "refId": "E" + }, + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_transmit_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Transmit Dropped ({{container}})", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{container}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_receive_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Receive Total ({{container}})", + "refId": "A" + }, + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container) OR sum(irate(windows_container_network_transmit_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Transmit Total ({{container}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{container}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(container_fs_writes_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Write ({{container}})", + "refId": "A" + }, + { + "expr": "sum(rate(container_fs_reads_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) by (container)", + "interval": "", + "legendFormat": "Read ({{container}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": false, + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "query": "label_values(kube_pod_info{cluster=\"$cluster\"}, namespace)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "pod", + "query": "label_values(kube_pod_info{cluster=\"$cluster\", namespace=\"$namespace\"}, pod)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Pod (Containers)", + "uid": "rancher-pod-containers-1", + "version": 8 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod.json new file mode 100644 index 00000000..65c6bf18 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/pods/rancher-pod.json @@ -0,0 +1,636 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "iteration": 1618265214337, + "links": [], + "panels": [ + { + "aliasColors": { + "": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(container_cpu_cfs_throttled_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "CFS throttled", + "refId": "A" + }, + { + "expr": "sum(rate(container_cpu_system_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) OR sum(rate(windows_container_cpu_usage_seconds_kernelmode{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "System", + "refId": "B" + }, + { + "expr": "sum(rate(container_cpu_usage_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) OR sum(rate(windows_container_cpu_usage_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Total", + "refId": "C" + }, + { + "expr": "sum(rate(container_cpu_user_seconds_total{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval])) OR sum(rate(windows_container_cpu_usage_seconds_usermode{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "User", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "cpu", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"} OR windows_container_memory_usage_commit_bytes{container!=\"POD\",namespace=~\"$namespace\",pod=~\"$pod\", container!=\"\"})", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_receive_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "A" + }, + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_transmit_packets_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "B" + }, + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_receive_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Dropped", + "refId": "C" + }, + { + "expr": "sum(irate(container_network_receive_errors_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Errors", + "refId": "D" + }, + { + "expr": "sum(irate(container_network_transmit_errors_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Errors", + "refId": "E" + }, + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_transmit_packets_dropped_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Dropped", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_receive_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Receive Total", + "refId": "A" + }, + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval])) OR sum(irate(windows_container_network_transmit_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(container_fs_writes_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Write", + "refId": "A" + }, + { + "expr": "sum(rate(container_fs_reads_bytes_total{namespace=~\"$namespace\",pod=~\"$pod\",container!=\"\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Read", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": false, + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "query": "label_values({__name__=~\"container_.*|windows_container_.*\", namespace!=\"\"}, namespace)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "pod", + "query": "label_values({__name__=~\"container_.*|windows_container_.*\", namespace=\"$namespace\", pod!=\"\"}, pod)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Pod", + "uid": "rancher-pod-1", + "version": 8 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload-pods.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload-pods.json new file mode 100644 index 00000000..f6b5078a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload-pods.json @@ -0,0 +1,652 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "iteration": 1618265214337, + "links": [], + "panels": [ + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(rate(container_cpu_cfs_throttled_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "CFS throttled ({{pod}})", + "refId": "A" + }, + { + "expr": "(sum(rate(container_cpu_system_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_kernelmode{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "System ({{pod}})", + "refId": "B" + }, + { + "expr": "(sum(rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Total ({{pod}})", + "refId": "C" + }, + { + "expr": "(sum(rate(container_cpu_user_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_usermode{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "User ({{pod}})", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "cpu", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(container_memory_working_set_bytes{namespace=~\"$namespace\"} OR windows_container_memory_usage_commit_bytes{namespace=~\"$namespace\"}) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "({{pod}})", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Receive Total ({{pod}})", + "refId": "A" + }, + { + "expr": "(sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Transmit Total ({{pod}})", + "refId": "B" + }, + { + "expr": "(sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Receive Dropped ({{pod}})", + "refId": "C" + }, + { + "expr": "(sum(irate(container_network_receive_errors_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Receive Errors ({{pod}})", + "refId": "D" + }, + { + "expr": "(sum(irate(container_network_transmit_errors_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Transmit Errors ({{pod}})", + "refId": "E" + }, + { + "expr": "(sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Transmit Dropped ({{pod}})", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Receive Total ({{pod}})", + "refId": "A" + }, + { + "expr": "(sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Transmit Total ({{pod}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(rate(container_fs_writes_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Write ({{pod}})", + "refId": "A" + }, + { + "expr": "(sum(rate(container_fs_reads_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"}", + "interval": "", + "legendFormat": "Read ({{pod}})", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": false, + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "query": "query_result(kube_pod_info{namespace!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*namespace=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "kind", + "query": "query_result(kube_pod_info{namespace=\"$namespace\", created_by_kind!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*created_by_kind=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "workload", + "query": "query_result(kube_pod_info{namespace=\"$namespace\", created_by_kind=\"$kind\", created_by_name!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*created_by_name=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Workload (Pods)", + "uid": "rancher-workload-pods-1", + "version": 8 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload.json b/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload.json new file mode 100644 index 00000000..9f5317c2 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/rancher/workloads/rancher-workload.json @@ -0,0 +1,652 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 28, + "iteration": 1618265214337, + "links": [], + "panels": [ + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 0 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum((sum(rate(container_cpu_cfs_throttled_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "CFS throttled", + "refId": "A" + }, + { + "expr": "sum((sum(rate(container_cpu_system_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_kernelmode{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "System", + "refId": "B" + }, + { + "expr": "sum((sum(rate(container_cpu_usage_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Total", + "refId": "C" + }, + { + "expr": "sum((sum(rate(container_cpu_user_seconds_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(rate(windows_container_cpu_usage_seconds_usermode{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "User", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "CPU Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "cpu", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 0 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum((sum(container_memory_working_set_bytes{namespace=~\"$namespace\"} OR windows_container_memory_usage_commit_bytes{namespace=~\"$namespace\"}) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Total", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Memory Utilization", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 16, + "y": 0 + }, + "hiddenSeries": false, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum((sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Receive Total", + "refId": "A" + }, + { + "expr": "sum((sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_packets_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "B" + }, + { + "expr": "sum((sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Receive Dropped", + "refId": "C" + }, + { + "expr": "sum((sum(irate(container_network_receive_errors_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Receive Errors", + "refId": "D" + }, + { + "expr": "sum((sum(irate(container_network_transmit_errors_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Transmit Errors", + "refId": "E" + }, + { + "expr": "sum((sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Transmit Dropped", + "refId": "F" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network Traffic", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 0, + "y": 7 + }, + "hiddenSeries": false, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum((sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_receive_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Receive Total", + "refId": "A" + }, + { + "expr": "sum((sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod) OR sum(irate(windows_container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Transmit Total", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Network I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + "{{pod}}": "#3797d5" + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 8, + "x": 8, + "y": 7 + }, + "hiddenSeries": false, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": false, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.5", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum((sum(rate(container_fs_writes_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Write", + "refId": "A" + }, + { + "expr": "sum((sum(rate(container_fs_reads_bytes_total{namespace=~\"$namespace\"}[$__rate_interval])) by (pod)) * on(pod) kube_pod_info{namespace=~\"$namespace\", created_by_kind=\"$kind\", created_by_name=\"$workload\"})", + "interval": "", + "legendFormat": "Read", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Disk I/O", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 1, + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": false, + "schemaVersion": 26, + "style": "dark", + "tags": [], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "query": "query_result(kube_pod_info{namespace!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*namespace=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "kind", + "query": "query_result(kube_pod_info{namespace=\"$namespace\", created_by_kind!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*created_by_kind=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "workload", + "query": "query_result(kube_pod_info{namespace=\"$namespace\", created_by_kind=\"$kind\", created_by_name!=\"\"} * on(pod) group_right(namespace, created_by_kind, created_by_name) count({__name__=~\"container_.*|windows_container_.*\", pod!=\"\"}) by (pod))", + "refresh": 2, + "regex": "/.*created_by_name=\"([^\"]*)\"/", + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "", + "title": "Rancher / Workload", + "uid": "rancher-workload-1", + "version": 8 +} diff --git a/charts/rancher-project-monitoring/0.4.2/files/upgrade/scripts/delete-workloads-with-old-labels.sh b/charts/rancher-project-monitoring/0.4.2/files/upgrade/scripts/delete-workloads-with-old-labels.sh new file mode 100644 index 00000000..89431e71 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/files/upgrade/scripts/delete-workloads-with-old-labels.sh @@ -0,0 +1,14 @@ +#!/bin/bash + +set -e +set -x + +# node-exporter +kubectl delete daemonset -l app=prometheus-node-exporter,release=rancher-monitoring --ignore-not-found=true + +# prometheus-adapter +kubectl delete deployments -l app=prometheus-adapter,release=rancher-monitoring --ignore-not-found=true + +# kube-state-metrics +kubectl delete deployments -l app.kubernetes.io/instance=rancher-monitoring,app.kubernetes.io/name=kube-state-metrics --cascade=orphan --ignore-not-found=true +kubectl delete statefulsets -l app.kubernetes.io/instance=rancher-monitoring,app.kubernetes.io/name=kube-state-metrics --cascade=orphan --ignore-not-found=true diff --git a/charts/rancher-project-monitoring/0.4.2/questions.yaml b/charts/rancher-project-monitoring/0.4.2/questions.yaml new file mode 100644 index 00000000..c15f36e7 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/questions.yaml @@ -0,0 +1,128 @@ +questions: +- variable: alertmanager.enabled + label: Enable Alertmanager + default: true + type: boolean + group: Alertmanager +- variable: grafana.enabled + label: Enable Grafana + default: true + type: boolean + group: Grafana + show_subquestion_if: true + subquestions: + - variable: grafana.adminUser + label: Grafana Admin User + type: string + default: admin + group: Grafana + - variable: grafana.adminPassword + label: Grafana Admin Password + type: string + default: prom-operator + group: Grafana + - variable: grafana.sidecar.dashboards.label + label: Default Grafana Dashboard Label + default: grafana_dashboard + description: All ConfigMaps with this label are parsed as Grafana Dashboards + type: string + group: Grafana +- variable: grafana.persistence.enabled + type: boolean + required: true + label: Enable Persistent Volume + show_subquestion_if: true + group: Grafana + subquestions: + - variable: grafana.persistence.size + type: string + default: 1Gi + label: Size + - variable: grafana.persistence.storageClass + type: storageclass + label: Storage Class Name + - variable: grafana.persistence.accessModes + type: enum + default: + - ReadWriteOnce + options: + - [ReadWriteOnce] + - [ReadWriteMany] + - [ReadOnlyMany] + label: Access Mode +- variable: prometheus.prometheusSpec.scrapeInterval + type: string + required: true + label: Scrape Interval + default: 30s + group: Prometheus +- variable: prometheus.prometheusSpec.evaluationInterval + type: string + required: true + label: Evaluation Interval + default: 1m + group: Prometheus +- variable: prometheus.prometheusSpec.retention + type: string + required: true + label: Retention + default: 10d + group: Prometheus +- variable: prometheus.prometheusSpec.retentionSize + type: string + required: false + label: Retention Size + default: 50GB + group: Prometheus +- variable: prometheus.prometheusSpec.resources.requests.cpu + type: string + required: true + label: Requested CPU + default: 750m + group: Prometheus +- variable: prometheus.prometheusSpec.resources.requests.memory + type: string + required: true + label: Requested Memory + default: 750Mi + group: Prometheus +- variable: prometheus.prometheusSpec.resources.limits.cpu + type: string + required: true + label: CPU Limit + default: 1000m + group: Prometheus +- variable: prometheus.prometheusSpec.resources.limits.memory + type: string + required: true + label: Memory Limit + default: 3000Mi + group: Prometheus +- variable: prometheus.prometheusSpec.storage.enabled + type: boolean + required: true + label: Enable Persistent Volume + show_subquestion_if: true + group: Prometheus + subquestions: + - variable: prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage + type: string + default: 50Gi + label: Size + - variable: prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName + type: storageclass + label: Storage Class Name + - variable: prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.accessModes + type: enum + default: + - ReadWriteOnce + options: + - [ReadWriteOnce] + - [ReadWriteMany] + - [ReadOnlyMany] + label: Access Mode +- variable: federate.enabled + label: Enable Federated Metrics From Cluster Prometheus + default: true + type: boolean + group: Federation diff --git a/charts/rancher-project-monitoring/0.4.2/templates/NOTES.txt b/charts/rancher-project-monitoring/0.4.2/templates/NOTES.txt new file mode 100644 index 00000000..9e4d6d87 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/NOTES.txt @@ -0,0 +1,4 @@ +{{ $.Chart.Name }} has been installed. Check its status by running: + kubectl --namespace {{ template "project-prometheus-stack.namespace" . }} get pods -l "release={{ $.Release.Name }}" + +Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator. diff --git a/charts/rancher-project-monitoring/0.4.2/templates/_helpers.tpl b/charts/rancher-project-monitoring/0.4.2/templates/_helpers.tpl new file mode 100644 index 00000000..5a71dec9 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/_helpers.tpl @@ -0,0 +1,458 @@ +# Rancher +{{- define "system_default_registry" -}} +{{- if .Values.global.cattle.systemDefaultRegistry -}} +{{- printf "%s/" .Values.global.cattle.systemDefaultRegistry -}} +{{- end -}} +{{- end -}} + +{{- define "monitoring_registry" -}} + {{- $temp_registry := (include "system_default_registry" .) -}} + {{- if $temp_registry -}} + {{- trimSuffix "/" $temp_registry -}} + {{- else -}} + {{- .Values.global.imageRegistry -}} + {{- end -}} +{{- end -}} + +{{/* +https://github.com/helm/helm/issues/4535#issuecomment-477778391 +Usage: {{ include "call-nested" (list . "SUBCHART_NAME" "TEMPLATE") }} +e.g. {{ include "call-nested" (list . "grafana" "grafana.fullname") }} +*/}} +{{- define "call-nested" }} +{{- $dot := index . 0 }} +{{- $subchart := index . 1 | splitList "." }} +{{- $template := index . 2 }} +{{- $values := $dot.Values }} +{{- range $subchart }} +{{- $values = index $values . }} +{{- end }} +{{- include $template (dict "Chart" (dict "Name" (last $subchart)) "Values" $values "Release" $dot.Release "Capabilities" $dot.Capabilities) }} +{{- end }} + +# Special Exporters +{{- define "exporter.kubeEtcd.enabled" -}} +{{- if or .Values.kubeEtcd.enabled .Values.rkeEtcd.enabled .Values.kubeAdmEtcd.enabled .Values.rke2Etcd.enabled -}} +"true" +{{- end -}} +{{- end }} + +{{- define "exporter.kubeControllerManager.enabled" -}} +{{- if or .Values.kubeControllerManager.enabled .Values.rkeControllerManager.enabled .Values.k3sServer.enabled .Values.kubeAdmControllerManager.enabled .Values.rke2ControllerManager.enabled -}} +"true" +{{- end -}} +{{- end }} + +{{- define "exporter.kubeScheduler.enabled" -}} +{{- if or .Values.kubeScheduler.enabled .Values.rkeScheduler.enabled .Values.k3sServer.enabled .Values.kubeAdmScheduler.enabled .Values.rke2Scheduler.enabled -}} +"true" +{{- end -}} +{{- end }} + +{{- define "exporter.kubeProxy.enabled" -}} +{{- if or .Values.kubeProxy.enabled .Values.rkeProxy.enabled .Values.k3sServer.enabled .Values.kubeAdmProxy.enabled .Values.rke2Proxy.enabled -}} +"true" +{{- end -}} +{{- end }} + +{{- define "exporter.kubelet.enabled" -}} +{{- if or .Values.kubelet.enabled .Values.hardenedKubelet.enabled .Values.k3sServer.enabled -}} +"true" +{{- end -}} +{{- end }} + +{{- define "exporter.kubeControllerManager.jobName" -}} +{{- if .Values.k3sServer.enabled -}} +k3s-server +{{- else -}} +kube-controller-manager +{{- end -}} +{{- end }} + +{{- define "exporter.kubeScheduler.jobName" -}} +{{- if .Values.k3sServer.enabled -}} +k3s-server +{{- else -}} +kube-scheduler +{{- end -}} +{{- end }} + +{{- define "exporter.kubeProxy.jobName" -}} +{{- if .Values.k3sServer.enabled -}} +k3s-server +{{- else -}} +kube-proxy +{{- end -}} +{{- end }} + +{{- define "exporter.kubelet.jobName" -}} +{{- if .Values.k3sServer.enabled -}} +k3s-server +{{- else -}} +kubelet +{{- end -}} +{{- end }} + +{{- define "kubelet.serviceMonitor.resourcePath" -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if not (eq .Values.kubelet.serviceMonitor.resourcePath "/metrics/resource/v1alpha1") -}} +{{ .Values.kubelet.serviceMonitor.resourcePath }} +{{- else if semverCompare ">=1.20.0-0" $kubeTargetVersion -}} +/metrics/resource +{{- else -}} +/metrics/resource/v1alpha1 +{{- end -}} +{{- end }} + +{{- define "rancher.serviceMonitor.selector" -}} +{{- if .Values.rancherMonitoring.selector }} +{{ .Values.rancherMonitoring.selector | toYaml }} +{{- else }} +{{- $rancherDeployment := (lookup "apps/v1" "Deployment" "cattle-system" "rancher") }} +{{- if $rancherDeployment }} +matchLabels: + app: rancher + chart: {{ index $rancherDeployment.metadata.labels "chart" }} + release: rancher +{{- end }} +{{- end }} +{{- end }} + +# Windows Support + +{{/* +Windows cluster will add default taint for linux nodes, +add below linux tolerations to workloads could be scheduled to those linux nodes +*/}} + +{{- define "linux-node-tolerations" -}} +- key: "cattle.io/os" + value: "linux" + effect: "NoSchedule" + operator: "Equal" +{{- end -}} + +{{- define "linux-node-selector" -}} +{{- if semverCompare "<1.14-0" .Capabilities.KubeVersion.GitVersion -}} +beta.kubernetes.io/os: linux +{{- else -}} +kubernetes.io/os: linux +{{- end -}} +{{- end -}} + +# Prometheus Operator + +{{/* Comma-delimited list of namespaces that need to be watched to configure Project Prometheus Stack components */}} +{{- define "project-prometheus-stack.projectNamespaceList" -}} +{{ append .Values.global.cattle.projectNamespaces .Release.Namespace | uniq | join "," }} +{{- end }} + +{{/* vim: set filetype=mustache: */}} +{{/* Expand the name of the chart. This is suffixed with -alertmanager, which means subtract 13 from longest 63 available */}} +{{- define "project-prometheus-stack.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 50 | trimSuffix "-" -}} +{{- end }} + +{{/* +Create a default fully qualified app name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +If release name contains chart name it will be used as a full name. +The components in this chart create additional resources that expand the longest created name strings. +The longest name that gets created adds and extra 37 characters, so truncation should be 63-35=26. +*/}} +{{- define "project-prometheus-stack.fullname" -}} +{{- if .Values.fullnameOverride -}} +{{- .Values.fullnameOverride | trunc 26 | trimSuffix "-" -}} +{{- else -}} +{{- $name := default .Chart.Name .Values.nameOverride -}} +{{- if contains $name .Release.Name -}} +{{- .Release.Name | trunc 26 | trimSuffix "-" -}} +{{- else -}} +{{- printf "%s-%s" .Release.Name $name | trunc 26 | trimSuffix "-" -}} +{{- end -}} +{{- end -}} +{{- end -}} + +{{/* Fullname suffixed with operator */}} +{{- define "project-prometheus-stack.operator.fullname" -}} +{{- printf "%s-operator" (include "project-prometheus-stack.fullname" .) -}} +{{- end }} + +{{/* Prometheus custom resource instance name */}} +{{- define "project-prometheus-stack.prometheus.crname" -}} +{{- if .Values.cleanPrometheusOperatorObjectNames }} +{{- include "project-prometheus-stack.fullname" . }} +{{- else }} +{{- print (include "project-prometheus-stack.fullname" .) "-prometheus" }} +{{- end }} +{{- end }} + +{{/* Prometheus apiVersion for networkpolicy */}} +{{- define "project-prometheus-stack.prometheus.networkPolicy.apiVersion" -}} +{{- print "networking.k8s.io/v1" -}} +{{- end }} + +{{/* Alertmanager custom resource instance name */}} +{{- define "project-prometheus-stack.alertmanager.crname" -}} +{{- if .Values.cleanPrometheusOperatorObjectNames }} +{{- include "project-prometheus-stack.fullname" . }} +{{- else }} +{{- print (include "project-prometheus-stack.fullname" .) "-alertmanager" -}} +{{- end }} +{{- end }} + +{{/* Fullname suffixed with thanos-ruler */}} +{{- define "project-prometheus-stack.thanosRuler.fullname" -}} +{{- printf "%s-thanos-ruler" (include "project-prometheus-stack.fullname" .) -}} +{{- end }} + +{{/* Shortened name suffixed with thanos-ruler */}} +{{- define "project-prometheus-stack.thanosRuler.name" -}} +{{- default (printf "%s-thanos-ruler" (include "project-prometheus-stack.name" .)) .Values.thanosRuler.name -}} +{{- end }} + + +{{/* Create chart name and version as used by the chart label. */}} +{{- define "project-prometheus-stack.chartref" -}} +{{- replace "+" "_" .Chart.Version | printf "%s-%s" .Chart.Name -}} +{{- end }} + +{{/* Generate basic labels */}} +{{- define "project-prometheus-stack.labels" }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +app.kubernetes.io/instance: {{ .Release.Name }} +app.kubernetes.io/version: "{{ replace "+" "_" .Chart.Version }}" +app.kubernetes.io/part-of: {{ template "project-prometheus-stack.name" . }} +chart: {{ template "project-prometheus-stack.chartref" . }} +release: {{ $.Release.Name | quote }} +heritage: {{ $.Release.Service | quote }} +{{- if .Values.commonLabels}} +{{ toYaml .Values.commonLabels }} +{{- end }} +{{- end }} + +{{/* Create the name of project-prometheus-stack service account to use */}} +{{- define "project-prometheus-stack.operator.serviceAccountName" -}} +{{- if .Values.prometheusOperator.serviceAccount.create -}} + {{ default (include "project-prometheus-stack.operator.fullname" .) .Values.prometheusOperator.serviceAccount.name }} +{{- else -}} + {{ default "default" .Values.prometheusOperator.serviceAccount.name }} +{{- end -}} +{{- end -}} + +{{/* Create the name of prometheus service account to use */}} +{{- define "project-prometheus-stack.prometheus.serviceAccountName" -}} +{{- if .Values.prometheus.serviceAccount.create -}} + {{ default (print (include "project-prometheus-stack.fullname" .) "-prometheus") .Values.prometheus.serviceAccount.name }} +{{- else -}} + {{ default "default" .Values.prometheus.serviceAccount.name }} +{{- end -}} +{{- end -}} + +{{/* Create the name of alertmanager service account to use */}} +{{- define "project-prometheus-stack.alertmanager.serviceAccountName" -}} +{{- if .Values.alertmanager.serviceAccount.create -}} + {{ default (print (include "project-prometheus-stack.fullname" .) "-alertmanager") .Values.alertmanager.serviceAccount.name }} +{{- else -}} + {{ default "default" .Values.alertmanager.serviceAccount.name }} +{{- end -}} +{{- end -}} + +{{/* Create the name of thanosRuler service account to use */}} +{{- define "project-prometheus-stack.thanosRuler.serviceAccountName" -}} +{{- if .Values.thanosRuler.serviceAccount.create -}} + {{ default (include "project-prometheus-stack.thanosRuler.name" .) .Values.thanosRuler.serviceAccount.name }} +{{- else -}} + {{ default "default" .Values.thanosRuler.serviceAccount.name }} +{{- end -}} +{{- end -}} + +{{/* +Allow the release namespace to be overridden for multi-namespace deployments in combined charts +*/}} +{{- define "project-prometheus-stack.namespace" -}} + {{- if .Values.namespaceOverride -}} + {{- .Values.namespaceOverride -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* +Use the grafana namespace override for multi-namespace deployments in combined charts +*/}} +{{- define "project-prometheus-stack-grafana.namespace" -}} + {{- if .Values.grafana.namespaceOverride -}} + {{- .Values.grafana.namespaceOverride -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* +Use the kube-state-metrics namespace override for multi-namespace deployments in combined charts +*/}} +{{- define "project-prometheus-stack-kube-state-metrics.namespace" -}} + {{- if index .Values "kube-state-metrics" "namespaceOverride" -}} + {{- index .Values "kube-state-metrics" "namespaceOverride" -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* +Use the prometheus-node-exporter namespace override for multi-namespace deployments in combined charts +*/}} +{{- define "project-prometheus-stack-prometheus-node-exporter.namespace" -}} + {{- if index .Values "prometheus-node-exporter" "namespaceOverride" -}} + {{- index .Values "prometheus-node-exporter" "namespaceOverride" -}} + {{- else -}} + {{- .Release.Namespace -}} + {{- end -}} +{{- end -}} + +{{/* Allow KubeVersion to be overridden. */}} +{{- define "project-prometheus-stack.kubeVersion" -}} + {{- default .Capabilities.KubeVersion.Version .Values.kubeVersionOverride -}} +{{- end -}} + +{{/* Get Ingress API Version */}} +{{- define "project-prometheus-stack.ingress.apiVersion" -}} + {{- if and (.Capabilities.APIVersions.Has "networking.k8s.io/v1") (semverCompare ">= 1.19-0" (include "project-prometheus-stack.kubeVersion" .)) -}} + {{- print "networking.k8s.io/v1" -}} + {{- else if .Capabilities.APIVersions.Has "networking.k8s.io/v1beta1" -}} + {{- print "networking.k8s.io/v1beta1" -}} + {{- else -}} + {{- print "extensions/v1beta1" -}} + {{- end -}} +{{- end -}} + +{{/* Check Ingress stability */}} +{{- define "project-prometheus-stack.ingress.isStable" -}} + {{- eq (include "project-prometheus-stack.ingress.apiVersion" .) "networking.k8s.io/v1" -}} +{{- end -}} + +{{/* Check Ingress supports pathType */}} +{{/* pathType was added to networking.k8s.io/v1beta1 in Kubernetes 1.18 */}} +{{- define "project-prometheus-stack.ingress.supportsPathType" -}} + {{- or (eq (include "project-prometheus-stack.ingress.isStable" .) "true") (and (eq (include "project-prometheus-stack.ingress.apiVersion" .) "networking.k8s.io/v1beta1") (semverCompare ">= 1.18-0" (include "project-prometheus-stack.kubeVersion" .))) -}} +{{- end -}} + +{{/* Get Policy API Version */}} +{{- define "project-prometheus-stack.pdb.apiVersion" -}} + {{- if and (.Capabilities.APIVersions.Has "policy/v1") (semverCompare ">= 1.21-0" (include "project-prometheus-stack.kubeVersion" .)) -}} + {{- print "policy/v1" -}} + {{- else -}} + {{- print "policy/v1beta1" -}} + {{- end -}} + {{- end -}} + +{{/* Get value based on current Kubernetes version */}} +{{- define "project-prometheus-stack.kubeVersionDefaultValue" -}} + {{- $values := index . 0 -}} + {{- $kubeVersion := index . 1 -}} + {{- $old := index . 2 -}} + {{- $new := index . 3 -}} + {{- $default := index . 4 -}} + {{- if kindIs "invalid" $default -}} + {{- if semverCompare $kubeVersion (include "project-prometheus-stack.kubeVersion" $values) -}} + {{- print $new -}} + {{- else -}} + {{- print $old -}} + {{- end -}} + {{- else -}} + {{- print $default }} + {{- end -}} +{{- end -}} + +{{/* Get value for kube-controller-manager depending on insecure scraping availability */}} +{{- define "project-prometheus-stack.kubeControllerManager.insecureScrape" -}} + {{- $values := index . 0 -}} + {{- $insecure := index . 1 -}} + {{- $secure := index . 2 -}} + {{- $userValue := index . 3 -}} + {{- include "project-prometheus-stack.kubeVersionDefaultValue" (list $values ">= 1.22-0" $insecure $secure $userValue) -}} +{{- end -}} + +{{/* Get value for kube-scheduler depending on insecure scraping availability */}} +{{- define "project-prometheus-stack.kubeScheduler.insecureScrape" -}} + {{- $values := index . 0 -}} + {{- $insecure := index . 1 -}} + {{- $secure := index . 2 -}} + {{- $userValue := index . 3 -}} + {{- include "project-prometheus-stack.kubeVersionDefaultValue" (list $values ">= 1.23-0" $insecure $secure $userValue) -}} +{{- end -}} + +{{/* Sets default scrape limits for servicemonitor */}} +{{- define "servicemonitor.scrapeLimits" -}} +{{- with .sampleLimit }} +sampleLimit: {{ . }} +{{- end }} +{{- with .targetLimit }} +targetLimit: {{ . }} +{{- end }} +{{- with .labelLimit }} +labelLimit: {{ . }} +{{- end }} +{{- with .labelNameLengthLimit }} +labelNameLengthLimit: {{ . }} +{{- end }} +{{- with .labelValueLengthLimit }} +labelValueLengthLimit: {{ . }} +{{- end }} +{{- end -}} + +{{/* +To help compatibility with other charts which use global.imagePullSecrets. +Allow either an array of {name: pullSecret} maps (k8s-style), or an array of strings (more common helm-style). +global: + imagePullSecrets: + - name: pullSecret1 + - name: pullSecret2 + +or + +global: + imagePullSecrets: + - pullSecret1 + - pullSecret2 +*/}} +{{- define "project-prometheus-stack.imagePullSecrets" -}} +{{- range .Values.global.imagePullSecrets }} + {{- if eq (typeOf .) "map[string]interface {}" }} +- {{ toYaml . | trim }} + {{- else }} +- name: {{ . }} + {{- end }} +{{- end }} +{{- end -}} + + +{{/* Define ingress for all hardened namespaces */}} +{{- define "project-prometheus-stack.hardened.networkPolicy.ingress" -}} +{{- $root := index . 0 }} +{{- $ns := index . 1 }} +{{- if $root.Values.global.networkPolicy.ingress -}} +{{ toYaml $root.Values.global.networkPolicy.ingress }} +{{- end }} +{{- if $root.Values.global.networkPolicy.limitIngressToProject }} +- from: +{{- if $root.Values.global.cattle.projectNamespaceSelector }} + - namespaceSelector: {{- $root.Values.global.cattle.projectNamespaceSelector | toYaml | nindent 6 }} +{{- end }} + - namespaceSelector: + matchLabels: + kubernetes.io/metadata.name: {{ $ns }} +{{- end }} +{{- end -}} + +{{- define "project-prometheus-stack.operator.admission-webhook.dnsNames" }} +{{- $fullname := include "project-prometheus-stack.operator.fullname" . }} +{{- $namespace := include "project-prometheus-stack.namespace" . }} +{{- $fullname }} +{{ $fullname }}.{{ $namespace }}.svc +{{- if .Values.prometheusOperator.admissionWebhooks.deployment.enabled }} +{{ $fullname }}-webhook +{{ $fullname }}-webhook.{{ $namespace }}.svc +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/alertmanager.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/alertmanager.yaml new file mode 100644 index 00000000..ddd5f508 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/alertmanager.yaml @@ -0,0 +1,186 @@ +{{- if .Values.alertmanager.enabled }} +apiVersion: monitoring.coreos.com/v1 +kind: Alertmanager +metadata: + name: {{ template "project-prometheus-stack.alertmanager.crname" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.alertmanager.annotations }} + annotations: +{{ toYaml .Values.alertmanager.annotations | indent 4 }} +{{- end }} +spec: +{{- if .Values.alertmanager.alertmanagerSpec.image }} + {{- $registry := include "monitoring_registry" . | default .Values.alertmanager.alertmanagerSpec.image.registry }} + {{- if and .Values.alertmanager.alertmanagerSpec.image.tag .Values.alertmanager.alertmanagerSpec.image.sha }} + image: "{{ $registry }}/{{ .Values.alertmanager.alertmanagerSpec.image.repository }}:{{ .Values.alertmanager.alertmanagerSpec.image.tag }}@sha256:{{ .Values.alertmanager.alertmanagerSpec.image.sha }}" + {{- else if .Values.alertmanager.alertmanagerSpec.image.sha }} + image: "{{ $registry }}/{{ .Values.alertmanager.alertmanagerSpec.image.repository }}@sha256:{{ .Values.alertmanager.alertmanagerSpec.image.sha }}" + {{- else if .Values.alertmanager.alertmanagerSpec.image.tag }} + image: "{{ $registry }}/{{ .Values.alertmanager.alertmanagerSpec.image.repository }}:{{ .Values.alertmanager.alertmanagerSpec.image.tag }}" + {{- else }} + image: "{{ $registry }}/{{ .Values.alertmanager.alertmanagerSpec.image.repository }}" + {{- end }} + version: {{ .Values.alertmanager.alertmanagerSpec.image.tag }} + {{- if .Values.alertmanager.alertmanagerSpec.image.sha }} + sha: {{ .Values.alertmanager.alertmanagerSpec.image.sha }} + {{- end }} +{{- end }} + replicas: {{ .Values.alertmanager.alertmanagerSpec.replicas }} + listenLocal: {{ .Values.alertmanager.alertmanagerSpec.listenLocal }} + serviceAccountName: {{ template "project-prometheus-stack.alertmanager.serviceAccountName" . }} + automountServiceAccountToken: {{ .Values.alertmanager.alertmanagerSpec.automountServiceAccountToken }} +{{- if .Values.alertmanager.alertmanagerSpec.externalUrl }} + externalUrl: "{{ tpl .Values.alertmanager.alertmanagerSpec.externalUrl . }}" +{{- else if and .Values.alertmanager.ingress.enabled .Values.alertmanager.ingress.hosts }} + externalUrl: "http://{{ tpl (index .Values.alertmanager.ingress.hosts 0) . }}{{ .Values.alertmanager.alertmanagerSpec.routePrefix }}" +{{- else if not (or (kindIs "invalid" .Values.global.cattle.url) (kindIs "invalid" .Values.global.cattle.clusterId)) }} + externalUrl: "{{ .Values.global.cattle.url }}/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ .Values.namespaceOverride }}/services/http:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}/proxy" +{{- else }} + externalUrl: http://{{ template "project-prometheus-stack.fullname" . }}-alertmanager.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.alertmanager.service.port }} +{{- end }} + nodeSelector: {{ include "linux-node-selector" . | nindent 4 }} +{{- if .Values.alertmanager.alertmanagerSpec.nodeSelector }} +{{ toYaml .Values.alertmanager.alertmanagerSpec.nodeSelector | indent 4 }} +{{- end }} + paused: {{ .Values.alertmanager.alertmanagerSpec.paused }} + logFormat: {{ .Values.alertmanager.alertmanagerSpec.logFormat | quote }} + logLevel: {{ .Values.alertmanager.alertmanagerSpec.logLevel | quote }} + retention: {{ .Values.alertmanager.alertmanagerSpec.retention | quote }} +{{- if .Values.alertmanager.alertmanagerSpec.secrets }} + secrets: +{{ toYaml .Values.alertmanager.alertmanagerSpec.secrets | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.configSecret }} + configSecret: {{ .Values.alertmanager.alertmanagerSpec.configSecret }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.configMaps }} + configMaps: +{{ toYaml .Values.alertmanager.alertmanagerSpec.configMaps | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.alertmanagerConfigSelector }} + alertmanagerConfigSelector: +{{ toYaml .Values.alertmanager.alertmanagerSpec.alertmanagerConfigSelector | indent 4}} +{{ else }} + alertmanagerConfigSelector: {} +{{- end }} + alertmanagerConfigNamespaceSelector: {{ .Values.global.cattle.projectNamespaceSelector | toYaml | nindent 4 }} +{{- if .Values.alertmanager.alertmanagerSpec.web }} + web: +{{ toYaml .Values.alertmanager.alertmanagerSpec.web | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.alertmanagerConfiguration }} + alertmanagerConfiguration: +{{ toYaml .Values.alertmanager.alertmanagerSpec.alertmanagerConfiguration | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.alertmanagerConfigMatcherStrategy }} + alertmanagerConfigMatcherStrategy: +{{ toYaml .Values.alertmanager.alertmanagerSpec.alertmanagerConfigMatcherStrategy | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.resources }} + resources: +{{ toYaml .Values.alertmanager.alertmanagerSpec.resources | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.routePrefix }} + routePrefix: "{{ .Values.alertmanager.alertmanagerSpec.routePrefix }}" +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.securityContext }} + securityContext: +{{ toYaml .Values.alertmanager.alertmanagerSpec.securityContext | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.storage }} + storage: +{{ tpl (toYaml .Values.alertmanager.alertmanagerSpec.storage | indent 4) . }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.podMetadata }} + podMetadata: +{{ toYaml .Values.alertmanager.alertmanagerSpec.podMetadata | indent 4 }} +{{- end }} +{{- if or .Values.alertmanager.alertmanagerSpec.podAntiAffinity .Values.alertmanager.alertmanagerSpec.affinity }} + affinity: +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.affinity }} +{{ toYaml .Values.alertmanager.alertmanagerSpec.affinity | indent 4 }} +{{- end }} +{{- if eq .Values.alertmanager.alertmanagerSpec.podAntiAffinity "hard" }} + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - topologyKey: {{ .Values.alertmanager.alertmanagerSpec.podAntiAffinityTopologyKey }} + labelSelector: + matchExpressions: + - {key: app.kubernetes.io/name, operator: In, values: [alertmanager]} + - {key: alertmanager, operator: In, values: [{{ template "project-prometheus-stack.alertmanager.crname" . }}]} +{{- else if eq .Values.alertmanager.alertmanagerSpec.podAntiAffinity "soft" }} + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + podAffinityTerm: + topologyKey: {{ .Values.alertmanager.alertmanagerSpec.podAntiAffinityTopologyKey }} + labelSelector: + matchExpressions: + - {key: app.kubernetes.io/name, operator: In, values: [alertmanager]} + - {key: alertmanager, operator: In, values: [{{ template "project-prometheus-stack.alertmanager.crname" . }}]} +{{- end }} + tolerations: {{ include "linux-node-tolerations" . | nindent 4 }} +{{- if .Values.alertmanager.alertmanagerSpec.tolerations }} +{{ toYaml .Values.alertmanager.alertmanagerSpec.tolerations | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.topologySpreadConstraints }} + topologySpreadConstraints: +{{ toYaml .Values.alertmanager.alertmanagerSpec.topologySpreadConstraints | indent 4 }} +{{- end }} +{{- if .Values.global.imagePullSecrets }} + imagePullSecrets: +{{ include "project-prometheus-stack.imagePullSecrets" . | trim | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.containers }} + containers: +{{ toYaml .Values.alertmanager.alertmanagerSpec.containers | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.initContainers }} + initContainers: +{{ toYaml .Values.alertmanager.alertmanagerSpec.initContainers | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.priorityClassName }} + priorityClassName: {{.Values.alertmanager.alertmanagerSpec.priorityClassName }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.additionalPeers }} + additionalPeers: +{{ toYaml .Values.alertmanager.alertmanagerSpec.additionalPeers | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.volumes }} + volumes: +{{ toYaml .Values.alertmanager.alertmanagerSpec.volumes | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.volumeMounts }} + volumeMounts: +{{ toYaml .Values.alertmanager.alertmanagerSpec.volumeMounts | indent 4 }} +{{- end }} + portName: {{ .Values.alertmanager.alertmanagerSpec.portName }} +{{- if .Values.alertmanager.alertmanagerSpec.clusterAdvertiseAddress }} + clusterAdvertiseAddress: {{ .Values.alertmanager.alertmanagerSpec.clusterAdvertiseAddress }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.clusterGossipInterval }} + clusterGossipInterval: {{ .Values.alertmanager.alertmanagerSpec.clusterGossipInterval }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.clusterPeerTimeout }} + clusterPeerTimeout: {{ .Values.alertmanager.alertmanagerSpec.clusterPeerTimeout }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.clusterPushpullInterval }} + clusterPushpullInterval: {{ .Values.alertmanager.alertmanagerSpec.clusterPushpullInterval }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.forceEnableClusterMode }} + forceEnableClusterMode: {{ .Values.alertmanager.alertmanagerSpec.forceEnableClusterMode }} +{{- end }} +{{- if .Values.alertmanager.alertmanagerSpec.minReadySeconds }} + minReadySeconds: {{ .Values.alertmanager.alertmanagerSpec.minReadySeconds }} +{{- end }} +{{- with .Values.alertmanager.alertmanagerSpec.additionalConfig }} + {{- tpl (toYaml .) $ | nindent 2 }} +{{- end }} +{{- with .Values.alertmanager.alertmanagerSpec.additionalConfigString }} + {{- tpl . $ | nindent 2 }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/extrasecret.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/extrasecret.yaml new file mode 100644 index 00000000..af469be7 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/extrasecret.yaml @@ -0,0 +1,20 @@ +{{- if .Values.alertmanager.extraSecret.data -}} +{{- $secretName := printf "alertmanager-%s-extra" (include "project-prometheus-stack.fullname" . ) -}} +apiVersion: v1 +kind: Secret +metadata: + name: {{ default $secretName .Values.alertmanager.extraSecret.name }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- if .Values.alertmanager.extraSecret.annotations }} + annotations: +{{ toYaml .Values.alertmanager.extraSecret.annotations | indent 4 }} +{{- end }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager + app.kubernetes.io/component: alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +data: +{{- range $key, $val := .Values.alertmanager.extraSecret.data }} + {{ $key }}: {{ $val | b64enc | quote }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/ingress.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/ingress.yaml new file mode 100644 index 00000000..caed37c6 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/ingress.yaml @@ -0,0 +1,78 @@ +{{- if and .Values.alertmanager.enabled .Values.alertmanager.ingress.enabled }} +{{- $pathType := .Values.alertmanager.ingress.pathType | default "ImplementationSpecific" }} +{{- $serviceName := printf "%s-%s" (include "project-prometheus-stack.fullname" .) "alertmanager" }} +{{- $backendServiceName := .Values.alertmanager.ingress.serviceName | default (printf "%s-%s" (include "project-prometheus-stack.fullname" .) "alertmanager") }} +{{- $servicePort := .Values.alertmanager.ingress.servicePort | default .Values.alertmanager.service.port -}} +{{- $routePrefix := list .Values.alertmanager.alertmanagerSpec.routePrefix }} +{{- $paths := .Values.alertmanager.ingress.paths | default $routePrefix -}} +{{- $apiIsStable := eq (include "project-prometheus-stack.ingress.isStable" .) "true" -}} +{{- $ingressSupportsPathType := eq (include "project-prometheus-stack.ingress.supportsPathType" .) "true" -}} +apiVersion: {{ include "project-prometheus-stack.ingress.apiVersion" . }} +kind: Ingress +metadata: + name: {{ $serviceName }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- if .Values.alertmanager.ingress.annotations }} + annotations: +{{ toYaml .Values.alertmanager.ingress.annotations | indent 4 }} +{{- end }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{- if .Values.alertmanager.ingress.labels }} +{{ toYaml .Values.alertmanager.ingress.labels | indent 4 }} +{{- end }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +spec: + {{- if $apiIsStable }} + {{- if .Values.alertmanager.ingress.ingressClassName }} + ingressClassName: {{ .Values.alertmanager.ingress.ingressClassName }} + {{- end }} + {{- end }} + rules: + {{- if .Values.alertmanager.ingress.hosts }} + {{- range $host := .Values.alertmanager.ingress.hosts }} + - host: {{ tpl $host $ | quote }} + http: + paths: + {{- range $p := $paths }} + - path: {{ tpl $p $ }} + {{- if and $pathType $ingressSupportsPathType }} + pathType: {{ $pathType }} + {{- end }} + backend: + {{- if $apiIsStable }} + service: + name: {{ $backendServiceName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $backendServiceName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- end -}} + {{- end -}} + {{- else }} + - http: + paths: + {{- range $p := $paths }} + - path: {{ tpl $p $ }} + {{- if and $pathType $ingressSupportsPathType }} + pathType: {{ $pathType }} + {{- end }} + backend: + {{- if $apiIsStable }} + service: + name: {{ $backendServiceName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $backendServiceName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- end -}} + {{- end -}} + {{- if .Values.alertmanager.ingress.tls }} + tls: +{{ tpl (toYaml .Values.alertmanager.ingress.tls | indent 4) . }} + {{- end -}} +{{- end -}} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/podDisruptionBudget.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/podDisruptionBudget.yaml new file mode 100644 index 00000000..a8d7feaf --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/podDisruptionBudget.yaml @@ -0,0 +1,21 @@ +{{- if and .Values.alertmanager.enabled .Values.alertmanager.podDisruptionBudget.enabled }} +apiVersion: {{ include "project-prometheus-stack.pdb.apiVersion" . }} +kind: PodDisruptionBudget +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +spec: + {{- if .Values.alertmanager.podDisruptionBudget.minAvailable }} + minAvailable: {{ .Values.alertmanager.podDisruptionBudget.minAvailable }} + {{- end }} + {{- if .Values.alertmanager.podDisruptionBudget.maxUnavailable }} + maxUnavailable: {{ .Values.alertmanager.podDisruptionBudget.maxUnavailable }} + {{- end }} + selector: + matchLabels: + app.kubernetes.io/name: alertmanager + alertmanager: {{ template "project-prometheus-stack.alertmanager.crname" . }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-role.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-role.yaml new file mode 100644 index 00000000..d464167e --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-role.yaml @@ -0,0 +1,23 @@ +{{- if and .Values.alertmanager.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +kind: Role +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +rules: +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if semverCompare "> 1.15.0-0" $kubeTargetVersion }} +- apiGroups: ['policy'] +{{- else }} +- apiGroups: ['extensions'] +{{- end }} + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: + - {{ template "project-prometheus-stack.fullname" . }}-alertmanager +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-rolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-rolebinding.yaml new file mode 100644 index 00000000..5d2a6f1a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp-rolebinding.yaml @@ -0,0 +1,20 @@ +{{- if and .Values.alertmanager.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: RoleBinding +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: Role + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager +subjects: + - kind: ServiceAccount + name: {{ template "project-prometheus-stack.alertmanager.serviceAccountName" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp.yaml new file mode 100644 index 00000000..8dd2c70d --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/psp.yaml @@ -0,0 +1,47 @@ +{{- if and .Values.alertmanager.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{- if .Values.global.rbac.pspAnnotations }} + annotations: +{{ toYaml .Values.global.rbac.pspAnnotations | indent 4 }} +{{- end }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +spec: + privileged: false + # Allow core volume types. + volumes: + - 'configMap' + - 'emptyDir' + - 'projected' + - 'secret' + - 'downwardAPI' + - 'persistentVolumeClaim' + hostNetwork: false + hostIPC: false + hostPID: false + runAsUser: + # Permits the container to run with root privileges as well. + rule: 'RunAsAny' + seLinux: + # This policy assumes the nodes are using AppArmor rather than SELinux. + rule: 'RunAsAny' + supplementalGroups: + rule: 'MustRunAs' + ranges: + # Allow adding the root group. + - min: 0 + max: 65535 + fsGroup: + rule: 'MustRunAs' + ranges: + # Allow adding the root group. + - min: 0 + max: 65535 + readOnlyRootFilesystem: false +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/secret.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/secret.yaml new file mode 100644 index 00000000..b7393645 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/secret.yaml @@ -0,0 +1,35 @@ +{{- if and (.Values.alertmanager.enabled) (not .Values.alertmanager.alertmanagerSpec.useExistingSecret) }} +{{/* This file is applied when the operation is helm install and the target secret does not exist. */}} +{{- $secretName := (printf "alertmanager-%s" (include "project-prometheus-stack.alertmanager.crname" .)) }} +{{- if (not (lookup "v1" "Secret" (include "project-prometheus-stack.namespace" .) $secretName)) }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ $secretName }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + annotations: + "helm.sh/hook": pre-install, pre-upgrade + "helm.sh/hook-weight": "3" + "helm.sh/resource-policy": keep +{{- if .Values.alertmanager.secret.annotations }} +{{ toYaml .Values.alertmanager.secret.annotations | indent 4 }} +{{- end }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +data: +{{- if .Values.alertmanager.tplConfig }} +{{- if .Values.alertmanager.stringConfig }} + alertmanager.yaml: {{ tpl (.Values.alertmanager.stringConfig) . | b64enc | quote }} +{{- else if eq (typeOf .Values.alertmanager.config) "string" }} + alertmanager.yaml: {{ tpl (.Values.alertmanager.config) . | b64enc | quote }} +{{- else }} + alertmanager.yaml: {{ tpl (toYaml .Values.alertmanager.config) . | b64enc | quote }} +{{- end }} +{{- else }} + alertmanager.yaml: {{ toYaml .Values.alertmanager.config | b64enc | quote }} +{{- end }} +{{- range $key, $val := .Values.alertmanager.templateFiles }} + {{ $key }}: {{ $val | b64enc | quote }} +{{- end }} +{{- end }}{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/service.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/service.yaml new file mode 100644 index 00000000..e75f3d99 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/service.yaml @@ -0,0 +1,68 @@ +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if .Values.alertmanager.enabled }} +apiVersion: v1 +kind: Service +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager + self-monitor: {{ .Values.alertmanager.serviceMonitor.selfMonitor | quote }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.alertmanager.service.labels }} +{{ toYaml .Values.alertmanager.service.labels | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.service.annotations }} + annotations: +{{ toYaml .Values.alertmanager.service.annotations | indent 4 }} +{{- end }} +spec: +{{- if .Values.alertmanager.service.clusterIP }} + clusterIP: {{ .Values.alertmanager.service.clusterIP }} +{{- end }} +{{- if .Values.alertmanager.service.externalIPs }} + externalIPs: +{{ toYaml .Values.alertmanager.service.externalIPs | indent 4 }} +{{- end }} +{{- if .Values.alertmanager.service.loadBalancerIP }} + loadBalancerIP: {{ .Values.alertmanager.service.loadBalancerIP }} +{{- end }} +{{- if .Values.alertmanager.service.loadBalancerSourceRanges }} + loadBalancerSourceRanges: + {{- range $cidr := .Values.alertmanager.service.loadBalancerSourceRanges }} + - {{ $cidr }} + {{- end }} +{{- end }} +{{- if ne .Values.alertmanager.service.type "ClusterIP" }} + externalTrafficPolicy: {{ .Values.alertmanager.service.externalTrafficPolicy }} +{{- end }} + ports: + - name: {{ .Values.alertmanager.alertmanagerSpec.portName }} + {{- if eq .Values.alertmanager.service.type "NodePort" }} + nodePort: {{ .Values.alertmanager.service.nodePort }} + {{- end }} + port: {{ .Values.alertmanager.service.port }} + targetPort: {{ .Values.alertmanager.service.targetPort }} + protocol: TCP + - name: reloader-web + {{- if semverCompare ">=1.20.0-0" $kubeTargetVersion }} + appProtocol: http + {{- end }} + port: 8080 + targetPort: reloader-web +{{- if .Values.alertmanager.service.additionalPorts }} +{{ toYaml .Values.alertmanager.service.additionalPorts | indent 2 }} +{{- end }} + selector: + app.kubernetes.io/name: alertmanager + alertmanager: {{ template "project-prometheus-stack.alertmanager.crname" . }} +{{- if .Values.alertmanager.service.sessionAffinity }} + sessionAffinity: {{ .Values.alertmanager.service.sessionAffinity }} +{{- end }} +{{- if eq .Values.alertmanager.service.sessionAffinity "ClientIP" }} + sessionAffinityConfig: + clientIP: + timeoutSeconds: {{ .Values.alertmanager.service.sessionAffinityConfig.clientIP.timeoutSeconds }} +{{- end }} + type: "{{ .Values.alertmanager.service.type }}" +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/serviceaccount.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/serviceaccount.yaml new file mode 100644 index 00000000..ffde9bff --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/serviceaccount.yaml @@ -0,0 +1,21 @@ +{{- if and .Values.alertmanager.enabled .Values.alertmanager.serviceAccount.create }} +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ template "project-prometheus-stack.alertmanager.serviceAccountName" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager + app.kubernetes.io/name: {{ template "project-prometheus-stack.name" . }}-alertmanager + app.kubernetes.io/component: alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.alertmanager.serviceAccount.annotations }} + annotations: +{{ toYaml .Values.alertmanager.serviceAccount.annotations | indent 4 }} +{{- end }} +automountServiceAccountToken: {{ .Values.alertmanager.serviceAccount.automountServiceAccountToken }} +{{- if .Values.global.imagePullSecrets }} +imagePullSecrets: +{{ include "project-prometheus-stack.imagePullSecrets" . | trim | indent 2}} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/servicemonitor.yaml b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/servicemonitor.yaml new file mode 100644 index 00000000..95b97ea4 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/alertmanager/servicemonitor.yaml @@ -0,0 +1,84 @@ +{{- if and .Values.alertmanager.enabled .Values.alertmanager.serviceMonitor.selfMonitor }} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- with .Values.alertmanager.serviceMonitor.additionalLabels }} +{{- toYaml . | nindent 4 }} +{{- end }} +spec: + {{- include "servicemonitor.scrapeLimits" .Values.alertmanager.serviceMonitor | nindent 2 }} + selector: + matchLabels: + app: {{ template "project-prometheus-stack.name" . }}-alertmanager + release: {{ $.Release.Name | quote }} + self-monitor: "true" + namespaceSelector: + matchNames: + - {{ printf "%s" (include "project-prometheus-stack.namespace" .) | quote }} + endpoints: + - port: {{ .Values.alertmanager.alertmanagerSpec.portName }} + enableHttp2: {{ .Values.alertmanager.serviceMonitor.enableHttp2 }} + {{- if .Values.alertmanager.serviceMonitor.interval }} + interval: {{ .Values.alertmanager.serviceMonitor.interval }} + {{- end }} + {{- if .Values.alertmanager.serviceMonitor.proxyUrl }} + proxyUrl: {{ .Values.alertmanager.serviceMonitor.proxyUrl }} + {{- end }} + {{- if .Values.alertmanager.serviceMonitor.scheme }} + scheme: {{ .Values.alertmanager.serviceMonitor.scheme }} + {{- end }} + {{- if .Values.alertmanager.serviceMonitor.bearerTokenFile }} + bearerTokenFile: {{ .Values.alertmanager.serviceMonitor.bearerTokenFile }} + {{- end }} + {{- if .Values.alertmanager.serviceMonitor.tlsConfig }} + tlsConfig: {{- toYaml .Values.alertmanager.serviceMonitor.tlsConfig | nindent 6 }} + {{- end }} + path: "{{ trimSuffix "/" .Values.alertmanager.alertmanagerSpec.routePrefix }}/metrics" + metricRelabelings: + {{- if .Values.alertmanager.serviceMonitor.metricRelabelings }} + {{- tpl (toYaml .Values.alertmanager.serviceMonitor.metricRelabelings | nindent 6) . }} + {{- end }} + {{ if .Values.global.cattle.clusterId }} + - sourceLabels: [__address__] + targetLabel: cluster_id + replacement: {{ .Values.global.cattle.clusterId }} + {{- end }} + {{ if .Values.global.cattle.clusterName }} + - sourceLabels: [__address__] + targetLabel: cluster_name + replacement: {{ .Values.global.cattle.clusterName }} + {{- end }} + {{- if .Values.alertmanager.serviceMonitor.relabelings }} + relabelings: {{- toYaml .Values.alertmanager.serviceMonitor.relabelings | nindent 6 }} + {{- end }} + {{- range .Values.alertmanager.serviceMonitor.additionalEndpoints }} + - port: {{ .port }} + {{- if or $.Values.alertmanager.serviceMonitor.interval .interval }} + interval: {{ default $.Values.alertmanager.serviceMonitor.interval .interval }} + {{- end }} + {{- if or $.Values.alertmanager.serviceMonitor.proxyUrl .proxyUrl }} + proxyUrl: {{ default $.Values.alertmanager.serviceMonitor.proxyUrl .proxyUrl }} + {{- end }} + {{- if or $.Values.alertmanager.serviceMonitor.scheme .scheme }} + scheme: {{ default $.Values.alertmanager.serviceMonitor.scheme .scheme }} + {{- end }} + {{- if or $.Values.alertmanager.serviceMonitor.bearerTokenFile .bearerTokenFile }} + bearerTokenFile: {{ default $.Values.alertmanager.serviceMonitor.bearerTokenFile .bearerTokenFile }} + {{- end }} + {{- if or $.Values.alertmanager.serviceMonitor.tlsConfig .tlsConfig }} + tlsConfig: {{- default $.Values.alertmanager.serviceMonitor.tlsConfig .tlsConfig | toYaml | nindent 6 }} + {{- end }} + path: {{ .path }} + {{- if or $.Values.alertmanager.serviceMonitor.metricRelabelings .metricRelabelings }} + metricRelabelings: {{- tpl (default $.Values.alertmanager.serviceMonitor.metricRelabelings .metricRelabelings | toYaml | nindent 6) . }} + {{- end }} + {{- if or $.Values.alertmanager.serviceMonitor.relabelings .relabelings }} + relabelings: {{- default $.Values.alertmanager.serviceMonitor.relabelings .relabelings | toYaml | nindent 6 }} + {{- end }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/dashboard-roles.yaml b/charts/rancher-project-monitoring/0.4.2/templates/dashboard-roles.yaml new file mode 100644 index 00000000..73a57098 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/dashboard-roles.yaml @@ -0,0 +1,180 @@ +{{- if and .Values.global.rbac.create .Values.global.rbac.userRoles.create }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-admin + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: {{ include "project-prometheus-stack.labels" . | nindent 4 }} + helm.cattle.io/project-helm-chart-role: {{ .Release.Name }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + helm.cattle.io/project-helm-chart-role-aggregate-from: admin + {{- end }} +rules: +- apiGroups: + - "" + resources: + - services/proxy + resourceNames: + - "http:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" +{{- if .Values.alertmanager.enabled }} + - "http:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" +{{- end }} +{{- if .Values.grafana.enabled }} + - "http:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" + - "https:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" +{{- end }} + verbs: + - 'get' + - 'create' + - 'update' + - 'patch' + - 'delete' +- apiGroups: + - "" + resources: + - endpoints + resourceNames: + - {{ template "project-prometheus-stack.fullname" . }}-prometheus +{{- if .Values.alertmanager.enabled }} + - {{ template "project-prometheus-stack.fullname" . }}-alertmanager +{{- end }} +{{- if .Values.grafana.enabled }} + - {{ include "call-nested" (list . "grafana" "grafana.fullname") }} +{{- end }} + verbs: + - list +{{- if .Values.alertmanager.enabled }} +- apiGroups: + - "" + resources: + - "secrets" + resourceNames: + - {{ printf "%s-alertmanager-secret" (include "project-prometheus-stack.fullname" .) }} + verbs: + - get + - list + - watch + - update + - patch + - delete +{{- end }} +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-edit + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: {{ include "project-prometheus-stack.labels" . | nindent 4 }} + helm.cattle.io/project-helm-chart-role: {{ .Release.Name }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + helm.cattle.io/project-helm-chart-role-aggregate-from: edit + {{- end }} +rules: +- apiGroups: + - "" + resources: + - services/proxy + resourceNames: + - "http:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" +{{- if .Values.alertmanager.enabled }} + - "http:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" +{{- end }} +{{- if .Values.grafana.enabled }} + - "http:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" + - "https:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" +{{- end }} + verbs: + - 'get' + - 'create' + - 'update' + - 'patch' + - 'delete' +- apiGroups: + - "" + resources: + - endpoints + resourceNames: + - {{ template "project-prometheus-stack.fullname" . }}-prometheus +{{- if .Values.alertmanager.enabled }} + - {{ template "project-prometheus-stack.fullname" . }}-alertmanager +{{- end }} +{{- if .Values.grafana.enabled }} + - {{ include "call-nested" (list . "grafana" "grafana.fullname") }} +{{- end }} + verbs: + - list +{{- if .Values.alertmanager.enabled }} +- apiGroups: + - "" + resources: + - "secrets" + resourceNames: + - {{ printf "%s-alertmanager-secret" (include "project-prometheus-stack.fullname" .) }} + verbs: + - get + - list + - watch + - update + - patch +{{- end }} +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: Role +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-view + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: {{ include "project-prometheus-stack.labels" . | nindent 4 }} + helm.cattle.io/project-helm-chart-role: {{ .Release.Name }} + {{- if .Values.global.rbac.userRoles.aggregateToDefaultRoles }} + helm.cattle.io/project-helm-chart-role-aggregate-from: view + {{- end }} +rules: +- apiGroups: + - "" + resources: + - services/proxy + resourceNames: + - "http:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}" +{{- if .Values.alertmanager.enabled }} + - "http:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" + - "https:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}" +{{- end }} +{{- if .Values.grafana.enabled }} + - "http:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" + - "https:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}" +{{- end }} + verbs: + - 'get' + - 'create' +- apiGroups: + - "" + resources: + - endpoints + resourceNames: + - {{ template "project-prometheus-stack.fullname" . }}-prometheus +{{- if .Values.alertmanager.enabled }} + - {{ template "project-prometheus-stack.fullname" . }}-alertmanager +{{- end }} +{{- if .Values.grafana.enabled }} + - {{ include "call-nested" (list . "grafana" "grafana.fullname") }} +{{- end }} + verbs: + - list +{{- if .Values.alertmanager.enabled }} +- apiGroups: + - "" + resources: + - "secrets" + resourceNames: + - {{ printf "%s-alertmanager-secret" (include "project-prometheus-stack.fullname" .) }} + verbs: + - get + - list + - watch +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/dashboard-values-configmap.yaml b/charts/rancher-project-monitoring/0.4.2/templates/dashboard-values-configmap.yaml new file mode 100644 index 00000000..a51f50d2 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/dashboard-values-configmap.yaml @@ -0,0 +1,61 @@ +{{- $rancherDashboards := dict }} +{{- range $glob := tuple "files/rancher/workloads/*" "files/rancher/pods/*" -}} +{{- range $dashboard, $_ := ($.Files.Glob $glob) }} +{{- $dashboardMap := ($.Files.Get $dashboard | fromJson) }} +{{- $_ := set $rancherDashboards (get $dashboardMap "uid") (get $dashboardMap "title") -}} +{{- end }} +{{- end }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-dashboard-values + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: {{ include "project-prometheus-stack.labels" . | nindent 4 }} + helm.cattle.io/dashboard-values-configmap: {{ .Release.Name }} +data: + values.json: |- + { +{{- if not .Values.prometheus.enabled }} + "prometheusURL": "", +{{- else if .Values.prometheus.prometheusSpec.externalUrl }} + "prometheusURL": "{{ tpl .Values.prometheus.prometheusSpec.externalUrl . }}", +{{- else if and .Values.prometheus.ingress.enabled .Values.prometheus.ingress.hosts }} + "prometheusURL": "http://{{ tpl (index .Values.prometheus.ingress.hosts 0) . }}{{ .Values.prometheus.prometheusSpec.routePrefix }}", +{{- else if not (or (empty .Values.global.cattle.url) (empty .Values.global.cattle.clusterId)) }} + "prometheusURL": "{{ .Values.global.cattle.url }}/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "project-prometheus-stack.namespace" . }}/services/http:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}/proxy", +{{- else }} + "prometheusURL": "http://{{ template "project-prometheus-stack.fullname" . }}-prometheus.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.prometheus.service.port }}", +{{- end }} + +{{- if not .Values.grafana.enabled }} + "grafanaURL": "", +{{- else if and .Values.grafana.ingress.enabled .Values.grafana.ingress.hosts }} + "grafanaURL": "http://{{ tpl (index .Values.grafana.ingress.hosts 0) . }}{{ .Values.grafana.grafanaSpec.ingress.path }}", +{{- else if not (or (empty .Values.global.cattle.url) (empty .Values.global.cattle.clusterId)) }} + {{- if eq .Values.global.cattle.clusterId "local" -}} + "grafanaURL": "{{ .Values.global.cattle.url }}/api/v1/namespaces/{{ template "project-prometheus-stack.namespace" . }}/services/http:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}/proxy", + {{- else }} + "grafanaURL": "{{ .Values.global.cattle.url }}/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "project-prometheus-stack.namespace" . }}/services/http:{{ include "call-nested" (list . "grafana" "grafana.fullname") }}:{{ .Values.grafana.service.port }}/proxy", + {{- end }} +{{- else }} + "grafanaURL": "http://{{ include "call-nested" (list . "grafana" "grafana.fullname") }}.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.grafana.service.port }}", +{{- end }} + +{{- if not .Values.alertmanager.enabled }} + "alertmanagerURL": "", +{{- else if .Values.alertmanager.alertmanagerSpec.externalUrl }} + "alertmanagerURL": "{{ tpl .Values.alertmanager.alertmanagerSpec.externalUrl . }}", +{{- else if and .Values.alertmanager.ingress.enabled .Values.alertmanager.ingress.hosts }} + "alertmanagerURL": "http://{{ tpl (index .Values.alertmanager.ingress.hosts 0) . }}{{ .Values.alertmanager.alertmanagerSpec.routePrefix }}", +{{- else if not (or (empty .Values.global.cattle.url) (empty .Values.global.cattle.clusterId)) }} + "alertmanagerURL": "{{ .Values.global.cattle.url }}/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "project-prometheus-stack.namespace" . }}/services/http:{{ template "project-prometheus-stack.fullname" . }}-alertmanager:{{ .Values.alertmanager.service.port }}/proxy", +{{- else }} + "alertmanagerURL": "http://{{ template "project-prometheus-stack.fullname" . }}-alertmanager.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.alertmanager.service.port }}", +{{- end }} + +{{- if and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled }} + "rancherDashboards": {{- $rancherDashboards | toPrettyJson | nindent 8 }} +{{- else }} + "rancherDashboards": [] +{{- end }} + } \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/pods-dashboards.yaml b/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/pods-dashboards.yaml new file mode 100644 index 00000000..6214dc5c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/pods-dashboards.yaml @@ -0,0 +1,17 @@ +{{- if and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack.namespace" . }} + name: rancher-default-dashboards-pods + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: +{{ (.Files.Glob "files/rancher/pods/*").AsConfig | indent 2 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/workload-dashboards.yaml b/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/workload-dashboards.yaml new file mode 100644 index 00000000..7934818f --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/dashboards/rancher/workload-dashboards.yaml @@ -0,0 +1,17 @@ +{{- if and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack.namespace" . }} + name: rancher-default-dashboards-workloads + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: +{{ (.Files.Glob "files/rancher/workloads/*").AsConfig | indent 2 }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/extra-objects.yaml b/charts/rancher-project-monitoring/0.4.2/templates/extra-objects.yaml new file mode 100644 index 00000000..567f7bf3 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/extra-objects.yaml @@ -0,0 +1,4 @@ +{{ range .Values.extraManifests }} +--- +{{ tpl (toYaml .) $ }} +{{ end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmap-dashboards.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmap-dashboards.yaml new file mode 100644 index 00000000..9e4379fd --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmap-dashboards.yaml @@ -0,0 +1,24 @@ +{{- if or (and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled) .Values.grafana.forceDeployDashboards }} +{{- $files := .Files.Glob "dashboards-1.14/*.json" }} +{{- if $files }} +apiVersion: v1 +kind: ConfigMapList +items: +{{- range $path, $fileContents := $files }} +{{- $dashboardName := regexReplaceAll "(^.*/)(.*)\\.json$" $path "${2}" }} +- apiVersion: v1 + kind: ConfigMap + metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) $dashboardName | trunc 63 | trimSuffix "-" }} + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 6 }} + data: + {{ $dashboardName }}.json: {{ $.Files.Get $path | toJson }} +{{- end }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmaps-datasources.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmaps-datasources.yaml new file mode 100644 index 00000000..d4a33306 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/configmaps-datasources.yaml @@ -0,0 +1,81 @@ +{{- if or (and .Values.grafana.enabled .Values.grafana.sidecar.datasources.enabled) .Values.grafana.forceDeployDatasources }} +apiVersion: v1 +kind: ConfigMap +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-grafana-datasource + namespace: {{ default .Values.grafana.sidecar.datasources.searchNamespace (include "project-prometheus-stack.namespace" .) }} +{{- if .Values.grafana.sidecar.datasources.annotations }} + annotations: + {{- toYaml .Values.grafana.sidecar.datasources.annotations | nindent 4 }} +{{- end }} + labels: + {{ $.Values.grafana.sidecar.datasources.label }}: {{ $.Values.grafana.sidecar.datasources.labelValue | quote }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + datasource.yaml: |- + apiVersion: 1 +{{- if .Values.grafana.deleteDatasources }} + deleteDatasources: +{{ tpl (toYaml .Values.grafana.deleteDatasources | indent 6) . }} +{{- end }} + datasources: +{{- $scrapeInterval := .Values.grafana.sidecar.datasources.defaultDatasourceScrapeInterval | default .Values.prometheus.prometheusSpec.scrapeInterval | default "30s" }} +{{- if .Values.grafana.sidecar.datasources.defaultDatasourceEnabled }} + - name: Prometheus + type: prometheus + uid: {{ .Values.grafana.sidecar.datasources.uid }} + {{- if .Values.grafana.sidecar.datasources.url }} + url: {{ .Values.grafana.sidecar.datasources.url }} + {{- else }} + url: http://{{ template "project-prometheus-stack.fullname" . }}-prometheus.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.prometheus.service.port }}/{{ trimPrefix "/" .Values.prometheus.prometheusSpec.routePrefix }} + {{- end }} + access: proxy + isDefault: {{ .Values.grafana.sidecar.datasources.isDefaultDatasource }} + jsonData: + httpMethod: {{ .Values.grafana.sidecar.datasources.httpMethod }} + timeInterval: {{ $scrapeInterval }} + {{- if .Values.grafana.sidecar.datasources.timeout }} + timeout: {{ .Values.grafana.sidecar.datasources.timeout }} + {{- end }} +{{- if .Values.grafana.sidecar.datasources.exemplarTraceIdDestinations }} + exemplarTraceIdDestinations: + - datasourceUid: {{ .Values.grafana.sidecar.datasources.exemplarTraceIdDestinations.datasourceUid }} + name: {{ .Values.grafana.sidecar.datasources.exemplarTraceIdDestinations.traceIdLabelName }} +{{- end }} +{{- if .Values.grafana.sidecar.datasources.createPrometheusReplicasDatasources }} +{{- range until (int .Values.prometheus.prometheusSpec.replicas) }} + - name: Prometheus-{{ . }} + type: prometheus + uid: {{ $.Values.grafana.sidecar.datasources.uid }}-replica-{{ . }} + url: http://prometheus-{{ template "project-prometheus-stack.prometheus.crname" $ }}-{{ . }}.prometheus-operated:9090/{{ trimPrefix "/" $.Values.prometheus.prometheusSpec.routePrefix }} + access: proxy + isDefault: false + jsonData: + timeInterval: {{ $scrapeInterval }} +{{- if $.Values.grafana.sidecar.datasources.exemplarTraceIdDestinations }} + exemplarTraceIdDestinations: + - datasourceUid: {{ $.Values.grafana.sidecar.datasources.exemplarTraceIdDestinations.datasourceUid }} + name: {{ $.Values.grafana.sidecar.datasources.exemplarTraceIdDestinations.traceIdLabelName }} +{{- end }} +{{- end }} +{{- end }} +{{- if .Values.grafana.sidecar.datasources.alertmanager.enabled }} + - name: Alertmanager + type: alertmanager + uid: {{ .Values.grafana.sidecar.datasources.alertmanager.uid }} + {{- if .Values.grafana.sidecar.datasources.alertmanager.url }} + url: {{ .Values.grafana.sidecar.datasources.alertmanager.url }} + {{- else }} + url: http://{{ template "project-prometheus-stack.fullname" . }}-alertmanager.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.alertmanager.service.port }}/{{ trimPrefix "/" .Values.alertmanager.alertmanagerSpec.routePrefix }} + {{- end }} + access: proxy + jsonData: + handleGrafanaManagedAlerts: {{ .Values.grafana.sidecar.datasources.alertmanager.handleGrafanaManagedAlerts }} + implementation: {{ .Values.grafana.sidecar.datasources.alertmanager.implementation }} +{{- end }} +{{- end }} +{{- if .Values.grafana.additionalDataSources }} +{{ tpl (toYaml .Values.grafana.additionalDataSources | indent 4) . }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/alertmanager-overview.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/alertmanager-overview.yaml new file mode 100644 index 00000000..987b19f2 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/alertmanager-overview.yaml @@ -0,0 +1,616 @@ +{{- /* +Generated from 'alertmanager-overview' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +{{- if and .Values.alertmanager.enabled .Values.alertmanager.serviceMonitor.selfMonitor }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "alertmanager-overview" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + alertmanager-overview.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + + ] + }, + "editable": false, + "gnetId": null, + "graphTooltip": 1, + "hideControls": false, + "id": null, + "links": [ + + ], + "refresh": "30s", + "rows": [ + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "current set of alerts stored in the Alertmanager", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 2, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(alertmanager_alerts{namespace=~\"$namespace\",service=~\"$service\"}) by (namespace,service,instance)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Alerts", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "rate of successful and invalid alerts received by the Alertmanager", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 3, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(alertmanager_alerts_received_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval])) by (namespace,service,instance)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Received", + "refId": "A" + }, + { + "expr": "sum(rate(alertmanager_alerts_invalid_total{namespace=~\"$namespace\",service=~\"$service\"}[$__rate_interval])) by (namespace,service,instance)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Invalid", + "refId": "B" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Alerts receive rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Alerts", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "rate of successful and invalid notifications sent by the Alertmanager", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 4, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": "integration", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(alertmanager_notifications_total{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Total", + "refId": "A" + }, + { + "expr": "sum(rate(alertmanager_notifications_failed_total{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (integration,namespace,service,instance)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Failed", + "refId": "B" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "$integration: Notifications Send Rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "ops", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "latency of notifications sent by the Alertmanager", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 5, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": false, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": "integration", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(0.99,\n sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n) \n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} 99th Percentile", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.50,\n sum(rate(alertmanager_notification_latency_seconds_bucket{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (le,namespace,service,instance)\n) \n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Median", + "refId": "B" + }, + { + "expr": "sum(rate(alertmanager_notification_latency_seconds_sum{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n/\nsum(rate(alertmanager_notification_latency_seconds_count{namespace=~\"$namespace\",service=~\"$service\", integration=\"$integration\"}[$__rate_interval])) by (namespace,service,instance)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}instance{{`}}`}} Average", + "refId": "C" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "$integration: Notification Duration", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Notifications", + "titleSize": "h6", + "type": "row" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "alertmanager-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": "namespace", + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(alertmanager_alerts, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": "service", + "multi": false, + "name": "service", + "options": [ + + ], + "query": "label_values(alertmanager_alerts, service)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "all", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": true, + "label": null, + "multi": false, + "name": "integration", + "options": [ + + ], + "query": "label_values(alertmanager_notifications_total{integration=~\".*\"}, integration)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Alertmanager / Overview", + "uid": "alertmanager-overview", + "version": 0 + } +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/cluster-total.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/cluster-total.yaml new file mode 100644 index 00000000..1c7541c3 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/cluster-total.yaml @@ -0,0 +1,1646 @@ +{{- /* +Generated from 'cluster-total' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "cluster-total" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + cluster-total.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "panels": [ + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 2, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "id": 3, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 1 + }, + "id": 4, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "columns": [ + { + "text": "Time", + "value": "Time" + }, + { + "text": "Value #A", + "value": "Value #A" + }, + { + "text": "Value #B", + "value": "Value #B" + }, + { + "text": "Value #C", + "value": "Value #C" + }, + { + "text": "Value #D", + "value": "Value #D" + }, + { + "text": "Value #E", + "value": "Value #E" + }, + { + "text": "Value #F", + "value": "Value #F" + }, + { + "text": "Value #G", + "value": "Value #G" + }, + { + "text": "Value #H", + "value": "Value #H" + }, + { + "text": "namespace", + "value": "namespace" + } + ], + "datasource": "$datasource", + "fill": 1, + "fontSize": "90%", + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 5, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null as zero", + "renderer": "flot", + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": false + }, + "spaceLength": 10, + "span": 24, + "styles": [ + { + "alias": "Time", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Time", + "thresholds": [ + + ], + "type": "hidden", + "unit": "short" + }, + { + "alias": "Current Bandwidth Received", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Bandwidth Transmitted", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Average Bandwidth Received", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Average Bandwidth Transmitted", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #H", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Namespace", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTooltip": "Drill down", + "linkUrl": "d/8b7a8b326d7a6f1f04244066368c67af/kubernetes-networking-namespace-pods?orgId=1&refresh=30s&var-namespace=$__cell", + "pattern": "namespace", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sort_desc(avg(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "H", + "step": 10 + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Status", + "type": "table" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 6, + "panels": [ + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 11 + }, + "id": 7, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 11 + }, + "id": 8, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 11 + }, + "id": 9, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth History", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 12 + }, + "id": 10, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 11, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 30 + }, + "id": 12, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 31 + }, + "id": 13, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 40 + }, + "id": 14, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Packets", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 31 + }, + "id": 15, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 50 + }, + "id": 16, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 59 + }, + "id": 17, + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{namespace=~\".+\"}[$interval:$resolution])) by (namespace))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Errors", + "titleSize": "h6", + "type": "row" + } + ], + "refresh": "10s", + "rows": [ + + ], + "schemaVersion": 18, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "resolution", + "options": [ + { + "selected": false, + "text": "30s", + "value": "30s" + }, + { + "selected": true, + "text": "5m", + "value": "5m" + }, + { + "selected": false, + "text": "1h", + "value": "1h" + } + ], + "query": "30s,5m,1h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": false, + "label": null, + "multi": false, + "name": "interval", + "options": [ + { + "selected": true, + "text": "4h", + "value": "4h" + } + ], + "query": "4h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Networking / Project", + "uid": "ff635a025bcfea7bc3dd4f508990a3e9", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/grafana-overview.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/grafana-overview.yaml new file mode 100644 index 00000000..f87eb66a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/grafana-overview.yaml @@ -0,0 +1,635 @@ +{{- /* +Generated from 'grafana-overview' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack-grafana.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "grafana-overview" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + grafana-overview.json: |- + { + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "target": { + "limit": 100, + "matchAny": false, + "tags": [ + + ], + "type": "dashboard" + }, + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 3085, + "iteration": 1631554945276, + "links": [ + + ], + "panels": [ + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "mappings": [ + + ], + "noValue": "0", + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + + ] + }, + "gridPos": { + "h": 5, + "w": 6, + "x": 0, + "y": 0 + }, + "id": 6, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": [ + "mean" + ], + "fields": "", + "values": false + }, + "text": { + + }, + "textMode": "auto" + }, + "pluginVersion": "8.1.3", + "targets": [ + { + "expr": "grafana_alerting_result_total{job=~\"$job\", instance=~\"$instance\", state=\"alerting\"}", + "instant": true, + "interval": "", + "legendFormat": "", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Firing Alerts", + "type": "stat" + }, + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "mappings": [ + + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + + ] + }, + "gridPos": { + "h": 5, + "w": 6, + "x": 6, + "y": 0 + }, + "id": 8, + "options": { + "colorMode": "value", + "graphMode": "area", + "justifyMode": "auto", + "orientation": "auto", + "reduceOptions": { + "calcs": [ + "mean" + ], + "fields": "", + "values": false + }, + "text": { + + }, + "textMode": "auto" + }, + "pluginVersion": "8.1.3", + "targets": [ + { + "expr": "sum(grafana_stat_totals_dashboard{job=~\"$job\", instance=~\"$instance\"})", + "interval": "", + "legendFormat": "", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Dashboards", + "type": "stat" + }, + { + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": { + "align": null, + "displayMode": "auto" + }, + "mappings": [ + + ], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [ + + ] + }, + "gridPos": { + "h": 5, + "w": 12, + "x": 12, + "y": 0 + }, + "id": 10, + "options": { + "showHeader": true + }, + "pluginVersion": "8.1.3", + "targets": [ + { + "expr": "grafana_build_info{job=~\"$job\", instance=~\"$instance\"}", + "instant": true, + "interval": "", + "legendFormat": "", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Build Info", + "transformations": [ + { + "id": "labelsToFields", + "options": { + + } + }, + { + "id": "organize", + "options": { + "excludeByName": { + "Time": true, + "Value": true, + "branch": true, + "container": true, + "goversion": true, + "namespace": true, + "pod": true, + "revision": true + }, + "indexByName": { + "Time": 7, + "Value": 11, + "branch": 4, + "container": 8, + "edition": 2, + "goversion": 6, + "instance": 1, + "job": 0, + "namespace": 9, + "pod": 10, + "revision": 5, + "version": 3 + }, + "renameByName": { + + } + } + } + ], + "type": "table" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "links": [ + + ] + }, + "overrides": [ + + ] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 5 + }, + "hiddenSeries": false, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "8.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by (status_code) (irate(grafana_http_request_duration_seconds_count{job=~\"$job\", instance=~\"$instance\"}[1m])) ", + "interval": "", + "legendFormat": "{{`{{`}}status_code{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeRegions": [ + + ], + "timeShift": null, + "title": "RPS", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "$$hashKey": "object:157", + "format": "reqps", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "$$hashKey": "object:158", + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "links": [ + + ] + }, + "overrides": [ + + ] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 5 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "8.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "exemplar": true, + "expr": "histogram_quantile(0.99, sum(irate(grafana_http_request_duration_seconds_bucket{instance=~\"$instance\", job=~\"$job\"}[$__rate_interval])) by (le)) * 1", + "interval": "", + "legendFormat": "99th Percentile", + "refId": "A" + }, + { + "exemplar": true, + "expr": "histogram_quantile(0.50, sum(irate(grafana_http_request_duration_seconds_bucket{instance=~\"$instance\", job=~\"$job\"}[$__rate_interval])) by (le)) * 1", + "interval": "", + "legendFormat": "50th Percentile", + "refId": "B" + }, + { + "exemplar": true, + "expr": "sum(irate(grafana_http_request_duration_seconds_sum{instance=~\"$instance\", job=~\"$job\"}[$__rate_interval])) * 1 / sum(irate(grafana_http_request_duration_seconds_count{instance=~\"$instance\", job=~\"$job\"}[$__rate_interval]))", + "interval": "", + "legendFormat": "Average", + "refId": "C" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeRegions": [ + + ], + "timeShift": null, + "title": "Request Latency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "$$hashKey": "object:210", + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "$$hashKey": "object:211", + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "schemaVersion": 30, + "style": "dark", + "tags": [ + + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "description": null, + "error": null, + "hide": 0, + "includeAll": false, + "label": "Data Source", + "multi": false, + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "queryValue": "", + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "type": "datasource" + }, + { + "allValue": ".*", + "current": { + "selected": false, + "text": [ + "default/grafana" + ], + "value": [ + "default/grafana" + ] + }, + "datasource": "$datasource", + "definition": "label_values(grafana_build_info, job)", + "description": null, + "error": null, + "hide": 0, + "includeAll": true, + "label": null, + "multi": true, + "name": "job", + "options": [ + + ], + "query": { + "query": "label_values(grafana_build_info, job)", + "refId": "Billing Admin-job-Variable-Query" + }, + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".*", + "current": { + "selected": false, + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "definition": "label_values(grafana_build_info, instance)", + "description": null, + "error": null, + "hide": 0, + "includeAll": true, + "label": null, + "multi": true, + "name": "instance", + "options": [ + + ], + "query": { + "query": "label_values(grafana_build_info, instance)", + "refId": "Billing Admin-instance-Variable-Query" + }, + "refresh": 1, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-6h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Grafana Overview", + "uid": "6be0s85Mk", + "version": 2 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-namespace.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-namespace.yaml new file mode 100644 index 00000000..1df84c21 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-namespace.yaml @@ -0,0 +1,2770 @@ +{{- /* +Generated from 'k8s-resources-namespace' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-namespace" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-namespace.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "100px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "format": "percentunit", + "id": 1, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) / sum(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "thresholds": "70,80", + "timeFrom": null, + "timeShift": null, + "title": "CPU Utilisation (from requests)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "singlestat", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "format": "percentunit", + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) / sum(kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "thresholds": "70,80", + "timeFrom": null, + "timeShift": null, + "title": "CPU Utilisation (from limits)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "singlestat", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "format": "percentunit", + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "thresholds": "70,80", + "timeFrom": null, + "timeShift": null, + "title": "Memory Utilisation (from requests)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "singlestat", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "format": "percentunit", + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\", image!=\"\"}) / sum(kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"})", + "format": "time_series", + "instant": true, + "intervalFactor": 2, + "refId": "A" + } + ], + "thresholds": "70,80", + "timeFrom": null, + "timeShift": null, + "title": "Memory Utilisation (from limits)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "singlestat", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": false, + "title": "Headlines", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 5, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "quota - requests", + "color": "#F2495C", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "quota - limits", + "color": "#FF9830", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"requests.cpu\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"limits.cpu\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 6, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "quota - requests", + "color": "#F2495C", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "quota - limits", + "color": "#FF9830", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"requests.memory\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"limits.memory\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage (w/o cache)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 8, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Memory Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Usage (RSS)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Cache)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Swap)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #H", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"\", image!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{namespace=\"$namespace\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(container_memory_rss{namespace=\"$namespace\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sum(container_memory_cache{namespace=\"$namespace\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + }, + { + "expr": "sum(container_memory_swap{namespace=\"$namespace\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "H", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 9, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Current Receive Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Transmit Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Network Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Network Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 10, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 11, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 12, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 13, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 14, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 15, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets Dropped", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": -1, + "fill": 10, + "id": 16, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "ceil(sum by(pod) (rate(container_fs_reads_total{container!=\"\", device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_total{container!=\"\", device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", namespace=\"$namespace\"}[$__rate_interval])))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "IOPS(Reads+Writes)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 17, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by(pod) (rate(container_fs_reads_bytes_total{container!=\"\", device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{container!=\"\", device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", namespace=\"$namespace\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "ThroughPut(Read+Write)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 18, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "sort": { + "col": 4, + "desc": true + }, + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "IOPS(Reads)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Reads + Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Throughput(Read)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Read + Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum by(pod) (rate(container_fs_reads_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_writes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_reads_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_reads_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_writes_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_reads_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\",namespace=\"$namespace\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Storage IO", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO - Distribution", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(kube_pod_info, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Namespace (Pods)", + "uid": "85a562078cdf77779eaa1add43ccec1e", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-node.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-node.yaml new file mode 100644 index 00000000..dc78400b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-node.yaml @@ -0,0 +1,961 @@ +{{- /* +Generated from 'k8s-resources-node' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-node" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-node.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 1, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{node=~\"$node\"}) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{node=~\"$node\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{node=~\"$node\"}) by (pod) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{node=~\"$node\", container!=\"\"}) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage (w/o cache)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Memory Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Usage (RSS)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Cache)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Swap)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #H", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{node=~\"$node\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{node=~\"$node\",container!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_memory_working_set_bytes{node=~\"$node\",container!=\"\"}) by (pod) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{node=~\"$node\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_memory_rss{node=~\"$node\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_memory_cache{node=~\"$node\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_memory_swap{node=~\"$node\",container!=\"\"}) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "H", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Quota", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": true, + "name": "node", + "options": [ + + ], + "query": "label_values(kube_pod_info, node)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Node (Pods)", + "uid": "200ac8fdbfbb74b39aff88118e4d1c2c", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-pod.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-pod.yaml new file mode 100644 index 00000000..4a4a93c0 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-pod.yaml @@ -0,0 +1,2442 @@ +{{- /* +Generated from 'k8s-resources-pod' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-pod" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-pod.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 1, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "requests", + "color": "#F2495C", + "fill": 0, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "limits", + "color": "#FF9830", + "fill": 0, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}container{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\", resource=\"cpu\"}\n)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\", resource=\"cpu\"}\n)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "max": true, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(increase(container_cpu_cfs_throttled_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval])) by (container) /sum(increase(container_cpu_cfs_periods_total{namespace=\"$namespace\", pod=\"$pod\", container!=\"\"}[$__rate_interval])) by (container)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}container{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + { + "colorMode": "critical", + "fill": true, + "line": true, + "op": "gt", + "value": 0.25, + "yaxis": "left" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Throttling", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "percentunit", + "label": null, + "logBase": 1, + "max": 1, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Throttling", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Container", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "container", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(cluster:namespace:pod_cpu:active:kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "requests", + "color": "#F2495C", + "dashes": true, + "fill": 0, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "limits", + "color": "#FF9830", + "dashes": true, + "fill": 0, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}container{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\", resource=\"memory\"}\n)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage (WSS)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 5, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Memory Usage (WSS)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Usage (RSS)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Cache)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Usage (Swap)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #H", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Container", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "container", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\", pod=\"$pod\", image!=\"\"}) by (container) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_requests{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\", pod=\"$pod\", container!=\"\", image!=\"\"}) by (container) / sum(cluster:namespace:pod_memory:active:kube_pod_container_resource_limits{namespace=\"$namespace\", pod=\"$pod\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(container_memory_rss{namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sum(container_memory_cache{namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + }, + { + "expr": "sum(container_memory_swap{namespace=\"$namespace\", pod=\"$pod\", container != \"\", container != \"POD\"}) by (container)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "H", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 6, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 8, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 9, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 10, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 11, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])) by (pod)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets Dropped", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": -1, + "fill": 10, + "id": 12, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "ceil(sum by(pod) (rate(container_fs_reads_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Reads", + "legendLink": null, + "step": 10 + }, + { + "expr": "ceil(sum by(pod) (rate(container_fs_writes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval])))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Writes", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "IOPS", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 13, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by(pod) (rate(container_fs_reads_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Reads", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum by(pod) (rate(container_fs_writes_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=~\"$pod\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Writes", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "ThroughPut", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO - Distribution(Pod - Read & Writes)", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": -1, + "fill": 10, + "id": 14, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "ceil(sum by(container) (rate(container_fs_reads_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval])))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}container{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "IOPS(Reads+Writes)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 15, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by(container) (rate(container_fs_reads_bytes_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}container{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "ThroughPut(Read+Write)", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO - Distribution(Containers)", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 16, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "sort": { + "col": 4, + "desc": true + }, + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "IOPS(Reads)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Reads + Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Throughput(Read)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Read + Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Container", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "container", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum by(container) (rate(container_fs_reads_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum by(container) (rate(container_fs_writes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum by(container) (rate(container_fs_reads_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum by(container) (rate(container_fs_reads_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum by(container) (rate(container_fs_writes_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum by(container) (rate(container_fs_reads_bytes_total{device=~\"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+)\", container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]) + rate(container_fs_writes_bytes_total{container!=\"\", namespace=\"$namespace\", pod=\"$pod\"}[$__rate_interval]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Storage IO", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO - Distribution", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(kube_pod_info, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "pod", + "options": [ + + ], + "query": "label_values(kube_pod_info{namespace=\"$namespace\"}, pod)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Pod", + "uid": "6581e46e4e5c7ba40a07646395ef7b23", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-project.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-project.yaml new file mode 100644 index 00000000..56d9c77f --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-project.yaml @@ -0,0 +1,2480 @@ +{{- /* +Generated from 'k8s-resources-cluster' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-cluster" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-cluster.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{}) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Pods", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Workloads", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to workloads", + "linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Namespace", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", + "pattern": "namespace", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(kube_pod_owner{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "count(avg(namespace_workload_pod:kube_pod_owner:relabel{}) by (workload, namespace)) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(namespace_cpu:kube_pod_container_resource_requests:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{}) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_requests:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(namespace_cpu:kube_pod_container_resource_limits:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{}) by (namespace) / sum(namespace_cpu:kube_pod_container_resource_limits:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 9, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(container_memory_rss{container!=\"\"}) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage (w/o cache)", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 10, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Pods", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Workloads", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to workloads", + "linkUrl": "/d/a87fb0d919ec0ea5f6543124e16c42a5/k8s-resources-workloads-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell_1", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Memory Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Namespace", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", + "pattern": "namespace", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(kube_pod_owner{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "count(avg(namespace_workload_pod:kube_pod_owner:relabel{}) by (workload, namespace)) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(container_memory_rss{container!=\"\"}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(namespace_memory:kube_pod_container_resource_requests:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(container_memory_rss{container!=\"\"}) by (namespace) / sum(namespace_memory:kube_pod_container_resource_requests:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(namespace_memory:kube_pod_container_resource_limits:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sum(container_memory_rss{container!=\"\"}) by (namespace) / sum(namespace_memory:kube_pod_container_resource_limits:sum{}) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Requests by Namespace", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Requests", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 11, + "interval": "1m", + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Current Receive Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Transmit Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Namespace", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", + "pattern": "namespace", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Network Usage", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Network Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 12, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 13, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 14, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "avg(irate(container_network_receive_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Namespace: Received", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 15, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "avg(irate(container_network_transmit_bytes_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Namespace: Transmitted", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Container Bandwidth by Namespace", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 16, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 17, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 18, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 19, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\".+\"}[$__rate_interval])) by (namespace)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets Dropped", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": -1, + "fill": 10, + "id": 20, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "ceil(sum by(namespace) (rate(container_fs_reads_total{container!=\"\"}[5m]) + rate(container_fs_writes_total{container!=\"\"}[5m])))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "IOPS(Reads+Writes)", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 21, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{container!=\"\"}[5m]) + rate(container_fs_writes_bytes_total{container!=\"\"}[5m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}namespace{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "ThroughPut(Read+Write)", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 22, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "sort": { + "col": 4, + "desc": true + }, + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "IOPS(Reads)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "IOPS(Reads + Writes)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": -1, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Throughput(Read)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Throughput(Read + Write)", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Namespace", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "/d/85a562078cdf77779eaa1add43ccec1e/k8s-resources-namespace?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$__cell", + "pattern": "namespace", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum by(namespace) (rate(container_fs_reads_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum by(namespace) (rate(container_fs_writes_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum by(namespace) (rate(container_fs_reads_total{container!=\"\"}[5m]) + rate(container_fs_writes_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum by(namespace) (rate(container_fs_writes_bytes_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum by(namespace) (rate(container_fs_reads_bytes_total{container!=\"\"}[5m]) + rate(container_fs_writes_bytes_total{container!=\"\"}[5m]))", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Storage IO", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage IO - Distribution", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "default", + "value": "default" + }, + "hide": 0, + "label": null, + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Project", + "uid": "efa86fd1d0c121a26444b636a3f509a8", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-namespace.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-namespace.yaml new file mode 100644 index 00000000..1f30675a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-namespace.yaml @@ -0,0 +1,24 @@ +{{- /* +Generated from 'k8s-resources-windows-namespace' from https://github.com/kubernetes-monitoring/kubernetes-mixin.git +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled .Values.windowsMonitoring.enabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack-grafana.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-windows-namespace" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-windows-namespace.json: |- + {{`{"__inputs":[],"__requires":[],"annotations":{"list":[]},"editable":`}}{{ .Values.grafana.defaultDashboardsEditable }}{{`,"gnetId":null,"graphTooltip":0,"hideControls":false,"id":null,"links":[],"refresh":"","rows":[{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":2,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":true,"steppedLine":false,"targets":[{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\"}) by (pod)","format":"time_series","legendFormat":"{{pod}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"CPU Usage","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"CPU Usage","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":3,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"CPU Usage","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Requests","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Requests %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #C","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"CPU Limits","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #D","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Limits %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #E","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Pod","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":true,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"/d/40597a704a610e936dc6ed374a7ce023/k8s-resources-windows-pod?var-datasource=$datasource&var-namespace=$namespace&var-pod=`}}{{ if .Values.grafana.sidecar.dashboards.enableNewTablePanelSyntax }}${__value.text}{{ else }}$__cell{{ end }}{{`","pattern":"pod","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"sum(kube_pod_windows_container_resource_cpu_cores_request{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"B"},{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\"}) by (pod) / sum(kube_pod_windows_container_resource_cpu_cores_request{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"C"},{"expr":"sum(kube_pod_windows_container_resource_cpu_cores_limit{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"D"},{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\"}) by (pod) / sum(kube_pod_windows_container_resource_cpu_cores_limit{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"E"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"CPU Quota","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"CPU Quota","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":4,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":true,"steppedLine":false,"targets":[{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\"}) by (pod)","format":"time_series","legendFormat":"{{pod}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Memory Usage","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"decbytes","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Memory Usage","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":5,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"Memory Usage","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Requests","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Requests %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #C","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Memory Limits","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #D","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Limits %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #E","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Pod","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":true,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"/d/40597a704a610e936dc6ed374a7ce023/k8s-resources-windows-pod?var-datasource=$datasource&var-namespace=$namespace&var-pod=`}}{{ if .Values.grafana.sidecar.dashboards.enableNewTablePanelSyntax }}${__value.text}{{ else }}$__cell{{ end }}{{`","pattern":"pod","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"sum(kube_pod_windows_container_resource_memory_request{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"B"},{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\"}) by (pod) / sum(kube_pod_windows_container_resource_memory_request{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"C"},{"expr":"sum(kube_pod_windows_container_resource_memory_limit{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"D"},{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\"}) by (pod) / sum(kube_pod_windows_container_resource_memory_limit{namespace=\"$namespace\"}) by (pod)","format":"table","instant":true,"legendFormat":"","refId":"E"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Memory Quota","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Memory Quota","titleSize":"h6"}],"schemaVersion":14,"style":"dark","tags":["kubernetes-mixin"],"templating":{"list":[{"current":{"selected":true,"text":"default","value":"default"},"hide":0,"label":null,"name":"datasource","options":[],"query":"prometheus","refresh":1,"regex":"","type":"datasource"},{"allValue":null,"current":{},"datasource":"$datasource","hide":`}}{{ if .Values.grafana.sidecar.dashboards.multicluster.global.enabled }}0{{ else }}2{{ end }}{{`,"includeAll":false,"label":"cluster","multi":false,"name":"cluster","options":[],"query":"label_values(up{job=\"windows-exporter\"}, cluster)","refresh":2,"regex":"","sort":1,"tagValuesQuery":"","tags":[],"tagsQuery":"","type":"query","useTags":false},{"allValue":null,"current":{},"datasource":"$datasource","hide":0,"includeAll":false,"label":"Namespace","multi":false,"name":"namespace","options":[],"query":"label_values(windows_pod_container_available{cluster=\"$cluster\"}, namespace)","refresh":2,"regex":"","sort":1,"tagValuesQuery":"","tags":[],"tagsQuery":"","type":"query","useTags":false}]},"time":{"from":"now-6h","to":"now"},"timepicker":{"refresh_intervals":["5s","10s","30s","1m","5m","15m","30m","1h","2h","1d"],"time_options":["5m","15m","1h","6h","12h","24h","2d","7d","30d"]},"timezone": "`}}{{ .Values.grafana.defaultDashboardsTimezone }}{{`","title":"Kubernetes / Compute Resources / Namespace(Windows)","uid":"490b402361724ab1d4c45666c1fa9b6f","version":0}`}} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-pod.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-pod.yaml new file mode 100644 index 00000000..f8931227 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-windows-pod.yaml @@ -0,0 +1,24 @@ +{{- /* +Generated from 'k8s-resources-windows-pod' from https://github.com/kubernetes-monitoring/kubernetes-mixin.git +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled .Values.windowsMonitoring.enabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack-grafana.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-windows-pod" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-windows-pod.json: |- + {{`{"__inputs":[],"__requires":[],"annotations":{"list":[]},"editable":`}}{{ .Values.grafana.defaultDashboardsEditable }}{{`,"gnetId":null,"graphTooltip":0,"hideControls":false,"id":null,"links":[],"refresh":"","rows":[{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":2,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":true,"steppedLine":false,"targets":[{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"time_series","legendFormat":"{{container}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"CPU Usage","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"CPU Usage","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":3,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"CPU Usage","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Requests","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Requests %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #C","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"CPU Limits","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #D","thresholds":[],"type":"number","unit":"short"},{"alias":"CPU Limits %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #E","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Container","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"container","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"sum(kube_pod_windows_container_resource_cpu_cores_request{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"B"},{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_windows_container_resource_cpu_cores_request{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"C"},{"expr":"sum(kube_pod_windows_container_resource_cpu_cores_limit{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"D"},{"expr":"sum(namespace_pod_container:windows_container_cpu_usage_seconds_total:sum_rate{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_windows_container_resource_cpu_cores_limit{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"E"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"CPU Quota","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"CPU Quota","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":10,"id":4,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":0,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":true,"steppedLine":false,"targets":[{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"time_series","legendFormat":"{{container}}","legendLink":null}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Memory Usage","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Memory Usage","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"id":5,"legend":{"avg":false,"current":false,"max":false,"min":false,"show":true,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null as zero","percentage":false,"pointradius":5,"points":false,"renderer":"flot","seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"styles":[{"alias":"Time","dateFormat":"YYYY-MM-DD HH:mm:ss","pattern":"Time","type":"hidden"},{"alias":"Memory Usage","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #A","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Requests","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #B","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Requests %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #C","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Memory Limits","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #D","thresholds":[],"type":"number","unit":"decbytes"},{"alias":"Memory Limits %","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"Value #E","thresholds":[],"type":"number","unit":"percentunit"},{"alias":"Container","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"link":false,"linkTargetBlank":false,"linkTooltip":"Drill down","linkUrl":"","pattern":"container","thresholds":[],"type":"number","unit":"short"},{"alias":"","colorMode":null,"colors":[],"dateFormat":"YYYY-MM-DD HH:mm:ss","decimals":2,"pattern":"/.*/","thresholds":[],"type":"string","unit":"short"}],"targets":[{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"A"},{"expr":"sum(kube_pod_windows_container_resource_memory_request{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"B"},{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_windows_container_resource_memory_request{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"C"},{"expr":"sum(kube_pod_windows_container_resource_memory_limit{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"D"},{"expr":"sum(windows_container_private_working_set_usage{namespace=\"$namespace\", pod=\"$pod\"}) by (container) / sum(kube_pod_windows_container_resource_memory_limit{namespace=\"$namespace\", pod=\"$pod\"}) by (container)","format":"table","instant":true,"legendFormat":"","refId":"E"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Memory Quota","tooltip":{"shared":true,"sort":2,"value_type":"individual"},"transform":"table","type":"table","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"short","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"short","label":null,"logBase":1,"max":null,"min":null,"show":false}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Memory Quota","titleSize":"h6"},{"collapse":false,"height":"250px","panels":[{"aliasColors":{},"bars":false,"dashLength":10,"dashes":false,"datasource":"$datasource","fill":1,"fillGradient":0,"id":6,"legend":{"alignAsTable":true,"avg":true,"current":true,"max":false,"min":false,"rightSide":true,"show":true,"sideWidth":null,"total":false,"values":false},"lines":true,"linewidth":1,"links":[],"nullPointMode":"null","percentage":false,"pointradius":5,"points":false,"renderer":"flot","repeat":null,"seriesOverrides":[],"spaceLength":10,"span":12,"stack":false,"steppedLine":false,"targets":[{"expr":"sort_desc(sum by (container) (rate(windows_container_network_received_bytes_total{namespace=\"$namespace\", pod=\"$pod\"}[1m])))","format":"time_series","intervalFactor":2,"legendFormat":"Received : {{ container }}","refId":"A"},{"expr":"sort_desc(sum by (container) (rate(windows_container_network_transmitted_bytes_total{namespace=\"$namespace\", pod=\"$pod\"}[1m])))","format":"time_series","intervalFactor":2,"legendFormat":"Transmitted : {{ container }}","refId":"B"}],"thresholds":[],"timeFrom":null,"timeShift":null,"title":"Network I/O","tooltip":{"shared":true,"sort":0,"value_type":"individual"},"type":"graph","xaxis":{"buckets":null,"mode":"time","name":null,"show":true,"values":[]},"yaxes":[{"format":"bytes","label":null,"logBase":1,"max":null,"min":0,"show":true},{"format":"bytes","label":null,"logBase":1,"max":null,"min":0,"show":true}]}],"repeat":null,"repeatIteration":null,"repeatRowId":null,"showTitle":true,"title":"Network I/O","titleSize":"h6"}],"schemaVersion":14,"style":"dark","tags":["kubernetes-mixin"],"templating":{"list":[{"current":{"text":"default","value":"default"},"hide":0,"label":null,"name":"datasource","options":[],"query":"prometheus","refresh":1,"regex":"","type":"datasource"},{"allValue":null,"current":{},"datasource":"$datasource","hide":`}}{{ if .Values.grafana.sidecar.dashboards.multicluster.global.enabled }}0{{ else }}2{{ end }}{{`,"includeAll":false,"label":"cluster","multi":false,"name":"cluster","options":[],"query":"label_values(up{job=\"windows-exporter\"}, cluster)","refresh":2,"regex":"","sort":1,"tagValuesQuery":"","tags":[],"tagsQuery":"","type":"query","useTags":false},{"allValue":null,"current":{},"datasource":"$datasource","hide":0,"includeAll":false,"label":"Namespace","multi":false,"name":"namespace","options":[],"query":"label_values(windows_pod_container_available{cluster=\"$cluster\"}, namespace)","refresh":2,"regex":"","sort":1,"tagValuesQuery":"","tags":[],"tagsQuery":"","type":"query","useTags":false},{"allValue":null,"current":{},"datasource":"$datasource","hide":0,"includeAll":false,"label":"Pod","multi":false,"name":"pod","options":[],"query":"label_values(windows_pod_container_available{cluster=\"$cluster\",namespace=\"$namespace\"}, pod)","refresh":2,"regex":"","sort":1,"tagValuesQuery":"","tags":[],"tagsQuery":"","type":"query","useTags":false}]},"time":{"from":"now-6h","to":"now"},"timepicker":{"refresh_intervals":["5s","10s","30s","1m","5m","15m","30m","1h","2h","1d"],"time_options":["5m","15m","1h","6h","12h","24h","2d","7d","30d"]},"timezone": "`}}{{ .Values.grafana.defaultDashboardsTimezone }}{{`","title":"Kubernetes / Compute Resources / Pod(Windows)","uid":"40597a704a610e936dc6ed374a7ce023","version":0}`}} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workload.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workload.yaml new file mode 100644 index 00000000..f7193dd6 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workload.yaml @@ -0,0 +1,1997 @@ +{{- /* +Generated from 'k8s-resources-workload' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-workload" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-workload.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 1, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Memory Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n/sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\", workload_type=\"$type\"}\n) by (pod)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 5, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Current Receive Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Transmit Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/6581e46e4e5c7ba40a07646395ef7b23/k8s-resources-pod?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Network Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Network Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 6, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 8, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(avg(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Pod: Received", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 9, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(avg(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Pod: Transmitted", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Container Bandwidth by Pod", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 10, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 11, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 12, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 13, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets Dropped", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(kube_pod_info, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "workload", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\"}, workload)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "type", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=\"$workload\"}, workload_type)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Workload", + "uid": "a164a7f0339f99e89cea5cb47e9be617", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workloads-namespace.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workloads-namespace.yaml new file mode 100644 index 00000000..eb7b8755 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/k8s-resources-workloads-namespace.yaml @@ -0,0 +1,2162 @@ +{{- /* +Generated from 'k8s-resources-workloads-namespace' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "k8s-resources-workloads-namespace" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + k8s-resources-workloads-namespace.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 1, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "quota - requests", + "color": "#F2495C", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "quota - limits", + "color": "#FF9830", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}} - {{`{{`}}workload_type{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"requests.cpu\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"limits.cpu\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Running Pods", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "CPU Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "CPU Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Workload", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2", + "pattern": "workload", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Workload Type", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "workload_type", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "count(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(\n node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace=\"$namespace\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"cpu\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "CPU Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "CPU Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "quota - requests", + "color": "#F2495C", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + }, + { + "alias": "quota - limits", + "color": "#FF9830", + "dashes": true, + "fill": 0, + "hiddenSeries": true, + "hideTooltip": true, + "legend": true, + "linewidth": 2, + "stack": false + } + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}} - {{`{{`}}workload_type{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"requests.memory\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - requests", + "legendLink": null, + "step": 10 + }, + { + "expr": "scalar(kube_resourcequota{namespace=\"$namespace\", type=\"hard\",resource=\"limits.memory\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "quota - limits", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Running Pods", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 0, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Memory Usage", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Requests %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Memory Limits", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "bytes" + }, + { + "alias": "Memory Limits %", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "percentunit" + }, + { + "alias": "Workload", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$__cell_2", + "pattern": "workload", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Workload Type", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "workload_type", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "count(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload, workload_type)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_requests{namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(\n container_memory_working_set_bytes{namespace=\"$namespace\", container!=\"\", image!=\"\"}\n * on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n/sum(\n kube_pod_container_resource_limits{namespace=\"$namespace\", resource=\"memory\"}\n* on(namespace,pod)\n group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}\n) by (workload, workload_type)\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Memory Quota", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Memory Quota", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 5, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Current Receive Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Transmit Bandwidth", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Workload", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTargetBlank": false, + "linkTooltip": "Drill down to pods", + "linkUrl": "d/a164a7f0339f99e89cea5cb47e9be617/k8s-resources-workload?var-datasource=$datasource&var-cluster=$cluster&var-namespace=$namespace&var-workload=$__cell&var-type=$type", + "pattern": "workload", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Workload Type", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "workload_type", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Network Usage", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Network Usage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 6, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 8, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(avg(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Workload: Received", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 9, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(avg(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Container Bandwidth by Workload: Transmitted", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Container Bandwidth by Workload", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 10, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 11, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 12, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 13, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$__rate_interval])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": false, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Rate of Packets Dropped", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "deployment", + "value": "deployment" + }, + "datasource": "$datasource", + "definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\"}, workload_type)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "type", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\"}, workload_type)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(kube_pod_info, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Compute Resources / Namespace (Workloads)", + "uid": "a87fb0d919ec0ea5f6543124e16c42a5", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-pod.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-pod.yaml new file mode 100644 index 00000000..90899ee8 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-pod.yaml @@ -0,0 +1,1438 @@ +{{- /* +Generated from 'namespace-by-pod' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "namespace-by-pod" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + namespace-by-pod.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "panels": [ + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 2, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "#299c46", + "rgba(237, 129, 40, 0.89)", + "#d44a3a" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "time_series", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "height": 9, + "id": 3, + "interval": null, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "minSpan": 12, + "nullPointMode": "connected", + "nullText": null, + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 10000000000, + "min": 0, + "title": "$namespace", + "unit": "Bps" + }, + "mappings": [ + + ], + "override": { + + }, + "thresholds": [ + { + "color": "dark-green", + "index": 0, + "value": null + }, + { + "color": "dark-yellow", + "index": 1, + "value": 5000000000 + }, + { + "color": "dark-red", + "index": 2, + "value": 7000000000 + } + ], + "values": false + } + }, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 12, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution]))", + "format": "time_series", + "instant": null, + "intervalFactor": 1, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "", + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Received", + "type": "gauge", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "#299c46", + "rgba(237, 129, 40, 0.89)", + "#d44a3a" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "time_series", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 1 + }, + "height": 9, + "id": 4, + "interval": null, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "minSpan": 12, + "nullPointMode": "connected", + "nullText": null, + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 10000000000, + "min": 0, + "title": "$namespace", + "unit": "Bps" + }, + "mappings": [ + + ], + "override": { + + }, + "thresholds": [ + { + "color": "dark-green", + "index": 0, + "value": null + }, + { + "color": "dark-yellow", + "index": 1, + "value": 5000000000 + }, + { + "color": "dark-red", + "index": 2, + "value": 7000000000 + } + ], + "values": false + } + }, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 12, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution]))", + "format": "time_series", + "instant": null, + "intervalFactor": 1, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "", + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Transmitted", + "type": "gauge", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + }, + { + "columns": [ + { + "text": "Time", + "value": "Time" + }, + { + "text": "Value #A", + "value": "Value #A" + }, + { + "text": "Value #B", + "value": "Value #B" + }, + { + "text": "Value #C", + "value": "Value #C" + }, + { + "text": "Value #D", + "value": "Value #D" + }, + { + "text": "Value #E", + "value": "Value #E" + }, + { + "text": "Value #F", + "value": "Value #F" + }, + { + "text": "pod", + "value": "pod" + } + ], + "datasource": "$datasource", + "fill": 1, + "fontSize": "100%", + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 5, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null as zero", + "renderer": "flot", + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": false + }, + "spaceLength": 10, + "span": 24, + "styles": [ + { + "alias": "Time", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Time", + "thresholds": [ + + ], + "type": "hidden", + "unit": "short" + }, + { + "alias": "Bandwidth Received", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Bandwidth Transmitted", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Pod", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTooltip": "Drill down", + "linkUrl": "d/7a18067ce943a40ae25454675c19ff5c/kubernetes-networking-pod?orgId=1&refresh=30s&var-namespace=$namespace&var-pod=$__cell", + "pattern": "pod", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Status", + "type": "table" + }, + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 19 + }, + "id": 6, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 20 + }, + "id": 7, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 20 + }, + "id": 8, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 29 + }, + "id": 9, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 0, + "y": 30 + }, + "id": 10, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 12, + "y": 30 + }, + "id": 11, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Packets", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 30 + }, + "id": 12, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 0, + "y": 40 + }, + "id": 13, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 12, + "y": 40 + }, + "id": 14, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Errors", + "titleSize": "h6", + "type": "row" + } + ], + "refresh": "10s", + "rows": [ + + ], + "schemaVersion": 18, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".+", + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "kube-system", + "value": "kube-system" + }, + "datasource": "$datasource", + "definition": "label_values(container_network_receive_packets_total, namespace)", + "hide": 0, + "includeAll": true, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(container_network_receive_packets_total, namespace)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "resolution", + "options": [ + { + "selected": false, + "text": "30s", + "value": "30s" + }, + { + "selected": true, + "text": "5m", + "value": "5m" + }, + { + "selected": false, + "text": "1h", + "value": "1h" + } + ], + "query": "30s,5m,1h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": false, + "label": null, + "multi": false, + "name": "interval", + "options": [ + { + "selected": true, + "text": "4h", + "value": "4h" + } + ], + "query": "4h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Networking / Namespace (Pods)", + "uid": "8b7a8b326d7a6f1f04244066368c67af", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-workload.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-workload.yaml new file mode 100644 index 00000000..092cb390 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/namespace-by-workload.yaml @@ -0,0 +1,1710 @@ +{{- /* +Generated from 'namespace-by-workload' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "namespace-by-workload" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + namespace-by-workload.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "panels": [ + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 2, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "id": 3, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} workload {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 1 + }, + "id": 4, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} workload {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "columns": [ + { + "text": "Time", + "value": "Time" + }, + { + "text": "Value #A", + "value": "Value #A" + }, + { + "text": "Value #B", + "value": "Value #B" + }, + { + "text": "Value #C", + "value": "Value #C" + }, + { + "text": "Value #D", + "value": "Value #D" + }, + { + "text": "Value #E", + "value": "Value #E" + }, + { + "text": "Value #F", + "value": "Value #F" + }, + { + "text": "Value #G", + "value": "Value #G" + }, + { + "text": "Value #H", + "value": "Value #H" + }, + { + "text": "workload", + "value": "workload" + } + ], + "datasource": "$datasource", + "fill": 1, + "fontSize": "90%", + "gridPos": { + "h": 9, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 5, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null as zero", + "renderer": "flot", + "scroll": true, + "showHeader": true, + "sort": { + "col": 0, + "desc": false + }, + "spaceLength": 10, + "span": 24, + "styles": [ + { + "alias": "Time", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Time", + "thresholds": [ + + ], + "type": "hidden", + "unit": "short" + }, + { + "alias": "Current Bandwidth Received", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Current Bandwidth Transmitted", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Average Bandwidth Received", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #C", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Average Bandwidth Transmitted", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #D", + "thresholds": [ + + ], + "type": "number", + "unit": "Bps" + }, + { + "alias": "Rate of Received Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #E", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #F", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Received Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #G", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Rate of Transmitted Packets Dropped", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #H", + "thresholds": [ + + ], + "type": "number", + "unit": "pps" + }, + { + "alias": "Workload", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": true, + "linkTooltip": "Drill down", + "linkUrl": "d/728bf77cc1166d2f3133bf25846876cc/kubernetes-networking-workload?orgId=1&refresh=30s&var-namespace=$namespace&var-type=$type&var-workload=$__cell", + "pattern": "workload", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + } + ], + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + }, + { + "expr": "sort_desc(avg(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "C", + "step": 10 + }, + { + "expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "D", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "E", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "F", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "G", + "step": 10 + }, + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "H", + "step": 10 + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Status", + "type": "table" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 19 + }, + "id": 6, + "panels": [ + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 20 + }, + "id": 7, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} workload {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 20 + }, + "id": 8, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} workload {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 29 + }, + "id": 9, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth HIstory", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 38 + }, + "id": 10, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 38 + }, + "id": 11, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 39 + }, + "id": 12, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 40 + }, + "id": 13, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 40 + }, + "id": 14, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Packets", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 40 + }, + "id": 15, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 41 + }, + "id": 16, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 41 + }, + "id": 17, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{namespace=\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\", workload_type=\"$type\"}) by (workload))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}workload{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Errors", + "titleSize": "h6", + "type": "row" + } + ], + "refresh": "10s", + "rows": [ + + ], + "schemaVersion": 18, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "kube-system", + "value": "kube-system" + }, + "datasource": "$datasource", + "definition": "label_values(container_network_receive_packets_total, namespace)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(container_network_receive_packets_total, namespace)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "deployment", + "value": "deployment" + }, + "datasource": "$datasource", + "definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\"}, workload_type)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "type", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=\"$namespace\", workload=~\".+\"}, workload_type)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "resolution", + "options": [ + { + "selected": false, + "text": "30s", + "value": "30s" + }, + { + "selected": true, + "text": "5m", + "value": "5m" + }, + { + "selected": false, + "text": "1h", + "value": "1h" + } + ], + "query": "30s,5m,1h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": false, + "label": null, + "multi": false, + "name": "interval", + "options": [ + { + "selected": true, + "text": "4h", + "value": "4h" + } + ], + "query": "4h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Networking / Namespace (Workload)", + "uid": "bbb2a765a623ae38130206c7d94a160f", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/persistentvolumesusage.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/persistentvolumesusage.yaml new file mode 100644 index 00000000..287b2194 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/persistentvolumesusage.yaml @@ -0,0 +1,561 @@ +{{- /* +Generated from 'persistentvolumesusage' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "persistentvolumesusage" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + persistentvolumesusage.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + + ] + }, + "editable": false, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "refresh": "10s", + "rows": [ + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 2, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 9, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "(\n sum without(instance, node) (topk(1, (kubelet_volume_stats_capacity_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n -\n sum without(instance, node) (topk(1, (kubelet_volume_stats_available_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n)\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Used Space", + "refId": "A" + }, + { + "expr": "sum without(instance, node) (topk(1, (kubelet_volume_stats_available_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Free Space", + "refId": "B" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Volume Space Usage", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "bytes", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "$datasource", + "format": "percent", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": true, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + + }, + "id": 3, + "interval": "1m", + "legend": { + "alignAsTable": true, + "rightSide": true + }, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 3, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "max without(instance,node) (\n(\n topk(1, kubelet_volume_stats_capacity_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n -\n topk(1, kubelet_volume_stats_available_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n)\n/\ntopk(1, kubelet_volume_stats_capacity_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n* 100)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "80, 90", + "title": "Volume Space Usage", + "tooltip": { + "shared": false + }, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": false, + "title": "Dashboard Row", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 4, + "interval": "1m", + "legend": { + "alignAsTable": true, + "avg": true, + "current": true, + "max": true, + "min": true, + "rightSide": true, + "show": true, + "sideWidth": null, + "total": false, + "values": true + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 9, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes_used{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "Used inodes", + "refId": "A" + }, + { + "expr": "(\n sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n -\n sum without(instance, node) (topk(1, (kubelet_volume_stats_inodes_used{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})))\n)\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": " Free inodes", + "refId": "B" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Volume inodes Usage", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "rgba(50, 172, 45, 0.97)", + "rgba(237, 129, 40, 0.89)", + "rgba(245, 54, 54, 0.9)" + ], + "datasource": "$datasource", + "format": "percent", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": true, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + + }, + "id": 5, + "interval": "1m", + "legend": { + "alignAsTable": true, + "rightSide": true + }, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "nullPointMode": "connected", + "nullText": null, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 3, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "max without(instance,node) (\ntopk(1, kubelet_volume_stats_inodes_used{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n/\ntopk(1, kubelet_volume_stats_inodes{metrics_path=\"/metrics\", namespace=\"$namespace\", persistentvolumeclaim=\"$volume\"})\n* 100)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "80, 90", + "title": "Volume inodes Usage", + "tooltip": { + "shared": false + }, + "type": "singlestat", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": false, + "title": "Dashboard Row", + "titleSize": "h6", + "type": "row" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": "Namespace", + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(kubelet_volume_stats_capacity_bytes{metrics_path=\"/metrics\"}, namespace)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": "PersistentVolumeClaim", + "multi": false, + "name": "volume", + "options": [ + + ], + "query": "label_values(kubelet_volume_stats_capacity_bytes{metrics_path=\"/metrics\", namespace=\"$namespace\"}, persistentvolumeclaim)", + "refresh": 2, + "regex": "", + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-7d", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Persistent Volumes", + "uid": "919b92a8e8041bd567af9edab12c840c", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/pod-total.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/pod-total.yaml new file mode 100644 index 00000000..db0a5ed7 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/pod-total.yaml @@ -0,0 +1,1201 @@ +{{- /* +Generated from 'pod-total' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "pod-total" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + pod-total.json: |- + { + "__inputs": [ + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "panels": [ + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 2, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "#299c46", + "rgba(237, 129, 40, 0.89)", + "#d44a3a" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "time_series", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "height": 9, + "id": 3, + "interval": null, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "minSpan": 12, + "nullPointMode": "connected", + "nullText": null, + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 10000000000, + "min": 0, + "title": "$namespace: $pod", + "unit": "Bps" + }, + "mappings": [ + + ], + "override": { + + }, + "thresholds": [ + { + "color": "dark-green", + "index": 0, + "value": null + }, + { + "color": "dark-yellow", + "index": 1, + "value": 5000000000 + }, + { + "color": "dark-red", + "index": 2, + "value": 7000000000 + } + ], + "values": false + } + }, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 12, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution]))", + "format": "time_series", + "instant": null, + "intervalFactor": 1, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "", + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Received", + "type": "gauge", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + }, + { + "cacheTimeout": null, + "colorBackground": false, + "colorValue": false, + "colors": [ + "#299c46", + "rgba(237, 129, 40, 0.89)", + "#d44a3a" + ], + "datasource": "$datasource", + "decimals": 0, + "format": "time_series", + "gauge": { + "maxValue": 100, + "minValue": 0, + "show": false, + "thresholdLabels": false, + "thresholdMarkers": true + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 1 + }, + "height": 9, + "id": 4, + "interval": null, + "links": [ + + ], + "mappingType": 1, + "mappingTypes": [ + { + "name": "value to text", + "value": 1 + }, + { + "name": "range to text", + "value": 2 + } + ], + "maxDataPoints": 100, + "minSpan": 12, + "nullPointMode": "connected", + "nullText": null, + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 10000000000, + "min": 0, + "title": "$namespace: $pod", + "unit": "Bps" + }, + "mappings": [ + + ], + "override": { + + }, + "thresholds": [ + { + "color": "dark-green", + "index": 0, + "value": null + }, + { + "color": "dark-yellow", + "index": 1, + "value": 5000000000 + }, + { + "color": "dark-red", + "index": 2, + "value": 7000000000 + } + ], + "values": false + } + }, + "postfix": "", + "postfixFontSize": "50%", + "prefix": "", + "prefixFontSize": "50%", + "rangeMaps": [ + { + "from": "null", + "text": "N/A", + "to": "null" + } + ], + "span": 12, + "sparkline": { + "fillColor": "rgba(31, 118, 189, 0.18)", + "full": false, + "lineColor": "rgb(31, 120, 193)", + "show": false + }, + "tableColumn": "", + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution]))", + "format": "time_series", + "instant": null, + "intervalFactor": 1, + "legendFormat": "", + "refId": "A" + } + ], + "thresholds": "", + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Transmitted", + "type": "gauge", + "valueFontSize": "80%", + "valueMaps": [ + { + "op": "=", + "text": "N/A", + "value": "null" + } + ], + "valueName": "current" + }, + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 5, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 11 + }, + "id": 6, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 11 + }, + "id": 7, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 20 + }, + "id": 8, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 0, + "y": 21 + }, + "id": 9, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 12, + "y": 21 + }, + "id": 10, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Packets", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 11, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 0, + "y": 32 + }, + "id": 12, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 10, + "w": 12, + "x": 12, + "y": 32 + }, + "id": 13, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\", pod=~\"$pod\"}[$interval:$resolution])) by (pod)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Errors", + "titleSize": "h6", + "type": "row" + } + ], + "refresh": "10s", + "rows": [ + + ], + "schemaVersion": 18, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".+", + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "kube-system", + "value": "kube-system" + }, + "datasource": "$datasource", + "definition": "label_values(container_network_receive_packets_total, namespace)", + "hide": 0, + "includeAll": true, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(container_network_receive_packets_total, namespace)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".+", + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "definition": "label_values(container_network_receive_packets_total{namespace=~\"$namespace\"}, pod)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "pod", + "options": [ + + ], + "query": "label_values(container_network_receive_packets_total{namespace=~\"$namespace\"}, pod)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "resolution", + "options": [ + { + "selected": false, + "text": "30s", + "value": "30s" + }, + { + "selected": true, + "text": "5m", + "value": "5m" + }, + { + "selected": false, + "text": "1h", + "value": "1h" + } + ], + "query": "30s,5m,1h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": false, + "label": null, + "multi": false, + "name": "interval", + "options": [ + { + "selected": true, + "text": "4h", + "value": "4h" + } + ], + "query": "4h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Networking / Pod", + "uid": "7a18067ce943a40ae25454675c19ff5c", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus-remote-write.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus-remote-write.yaml new file mode 100644 index 00000000..79a0860c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus-remote-write.yaml @@ -0,0 +1,1674 @@ +{{- /* +Generated from 'prometheus-remote-write' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled .Values.prometheus.prometheusSpec.remoteWriteDashboards }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "prometheus-remote-write" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + prometheus-remote-write.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "refresh": "60s", + "rows": [ + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 2, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "(\n prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} \n- \n ignoring(remote_name, url) group_right(instance) (prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"} != 0)\n)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Highest Timestamp In vs. Highest Timestamp Sent", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 3, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "clamp_min(\n rate(prometheus_remote_storage_highest_timestamp_in_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) \n- \n ignoring (remote_name, url) group_right(instance) rate(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n, 0)\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate[5m]", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Timestamps", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 4, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(\n prometheus_remote_storage_samples_in_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])\n- \n ignoring(remote_name, url) group_right(instance) (rate(prometheus_remote_storage_succeeded_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n- \n (rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]))\n", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate, in vs. succeeded or dropped [5m]", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Samples", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 5, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "minSpan": 6, + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Shards", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 6, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_shards_max{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Max Shards", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 7, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_shards_min{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Min Shards", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 8, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_shards_desired{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Desired Shards", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Shards", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 9, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_shard_capacity{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Shard Capacity", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 10, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_remote_storage_pending_samples{cluster=~\"$cluster\", instance=~\"$instance\"} or prometheus_remote_storage_samples_pending{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Pending Samples", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Shard Details", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 11, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_tsdb_wal_segment_current{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "TSDB Current Segment", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 12, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_wal_watcher_current_segment{cluster=~\"$cluster\", instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}consumer{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Remote Write Current Segment", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Segments", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 13, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_remote_storage_dropped_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_dropped_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Dropped Samples", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 14, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_remote_storage_failed_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Failed Samples", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 15, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_remote_storage_retried_samples_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m]) or rate(prometheus_remote_storage_samples_retried_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Retried Samples", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "fillGradient": 0, + "gridPos": { + + }, + "id": 16, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 3, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_remote_storage_enqueue_retries_total{cluster=~\"$cluster\", instance=~\"$instance\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}cluster{{`}}`}}:{{`{{`}}instance{{`}}`}} {{`{{`}}remote_name{{`}}`}}:{{`{{`}}url{{`}}`}}", + "refId": "A" + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Enqueue Retries", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Misc. Rates", + "titleSize": "h6", + "type": "row" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "prometheus-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": null, + "current": { + "text": { + "selected": true, + "text": "All", + "value": "$__all" + }, + "value": { + "selected": true, + "text": "All", + "value": "$__all" + } + }, + "datasource": "$datasource", + "hide": {{ if .Values.grafana.sidecar.dashboards.multicluster.global.enabled }}0{{ else }}2{{ end }}, + "includeAll": true, + "label": null, + "multi": false, + "name": "cluster", + "options": [ + + ], + "query": "label_values(kube_pod_container_info{image=~\".*prometheus.*\"}, cluster)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + "text": { + "selected": true, + "text": "All", + "value": "$__all" + }, + "value": { + "selected": true, + "text": "All", + "value": "$__all" + } + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": null, + "multi": false, + "name": "instance", + "options": [ + + ], + "query": "label_values(prometheus_build_info{cluster=~\"$cluster\"}, instance)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "current": { + + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": null, + "multi": false, + "name": "url", + "options": [ + + ], + "query": "label_values(prometheus_remote_storage_shards{cluster=~\"$cluster\", instance=~\"$instance\"}, url)", + "refresh": 2, + "regex": "", + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-6h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Prometheus / Remote Write", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus.yaml new file mode 100644 index 00000000..96641531 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/prometheus.yaml @@ -0,0 +1,1235 @@ +{{- /* +Generated from 'prometheus' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ template "project-prometheus-stack.namespace" . }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "prometheus" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + prometheus.json: |- + { + "annotations": { + "list": [ + + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "links": [ + + ], + "refresh": "60s", + "rows": [ + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 1, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": false, + "steppedLine": false, + "styles": [ + { + "alias": "Time", + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "pattern": "Time", + "type": "hidden" + }, + { + "alias": "Count", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #A", + "thresholds": [ + + ], + "type": "hidden", + "unit": "short" + }, + { + "alias": "Uptime", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "Value #B", + "thresholds": [ + + ], + "type": "number", + "unit": "s" + }, + { + "alias": "Instance", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "instance", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Job", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "job", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "Version", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "link": false, + "linkTargetBlank": false, + "linkTooltip": "Drill down", + "linkUrl": "", + "pattern": "version", + "thresholds": [ + + ], + "type": "number", + "unit": "short" + }, + { + "alias": "", + "colorMode": null, + "colors": [ + + ], + "dateFormat": "YYYY-MM-DD HH:mm:ss", + "decimals": 2, + "pattern": "/.*/", + "thresholds": [ + + ], + "type": "string", + "unit": "short" + } + ], + "targets": [ + { + "expr": "count by (job, instance, version) (prometheus_build_info{job=~\"$job\", instance=~\"$instance\"})", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 10 + }, + { + "expr": "max by (job, instance) (time() - process_start_time_seconds{job=~\"$job\", instance=~\"$instance\"})", + "format": "table", + "instant": true, + "intervalFactor": 2, + "legendFormat": "", + "refId": "B", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Prometheus Stats", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "transform": "table", + "type": "table", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Prometheus Stats", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(prometheus_target_sync_length_seconds_sum{job=~\"$job\",instance=~\"$instance\"}[5m])) by (scrape_job) * 1e3", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}scrape_job{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Target Sync", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum(prometheus_sd_discovered_targets{job=~\"$job\",instance=~\"$instance\"})", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "Targets", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Targets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Discovery", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 1, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_target_interval_length_seconds_sum{job=~\"$job\",instance=~\"$instance\"}[5m]) / rate(prometheus_target_interval_length_seconds_count{job=~\"$job\",instance=~\"$instance\"}[5m]) * 1e3", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}interval{{`}}`}} configured", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Scrape Interval Duration", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sum by (job) (rate(prometheus_target_scrapes_exceeded_body_size_limit_total[1m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "exceeded body size limit: {{`{{`}}job{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum by (job) (rate(prometheus_target_scrapes_exceeded_sample_limit_total[1m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "exceeded sample limit: {{`{{`}}job{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum by (job) (rate(prometheus_target_scrapes_sample_duplicate_timestamp_total[1m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "duplicate timestamp: {{`{{`}}job{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum by (job) (rate(prometheus_target_scrapes_sample_out_of_bounds_total[1m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "out of bounds: {{`{{`}}job{{`}}`}}", + "legendLink": null, + "step": 10 + }, + { + "expr": "sum by (job) (rate(prometheus_target_scrapes_sample_out_of_order_total[1m]))", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "out of order: {{`{{`}}job{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Scrape failures", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 6, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 4, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_tsdb_head_samples_appended_total{job=~\"$job\",instance=~\"$instance\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}job{{`}}`}} {{`{{`}}instance{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Appended Samples", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Retrieval", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 7, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_tsdb_head_series{job=~\"$job\",instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}job{{`}}`}} {{`{{`}}instance{{`}}`}} head series", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Head Series", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "prometheus_tsdb_head_chunks{job=~\"$job\",instance=~\"$instance\"}", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}job{{`}}`}} {{`{{`}}instance{{`}}`}} head chunks", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Head Chunks", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Storage", + "titleSize": "h6" + }, + { + "collapse": false, + "height": "250px", + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 9, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "rate(prometheus_engine_query_duration_seconds_count{job=~\"$job\",instance=~\"$instance\",slice=\"inner_eval\"}[5m])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}job{{`}}`}} {{`{{`}}instance{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Query Rate", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 10, + "id": 10, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 0, + "links": [ + + ], + "nullPointMode": "null as zero", + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 6, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "max by (slice) (prometheus_engine_query_duration_seconds{quantile=\"0.9\",job=~\"$job\",instance=~\"$instance\"}) * 1e3", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{`{{`}}slice{{`}}`}}", + "legendLink": null, + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Stage Duration", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Query", + "titleSize": "h6" + } + ], + "schemaVersion": 14, + "style": "dark", + "tags": [ + "prometheus-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".+", + "current": { + "selected": true, + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "job", + "multi": true, + "name": "job", + "options": [ + + ], + "query": "label_values(prometheus_build_info{job=\"prometheus-k8s\",namespace=\"monitoring\"}, job)", + "refresh": 1, + "regex": "", + "sort": 2, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": ".+", + "current": { + "selected": true, + "text": "All", + "value": "$__all" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": true, + "label": "instance", + "multi": true, + "name": "instance", + "options": [ + + ], + "query": "label_values(prometheus_build_info{job=~\"$job\"}, instance)", + "refresh": 1, + "regex": "", + "sort": 2, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Prometheus / Overview", + "uid": "", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/workload-total.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/workload-total.yaml new file mode 100644 index 00000000..ecaea0b9 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/dashboards-1.14/workload-total.yaml @@ -0,0 +1,1412 @@ +{{- /* +Generated from 'workload-total' from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/a8ba97a150c75be42010c75d10b720c55e182f1a/manifests/grafana-dashboardDefinitions.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (or .Values.grafana.enabled .Values.grafana.forceDeployDashboards) (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.grafana.defaultDashboardsEnabled }} +apiVersion: v1 +kind: ConfigMap +metadata: + namespace: {{ .Values.grafana.defaultDashboards.namespace }} + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) "workload-total" | trunc 63 | trimSuffix "-" }} + annotations: +{{ toYaml .Values.grafana.sidecar.dashboards.annotations | indent 4 }} + labels: + {{- if $.Values.grafana.sidecar.dashboards.label }} + {{ $.Values.grafana.sidecar.dashboards.label }}: {{ ternary $.Values.grafana.sidecar.dashboards.labelValue "1" (not (empty $.Values.grafana.sidecar.dashboards.labelValue)) | quote }} + {{- end }} + app: {{ template "project-prometheus-stack.name" $ }}-grafana +{{ include "project-prometheus-stack.labels" $ | indent 4 }} +data: + workload-total.json: |- + { + "__inputs": [ + + ], + "__requires": [ + + ], + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "hideControls": false, + "id": null, + "links": [ + + ], + "panels": [ + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 0 + }, + "id": 2, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Current Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "id": 3, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} pod {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 1 + }, + "id": 4, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} pod {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Current Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 10 + }, + "id": 5, + "panels": [ + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 11 + }, + "id": 6, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} pod {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Received", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": true, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 11 + }, + "id": 7, + "legend": { + "alignAsTable": true, + "avg": false, + "current": true, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": true, + "show": true, + "sideWidth": null, + "sort": "current", + "sortDesc": true, + "total": false, + "values": true + }, + "lines": false, + "linewidth": 1, + "links": [ + + ], + "minSpan": 24, + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 24, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(avg(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}} pod {{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Average Rate of Bytes Transmitted", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "series", + "name": null, + "show": false, + "values": [ + "current" + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Average Bandwidth", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": false, + "collapsed": false, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 11 + }, + "id": 8, + "panels": [ + + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Bandwidth HIstory", + "titleSize": "h6", + "type": "row" + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 12 + }, + "id": 9, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Receive Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 12 + }, + "id": 10, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_bytes_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Transmit Bandwidth", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "Bps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 21 + }, + "id": 11, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 22 + }, + "id": 12, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 22 + }, + "id": 13, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Packets", + "titleSize": "h6", + "type": "row" + }, + { + "collapse": true, + "collapsed": true, + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 22 + }, + "id": 14, + "panels": [ + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 23 + }, + "id": 15, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_receive_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Received Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + }, + { + "aliasColors": { + + }, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fill": 2, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 23 + }, + "id": 16, + "legend": { + "alignAsTable": false, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "sideWidth": null, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [ + + ], + "minSpan": 12, + "nullPointMode": "connected", + "paceLength": 10, + "percentage": false, + "pointradius": 5, + "points": false, + "renderer": "flot", + "repeat": null, + "seriesOverrides": [ + + ], + "spaceLength": 10, + "span": 12, + "stack": true, + "steppedLine": false, + "targets": [ + { + "expr": "sort_desc(sum(irate(container_network_transmit_packets_dropped_total{namespace=~\"$namespace\"}[$interval:$resolution])\n* on (namespace,pod)\ngroup_left(workload,workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\", workload_type=\"$type\"}) by (pod))\n", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{`{{`}}pod{{`}}`}}", + "refId": "A", + "step": 10 + } + ], + "thresholds": [ + + ], + "timeFrom": null, + "timeShift": null, + "title": "Rate of Transmitted Packets Dropped", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [ + + ] + }, + "yaxes": [ + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + }, + { + "format": "pps", + "label": null, + "logBase": 1, + "max": null, + "min": 0, + "show": true + } + ] + } + ], + "repeat": null, + "repeatIteration": null, + "repeatRowId": null, + "showTitle": true, + "title": "Errors", + "titleSize": "h6", + "type": "row" + } + ], + "refresh": "10s", + "rows": [ + + ], + "schemaVersion": 18, + "style": "dark", + "tags": [ + "kubernetes-mixin" + ], + "templating": { + "list": [ + { + "current": { + "text": "Prometheus", + "value": "Prometheus" + }, + "hide": 0, + "label": "Data Source", + "name": "datasource", + "options": [ + + ], + "query": "prometheus", + "refresh": 1, + "regex": "", + "type": "datasource" + }, + { + "allValue": ".+", + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "kube-system", + "value": "kube-system" + }, + "datasource": "$datasource", + "definition": "label_values(container_network_receive_packets_total, namespace)", + "hide": 0, + "includeAll": true, + "label": null, + "multi": false, + "name": "namespace", + "options": [ + + ], + "query": "label_values(container_network_receive_packets_total, namespace)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "", + "value": "" + }, + "datasource": "$datasource", + "definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\"}, workload)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "workload", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\"}, workload)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "deployment", + "value": "deployment" + }, + "datasource": "$datasource", + "definition": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\"}, workload_type)", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "type", + "options": [ + + ], + "query": "label_values(namespace_workload_pod:kube_pod_owner:relabel{namespace=~\"$namespace\", workload=~\"$workload\"}, workload_type)", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 0, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "query", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 0, + "includeAll": false, + "label": null, + "multi": false, + "name": "resolution", + "options": [ + { + "selected": false, + "text": "30s", + "value": "30s" + }, + { + "selected": true, + "text": "5m", + "value": "5m" + }, + { + "selected": false, + "text": "1h", + "value": "1h" + } + ], + "query": "30s,5m,1h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + }, + { + "allValue": null, + "auto": false, + "auto_count": 30, + "auto_min": "10s", + "current": { + "text": "5m", + "value": "5m" + }, + "datasource": "$datasource", + "hide": 2, + "includeAll": false, + "label": null, + "multi": false, + "name": "interval", + "options": [ + { + "selected": true, + "text": "4h", + "value": "4h" + } + ], + "query": "4h", + "refresh": 2, + "regex": "", + "skipUrlSync": false, + "sort": 1, + "tagValuesQuery": "", + "tags": [ + + ], + "tagsQuery": "", + "type": "interval", + "useTags": false + } + ] + }, + "time": { + "from": "now-1h", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "{{ .Values.grafana.defaultDashboardsTimezone }}", + "title": "Kubernetes / Networking / Workload", + "uid": "728bf77cc1166d2f3133bf25846876cc", + "version": 0 + } +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/grafana/namespaces.yaml b/charts/rancher-project-monitoring/0.4.2/templates/grafana/namespaces.yaml new file mode 100644 index 00000000..1312e4ad --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/grafana/namespaces.yaml @@ -0,0 +1,13 @@ +{{- if and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled (not .Values.grafana.defaultDashboards.useExistingNamespace) }} +apiVersion: v1 +kind: Namespace +metadata: + name: {{ .Values.grafana.defaultDashboards.namespace }} + labels: + name: {{ .Values.grafana.defaultDashboards.namespace }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} + annotations: +{{- if not .Values.grafana.defaultDashboards.cleanupOnUninstall }} + helm.sh/resource-policy: "keep" +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/_rules.tpl b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/_rules.tpl new file mode 100644 index 00000000..d82cc2d3 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/_rules.tpl @@ -0,0 +1,8 @@ +{{- define "rules.names" }} +rules: + - "alertmanager.rules" + - "general.rules" + - "kubernetes-storage" + - "prometheus" + - "kubernetes-apps" +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrole.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrole.yaml new file mode 100644 index 00000000..521e56ef --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrole.yaml @@ -0,0 +1,30 @@ +{{- if and .Values.prometheus.enabled .Values.global.rbac.create }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +rules: +# This permission are not in the kube-prometheus repo +# they're grabbed from https://github.com/prometheus/prometheus/blob/master/documentation/examples/rbac-setup.yml +- apiGroups: [""] + resources: + - nodes + - nodes/metrics + - services + - endpoints + - pods + verbs: ["get", "list", "watch"] +- apiGroups: + - "networking.k8s.io" + resources: + - ingresses + verbs: ["get", "list", "watch"] +- nonResourceURLs: ["/metrics", "/metrics/cadvisor"] + verbs: ["get"] +{{- if .Values.prometheus.additionalRulesForClusterRole }} +{{ toYaml .Values.prometheus.additionalRulesForClusterRole | indent 0 }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrolebinding.yaml new file mode 100644 index 00000000..18249a0c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/clusterrolebinding.yaml @@ -0,0 +1,18 @@ +{{- if and .Values.prometheus.enabled .Values.global.rbac.create }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus +subjects: + - kind: ServiceAccount + name: {{ template "project-prometheus-stack.prometheus.serviceAccountName" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- end }} + diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/extrasecret.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/extrasecret.yaml new file mode 100644 index 00000000..61664d01 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/extrasecret.yaml @@ -0,0 +1,20 @@ +{{- if .Values.prometheus.extraSecret.data -}} +{{- $secretName := printf "prometheus-%s-extra" (include "project-prometheus-stack.fullname" . ) -}} +apiVersion: v1 +kind: Secret +metadata: + name: {{ default $secretName .Values.prometheus.extraSecret.name }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- if .Values.prometheus.extraSecret.annotations }} + annotations: +{{ toYaml .Values.prometheus.extraSecret.annotations | indent 4 }} +{{- end }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus + app.kubernetes.io/component: prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +data: +{{- range $key, $val := .Values.prometheus.extraSecret.data }} + {{ $key }}: {{ $val | b64enc | quote }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/federate.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/federate.yaml new file mode 100644 index 00000000..29d4ba20 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/federate.yaml @@ -0,0 +1,10 @@ +{{- if and .Values.prometheus.enabled .Values.federate.enabled }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-federate + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: {{ include "project-prometheus-stack.labels" . | nindent 4 }} +data: + federate-scrape-config.yaml: {{ tpl (.Files.Get "files/federate/federate-scrape-config.yaml" ) $ | b64enc | quote }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/ingress.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/ingress.yaml new file mode 100644 index 00000000..387e2b5d --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/ingress.yaml @@ -0,0 +1,77 @@ +{{- if and .Values.prometheus.enabled .Values.prometheus.ingress.enabled -}} + {{- $pathType := .Values.prometheus.ingress.pathType | default "ImplementationSpecific" -}} + {{- $serviceName := printf "%s-%s" (include "project-prometheus-stack.fullname" .) "prometheus" -}} + {{- $servicePort := .Values.prometheus.ingress.servicePort | default .Values.prometheus.service.port -}} + {{- $routePrefix := list .Values.prometheus.prometheusSpec.routePrefix -}} + {{- $paths := .Values.prometheus.ingress.paths | default $routePrefix -}} + {{- $apiIsStable := eq (include "project-prometheus-stack.ingress.isStable" .) "true" -}} + {{- $ingressSupportsPathType := eq (include "project-prometheus-stack.ingress.supportsPathType" .) "true" -}} +apiVersion: {{ include "project-prometheus-stack.ingress.apiVersion" . }} +kind: Ingress +metadata: +{{- if .Values.prometheus.ingress.annotations }} + annotations: +{{ toYaml .Values.prometheus.ingress.annotations | indent 4 }} +{{- end }} + name: {{ $serviceName }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.prometheus.ingress.labels }} +{{ toYaml .Values.prometheus.ingress.labels | indent 4 }} +{{- end }} +spec: + {{- if $apiIsStable }} + {{- if .Values.prometheus.ingress.ingressClassName }} + ingressClassName: {{ .Values.prometheus.ingress.ingressClassName }} + {{- end }} + {{- end }} + rules: + {{- if .Values.prometheus.ingress.hosts }} + {{- range $host := .Values.prometheus.ingress.hosts }} + - host: {{ tpl $host $ }} + http: + paths: + {{- range $p := $paths }} + - path: {{ tpl $p $ }} + {{- if and $pathType $ingressSupportsPathType }} + pathType: {{ $pathType }} + {{- end }} + backend: + {{- if $apiIsStable }} + service: + name: {{ $serviceName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $serviceName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- end -}} + {{- end -}} + {{- else }} + - http: + paths: + {{- range $p := $paths }} + - path: {{ tpl $p $ }} + {{- if and $pathType $ingressSupportsPathType }} + pathType: {{ $pathType }} + {{- end }} + backend: + {{- if $apiIsStable }} + service: + name: {{ $serviceName }} + port: + number: {{ $servicePort }} + {{- else }} + serviceName: {{ $serviceName }} + servicePort: {{ $servicePort }} + {{- end }} + {{- end -}} + {{- end -}} + {{- if .Values.prometheus.ingress.tls }} + tls: +{{ tpl (toYaml .Values.prometheus.ingress.tls | indent 4) . }} + {{- end -}} +{{- end -}} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/nginx-config.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/nginx-config.yaml new file mode 100644 index 00000000..c28edc07 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/nginx-config.yaml @@ -0,0 +1,68 @@ +apiVersion: v1 +kind: ConfigMap +metadata: + name: prometheus-nginx-proxy-config + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.prometheus.annotations }} + annotations: +{{ toYaml .Values.prometheus.annotations | indent 4 }} +{{- end }} +data: + nginx.conf: |- + worker_processes auto; + error_log /dev/stdout warn; + pid /var/cache/nginx/nginx.pid; + + events { + worker_connections 1024; + } + + http { + include /etc/nginx/mime.types; + log_format main '[$time_local - $status] $remote_addr - $remote_user $request ($http_referer)'; + + proxy_connect_timeout 10; + proxy_read_timeout 180; + proxy_send_timeout 5; + proxy_buffering off; + proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=my_zone:100m inactive=1d max_size=10g; + + server { + listen 8081; + access_log off; + + gzip on; + gzip_min_length 1k; + gzip_comp_level 2; + gzip_types text/plain application/javascript application/x-javascript text/css application/xml text/javascript image/jpeg image/gif image/png; + gzip_vary on; + gzip_disable "MSIE [1-6]\."; + + proxy_set_header Host $host; + + location / { + proxy_cache my_zone; + proxy_cache_valid 200 302 1d; + proxy_cache_valid 301 30d; + proxy_cache_valid any 5m; + proxy_cache_bypass $http_cache_control; + add_header X-Proxy-Cache $upstream_cache_status; + add_header Cache-Control "public"; + + proxy_pass http://localhost:9090/; + + sub_filter_once off; + sub_filter 'var PATH_PREFIX = "";' 'var PATH_PREFIX = ".";'; + + if ($request_filename ~ .*\.(?:js|css|jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm)$) { + expires 90d; + } + + rewrite ^/k8s/clusters/.*/proxy(.*) /$1 break; + + } + } + } diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/podDisruptionBudget.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/podDisruptionBudget.yaml new file mode 100644 index 00000000..620d033d --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/podDisruptionBudget.yaml @@ -0,0 +1,25 @@ +{{- if and .Values.prometheus.enabled .Values.prometheus.podDisruptionBudget.enabled }} +apiVersion: {{ include "project-prometheus-stack.pdb.apiVersion" . }} +kind: PodDisruptionBudget +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +spec: + {{- if .Values.prometheus.podDisruptionBudget.minAvailable }} + minAvailable: {{ .Values.prometheus.podDisruptionBudget.minAvailable }} + {{- end }} + {{- if .Values.prometheus.podDisruptionBudget.maxUnavailable }} + maxUnavailable: {{ .Values.prometheus.podDisruptionBudget.maxUnavailable }} + {{- end }} + selector: + matchLabels: + {{- if .Values.prometheus.agentMode }} + app.kubernetes.io/name: prometheus-agent + {{- else }} + app.kubernetes.io/name: prometheus + {{- end }} + operator.prometheus.io/name: {{ template "project-prometheus-stack.prometheus.crname" . }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/prometheus.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/prometheus.yaml new file mode 100644 index 00000000..3d57ea95 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/prometheus.yaml @@ -0,0 +1,455 @@ +{{- if .Values.prometheus.enabled }} +{{- if .Values.prometheus.agentMode }} +apiVersion: monitoring.coreos.com/v1alpha1 +kind: PrometheusAgent +{{- else }} +apiVersion: monitoring.coreos.com/v1 +kind: Prometheus +{{- end }} +metadata: + name: {{ template "project-prometheus-stack.prometheus.crname" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.prometheus.annotations }} + annotations: +{{ toYaml .Values.prometheus.annotations | indent 4 }} +{{- end }} +spec: +{{- if and (not .Values.prometheus.agentMode) (or .Values.prometheus.prometheusSpec.alertingEndpoints .Values.alertmanager.enabled) }} + alerting: + alertmanagers: +{{- if .Values.prometheus.prometheusSpec.alertingEndpoints }} +{{ toYaml .Values.prometheus.prometheusSpec.alertingEndpoints | indent 6 }} +{{- else if .Values.alertmanager.enabled }} + - namespace: {{ template "project-prometheus-stack.namespace" . }} + name: {{ template "project-prometheus-stack.fullname" . }}-alertmanager + port: {{ .Values.alertmanager.alertmanagerSpec.portName }} + {{- if .Values.alertmanager.alertmanagerSpec.routePrefix }} + pathPrefix: "{{ .Values.alertmanager.alertmanagerSpec.routePrefix }}" + {{- end }} + {{- if .Values.alertmanager.alertmanagerSpec.scheme }} + scheme: {{ .Values.alertmanager.alertmanagerSpec.scheme }} + {{- end }} + {{- if .Values.alertmanager.alertmanagerSpec.tlsConfig }} + tlsConfig: +{{ toYaml .Values.alertmanager.alertmanagerSpec.tlsConfig | indent 10 }} + {{- end }} + apiVersion: {{ .Values.alertmanager.apiVersion }} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.apiserverConfig }} + apiserverConfig: +{{ toYaml .Values.prometheus.prometheusSpec.apiserverConfig | indent 4}} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.image }} + {{- $registry := include "monitoring_registry" . | default .Values.prometheus.prometheusSpec.image.registry -}} + {{- if and .Values.prometheus.prometheusSpec.image.tag .Values.prometheus.prometheusSpec.image.sha }} + image: "{{ $registry }}/{{ .Values.prometheus.prometheusSpec.image.repository }}:{{ .Values.prometheus.prometheusSpec.image.tag }}@sha256:{{ .Values.prometheus.prometheusSpec.image.sha }}" + {{- else if .Values.prometheus.prometheusSpec.image.sha }} + image: "{{ $registry }}/{{ .Values.prometheus.prometheusSpec.image.repository }}@sha256:{{ .Values.prometheus.prometheusSpec.image.sha }}" + {{- else if .Values.prometheus.prometheusSpec.image.tag }} + image: "{{ $registry }}/{{ .Values.prometheus.prometheusSpec.image.repository }}:{{ .Values.prometheus.prometheusSpec.image.tag }}" + {{- else }} + image: "{{ $registry }}/{{ .Values.prometheus.prometheusSpec.image.repository }}" + {{- end }} + version: {{ default .Values.prometheus.prometheusSpec.image.tag .Values.prometheus.prometheusSpec.version }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalArgs }} + additionalArgs: +{{ toYaml .Values.prometheus.prometheusSpec.additionalArgs | indent 4}} +{{- end -}} +{{- if .Values.prometheus.prometheusSpec.externalLabels }} + externalLabels: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.externalLabels | indent 4) . }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.prometheusExternalLabelNameClear }} + prometheusExternalLabelName: "" +{{- else if .Values.prometheus.prometheusSpec.prometheusExternalLabelName }} + prometheusExternalLabelName: "{{ .Values.prometheus.prometheusSpec.prometheusExternalLabelName }}" +{{- end }} +{{- if .Values.prometheus.prometheusSpec.replicaExternalLabelNameClear }} + replicaExternalLabelName: "" +{{- else if .Values.prometheus.prometheusSpec.replicaExternalLabelName }} + replicaExternalLabelName: "{{ .Values.prometheus.prometheusSpec.replicaExternalLabelName }}" +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enableRemoteWriteReceiver }} + enableRemoteWriteReceiver: {{ .Values.prometheus.prometheusSpec.enableRemoteWriteReceiver }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.externalUrl }} + externalUrl: "{{ tpl .Values.prometheus.prometheusSpec.externalUrl . }}" +{{- else if and .Values.prometheus.ingress.enabled .Values.prometheus.ingress.hosts }} + externalUrl: "http://{{ tpl (index .Values.prometheus.ingress.hosts 0) . }}{{ .Values.prometheus.prometheusSpec.routePrefix }}" +{{- else if not (or (kindIs "invalid" .Values.global.cattle.url) (kindIs "invalid" .Values.global.cattle.clusterId)) }} + externalUrl: "{{ .Values.global.cattle.url }}/k8s/clusters/{{ .Values.global.cattle.clusterId }}/api/v1/namespaces/{{ template "project-prometheus-stack.namespace" . }}/services/http:{{ template "project-prometheus-stack.fullname" . }}-prometheus:{{ .Values.prometheus.service.port }}/proxy" +{{- else }} + externalUrl: http://{{ template "project-prometheus-stack.fullname" . }}-prometheus.{{ template "project-prometheus-stack.namespace" . }}:{{ .Values.prometheus.service.port }} +{{- end }} + nodeSelector: {{ include "linux-node-selector" . | nindent 4 }} +{{- if .Values.prometheus.prometheusSpec.nodeSelector }} +{{ toYaml .Values.prometheus.prometheusSpec.nodeSelector | indent 4 }} +{{- end }} + paused: {{ .Values.prometheus.prometheusSpec.paused }} + replicas: {{ .Values.prometheus.prometheusSpec.replicas }} + shards: {{ .Values.prometheus.prometheusSpec.shards }} + logLevel: {{ .Values.prometheus.prometheusSpec.logLevel }} + logFormat: {{ .Values.prometheus.prometheusSpec.logFormat }} + listenLocal: {{ .Values.prometheus.prometheusSpec.listenLocal }} +{{- if not .Values.prometheus.agentMode }} + enableAdminAPI: {{ .Values.prometheus.prometheusSpec.enableAdminAPI }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.web }} + web: +{{ toYaml .Values.prometheus.prometheusSpec.web | indent 4 }} +{{- end }} +{{- if and (not .Values.prometheus.agentMode) .Values.prometheus.prometheusSpec.exemplars }} + exemplars: + {{ toYaml .Values.prometheus.prometheusSpec.exemplars | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enableFeatures }} + enableFeatures: +{{- range $enableFeatures := .Values.prometheus.prometheusSpec.enableFeatures }} + - {{ tpl $enableFeatures $ }} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.scrapeInterval }} + scrapeInterval: {{ .Values.prometheus.prometheusSpec.scrapeInterval }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.scrapeTimeout }} + scrapeTimeout: {{ .Values.prometheus.prometheusSpec.scrapeTimeout }} +{{- end }} +{{- if and (not .Values.prometheus.agentMode) .Values.prometheus.prometheusSpec.evaluationInterval }} + evaluationInterval: {{ .Values.prometheus.prometheusSpec.evaluationInterval }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.resources }} + resources: +{{ toYaml .Values.prometheus.prometheusSpec.resources | indent 4 }} +{{- end }} +{{- if not .Values.prometheus.agentMode }} + retention: {{ .Values.prometheus.prometheusSpec.retention | quote }} +{{- if .Values.prometheus.prometheusSpec.retentionSize }} + retentionSize: {{ .Values.prometheus.prometheusSpec.retentionSize | quote }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.tsdb }} + tsdb: + {{- if .Values.prometheus.prometheusSpec.tsdb.outOfOrderTimeWindow }} + outOfOrderTimeWindow: {{ .Values.prometheus.prometheusSpec.tsdb.outOfOrderTimeWindow }} + {{- end }} +{{- end }} +{{- end }} +{{- if eq .Values.prometheus.prometheusSpec.walCompression false }} + walCompression: false +{{ else }} + walCompression: true +{{- end }} +{{- if .Values.prometheus.prometheusSpec.routePrefix }} + routePrefix: {{ .Values.prometheus.prometheusSpec.routePrefix | quote }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.secrets }} + secrets: +{{ toYaml .Values.prometheus.prometheusSpec.secrets | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.configMaps }} + configMaps: +{{ toYaml .Values.prometheus.prometheusSpec.configMaps | indent 4 }} +{{- end }} + serviceAccountName: {{ template "project-prometheus-stack.prometheus.serviceAccountName" . }} +{{- if .Values.prometheus.prometheusSpec.serviceMonitorSelector }} + serviceMonitorSelector: +{{ toYaml .Values.prometheus.prometheusSpec.serviceMonitorSelector | indent 4 }} +{{ else if .Values.prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues }} + serviceMonitorSelector: + matchLabels: + release: {{ $.Release.Name | quote }} +{{ else }} + serviceMonitorSelector: {} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.serviceMonitorNamespaceSelector }} + serviceMonitorNamespaceSelector: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.serviceMonitorNamespaceSelector | indent 4) . }} +{{ else }} + serviceMonitorNamespaceSelector: {} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.podMonitorSelector }} + podMonitorSelector: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.podMonitorSelector | indent 4) . }} +{{ else if .Values.prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues }} + podMonitorSelector: + matchLabels: + release: {{ $.Release.Name | quote }} +{{ else }} + podMonitorSelector: {} +{{- end }} + podMonitorNamespaceSelector: {{ .Values.global.cattle.projectNamespaceSelector | toYaml | nindent 4 }} +{{- if .Values.prometheus.prometheusSpec.probeSelector }} + probeSelector: +{{ toYaml .Values.prometheus.prometheusSpec.probeSelector | indent 4 }} +{{ else if .Values.prometheus.prometheusSpec.probeSelectorNilUsesHelmValues }} + probeSelector: + matchLabels: + release: {{ $.Release.Name | quote }} +{{ else }} + probeSelector: {} +{{- end }} + probeNamespaceSelector: {{ .Values.global.cattle.projectNamespaceSelector | toYaml | nindent 4 }} +{{- if and (not .Values.prometheus.agentMode) (or .Values.prometheus.prometheusSpec.remoteRead .Values.prometheus.prometheusSpec.additionalRemoteRead) }} + remoteRead: +{{- if .Values.prometheus.prometheusSpec.remoteRead }} +{{ tpl (toYaml .Values.prometheus.prometheusSpec.remoteRead | indent 4) . }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalRemoteRead }} +{{ toYaml .Values.prometheus.prometheusSpec.additionalRemoteRead | indent 4 }} +{{- end }} +{{- end }} +{{- if (or .Values.prometheus.prometheusSpec.remoteWrite .Values.prometheus.prometheusSpec.additionalRemoteWrite) }} + remoteWrite: +{{- if .Values.prometheus.prometheusSpec.remoteWrite }} +{{ tpl (toYaml .Values.prometheus.prometheusSpec.remoteWrite | indent 4) . }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalRemoteWrite }} +{{ toYaml .Values.prometheus.prometheusSpec.additionalRemoteWrite | indent 4 }} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.securityContext }} + securityContext: +{{ toYaml .Values.prometheus.prometheusSpec.securityContext | indent 4 }} +{{- end }} + ruleNamespaceSelector: {{ .Values.global.cattle.projectNamespaceSelector | toYaml | nindent 4 }} +{{- if not .Values.prometheus.agentMode }} +{{- if .Values.prometheus.prometheusSpec.ruleSelector }} + ruleSelector: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.ruleSelector | indent 4) . }} +{{- else if .Values.prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues }} + ruleSelector: + matchLabels: + release: {{ $.Release.Name | quote }} +{{ else }} + ruleSelector: {} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.scrapeConfigSelector }} + scrapeConfigSelector: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector | indent 4) . }} +{{ else if .Values.prometheus.prometheusSpec.scrapeConfigSelectorNilUsesHelmValues }} + scrapeConfigSelector: + matchLabels: + release: {{ $.Release.Name | quote }} +{{ else }} + scrapeConfigSelector: {} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.scrapeConfigNamespaceSelector }} + scrapeConfigNamespaceSelector: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigNamespaceSelector | indent 4) . }} +{{ else }} + scrapeConfigNamespaceSelector: {} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.storageSpec }} + storage: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.storageSpec | indent 4) . }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.podMetadata }} + podMetadata: +{{ tpl (toYaml .Values.prometheus.prometheusSpec.podMetadata | indent 4) . }} +{{- end }} +{{- if and (not .Values.prometheus.agentMode) .Values.prometheus.prometheusSpec.query }} + query: +{{ toYaml .Values.prometheus.prometheusSpec.query | indent 4}} +{{- end }} +{{- if or .Values.prometheus.prometheusSpec.podAntiAffinity .Values.prometheus.prometheusSpec.affinity }} + affinity: +{{- if .Values.prometheus.prometheusSpec.affinity }} +{{ toYaml .Values.prometheus.prometheusSpec.affinity | indent 4 }} +{{- end }} +{{- if eq .Values.prometheus.prometheusSpec.podAntiAffinity "hard" }} + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - topologyKey: {{ .Values.prometheus.prometheusSpec.podAntiAffinityTopologyKey }} + labelSelector: + matchExpressions: + - {key: app.kubernetes.io/name, operator: In, values: [prometheus]} + - {key: prometheus, operator: In, values: [{{ template "kube-prometheus-stack.prometheus.crname" . }}]} +{{- else if eq .Values.prometheus.prometheusSpec.podAntiAffinity "soft" }} + podAntiAffinity: + preferredDuringSchedulingIgnoredDuringExecution: + - weight: 100 + podAffinityTerm: + topologyKey: {{ .Values.prometheus.prometheusSpec.podAntiAffinityTopologyKey }} + labelSelector: + matchExpressions: + - {key: app.kubernetes.io/name, operator: In, values: [prometheus]} + - {key: prometheus, operator: In, values: [{{ template "kube-prometheus-stack.prometheus.crname" . }}]} +{{- end }} +{{- end }} + tolerations: {{ include "linux-node-tolerations" . | nindent 4 }} +{{- if .Values.prometheus.prometheusSpec.tolerations }} +{{ toYaml .Values.prometheus.prometheusSpec.tolerations | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.topologySpreadConstraints }} + topologySpreadConstraints: +{{ toYaml .Values.prometheus.prometheusSpec.topologySpreadConstraints | indent 4 }} +{{- end }} +{{- if .Values.global.imagePullSecrets }} + imagePullSecrets: +{{ include "kube-prometheus-stack.imagePullSecrets" . | trim | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalScrapeConfigs }} + additionalScrapeConfigs: + name: {{ template "kube-prometheus-stack.fullname" . }}-prometheus-scrape-confg + key: additional-scrape-configs.yaml +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalScrapeConfigsSecret.enabled }} + additionalScrapeConfigs: + name: {{ .Values.prometheus.prometheusSpec.additionalScrapeConfigsSecret.name }} + key: {{ .Values.prometheus.prometheusSpec.additionalScrapeConfigsSecret.key }} +{{- end }} +{{- if not .Values.prometheus.agentMode }} +{{- if or .Values.prometheus.prometheusSpec.additionalAlertManagerConfigs .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret }} + additionalAlertManagerConfigs: +{{- if .Values.prometheus.prometheusSpec.additionalAlertManagerConfigs }} + name: {{ template "kube-prometheus-stack.fullname" . }}-prometheus-am-confg + key: additional-alertmanager-configs.yaml +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret }} + name: {{ .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret.name }} + key: {{ .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret.key }} + {{- if hasKey .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret "optional" }} + optional: {{ .Values.prometheus.prometheusSpec.additionalAlertManagerConfigsSecret.optional }} + {{- end }} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalAlertRelabelConfigs }} + additionalAlertRelabelConfigs: + name: {{ template "kube-prometheus-stack.fullname" . }}-prometheus-am-relabel-confg + key: additional-alert-relabel-configs.yaml +{{- end }} +{{- if .Values.prometheus.prometheusSpec.additionalAlertRelabelConfigsSecret }} + additionalAlertRelabelConfigs: + name: {{ .Values.prometheus.prometheusSpec.additionalAlertRelabelConfigsSecret.name }} + key: {{ .Values.prometheus.prometheusSpec.additionalAlertRelabelConfigsSecret.key }} +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.containers }} + containers: +{{ tpl .Values.prometheus.prometheusSpec.containers $ | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.initContainers }} + initContainers: +{{ toYaml .Values.prometheus.prometheusSpec.initContainers | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.priorityClassName }} + priorityClassName: {{ .Values.prometheus.prometheusSpec.priorityClassName }} +{{- end }} +{{- if not .Values.prometheus.agentMode }} +{{- if .Values.prometheus.prometheusSpec.thanos }} + thanos: +{{- with (omit .Values.prometheus.prometheusSpec.thanos "objectStorageConfig")}} +{{ toYaml . | indent 4 }} +{{- end }} +{{- if ((.Values.prometheus.prometheusSpec.thanos.objectStorageConfig).existingSecret) }} + objectStorageConfig: + key: "{{.Values.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.key }}" + name: "{{.Values.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret.name }}" +{{- else if ((.Values.prometheus.prometheusSpec.thanos.objectStorageConfig).secret) }} + objectStorageConfig: + key: object-storage-configs.yaml + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus +{{- end }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.disableCompaction }} + disableCompaction: {{ .Values.prometheus.prometheusSpec.disableCompaction }} +{{- end }} +{{- end }} + portName: {{ .Values.prometheus.prometheusSpec.portName }} +{{- if .Values.prometheus.prometheusSpec.volumes }} + volumes: +{{ toYaml .Values.prometheus.prometheusSpec.volumes | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.volumeMounts }} + volumeMounts: +{{ toYaml .Values.prometheus.prometheusSpec.volumeMounts | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.arbitraryFSAccessThroughSMs }} + arbitraryFSAccessThroughSMs: +{{ toYaml .Values.prometheus.prometheusSpec.arbitraryFSAccessThroughSMs | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.overrideHonorLabels }} + overrideHonorLabels: {{ .Values.prometheus.prometheusSpec.overrideHonorLabels }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.overrideHonorTimestamps }} + overrideHonorTimestamps: {{ .Values.prometheus.prometheusSpec.overrideHonorTimestamps }} +{{- end }} + ignoreNamespaceSelectors: true # always hard-coded to true for security reasons +{{- if .Values.prometheus.prometheusSpec.enforcedNamespaceLabel }} + enforcedNamespaceLabel: {{ .Values.prometheus.prometheusSpec.enforcedNamespaceLabel }} +{{- $prometheusDefaultRulesExcludedFromEnforce := (include "rules.names" .) | fromYaml }} +{{- if not .Values.prometheus.agentMode }} + prometheusRulesExcludedFromEnforce: +{{- range $prometheusDefaultRulesExcludedFromEnforce.rules }} + - ruleNamespace: "{{ template "project-prometheus-stack.namespace" $ }}" + ruleName: "{{ printf "%s-%s" (include "project-prometheus-stack.fullname" $) . | trunc 63 | trimSuffix "-" }}" +{{- end }} +{{- if .Values.prometheus.prometheusSpec.prometheusRulesExcludedFromEnforce }} +{{ toYaml .Values.prometheus.prometheusSpec.prometheusRulesExcludedFromEnforce | indent 4 }} +{{- end }} +{{- end }} + excludedFromEnforcement: +{{- range $prometheusDefaultRulesExcludedFromEnforce.rules }} + - group: monitoring.coreos.com + resource: prometheusrules + namespace: "{{ template "kube-prometheus-stack.namespace" $ }}" + name: "{{ printf "%s-%s" (include "kube-prometheus-stack.fullname" $) . | trunc 63 | trimSuffix "-" }}" +{{- end }} +{{- if .Values.prometheus.prometheusSpec.excludedFromEnforcement }} +{{ tpl (toYaml .Values.prometheus.prometheusSpec.excludedFromEnforcement | indent 4) . }} +{{- end }} +{{- end }} +{{- if and (not .Values.prometheus.agentMode) .Values.prometheus.prometheusSpec.queryLogFile }} + queryLogFile: {{ .Values.prometheus.prometheusSpec.queryLogFile }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.sampleLimit }} + sampleLimit: {{ .Values.prometheus.prometheusSpec.sampleLimit }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedKeepDroppedTargets }} + enforcedKeepDroppedTargets: {{ .Values.prometheus.prometheusSpec.enforcedKeepDroppedTargets }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedSampleLimit }} + enforcedSampleLimit: {{ .Values.prometheus.prometheusSpec.enforcedSampleLimit }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedTargetLimit }} + enforcedTargetLimit: {{ .Values.prometheus.prometheusSpec.enforcedTargetLimit }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedLabelLimit }} + enforcedLabelLimit: {{ .Values.prometheus.prometheusSpec.enforcedLabelLimit }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedLabelNameLengthLimit }} + enforcedLabelNameLengthLimit: {{ .Values.prometheus.prometheusSpec.enforcedLabelNameLengthLimit }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.enforcedLabelValueLengthLimit}} + enforcedLabelValueLengthLimit: {{ .Values.prometheus.prometheusSpec.enforcedLabelValueLengthLimit }} +{{- end }} +{{- if and (not .Values.prometheus.agentMode) .Values.prometheus.prometheusSpec.allowOverlappingBlocks }} + allowOverlappingBlocks: {{ .Values.prometheus.prometheusSpec.allowOverlappingBlocks }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.minReadySeconds }} + minReadySeconds: {{ .Values.prometheus.prometheusSpec.minReadySeconds }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.maximumStartupDurationSeconds }} + maximumStartupDurationSeconds: {{ .Values.prometheus.prometheusSpec.maximumStartupDurationSeconds }} +{{- end }} + hostNetwork: {{ .Values.prometheus.prometheusSpec.hostNetwork }} +{{- if .Values.prometheus.prometheusSpec.hostAliases }} + hostAliases: +{{ toYaml .Values.prometheus.prometheusSpec.hostAliases | indent 4 }} +{{- end }} +{{- if .Values.prometheus.prometheusSpec.tracingConfig }} + tracingConfig: +{{ toYaml .Values.prometheus.prometheusSpec.tracingConfig | indent 4 }} +{{- end }} +{{- with .Values.prometheus.prometheusSpec.additionalConfig }} + {{- tpl (toYaml .) $ | nindent 2 }} +{{- end }} +{{- with .Values.prometheus.prometheusSpec.additionalConfigString }} + {{- tpl . $ | nindent 2 }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrole.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrole.yaml new file mode 100644 index 00000000..c796248a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrole.yaml @@ -0,0 +1,22 @@ +{{- if and .Values.prometheus.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +kind: ClusterRole +apiVersion: rbac.authorization.k8s.io/v1 +metadata: + name: {{ template "kube-prometheus-stack.fullname" . }}-prometheus-psp + labels: + app: {{ template "kube-prometheus-stack.name" . }}-prometheus +{{ include "kube-prometheus-stack.labels" . | indent 4 }} +rules: +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if semverCompare "> 1.15.0-0" $kubeTargetVersion }} +- apiGroups: ['policy'] +{{- else }} +- apiGroups: ['extensions'] +{{- end }} + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: + - {{ template "kube-prometheus-stack.fullname" . }}-prometheus +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrolebinding.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrolebinding.yaml new file mode 100644 index 00000000..cf2ea901 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp-clusterrolebinding.yaml @@ -0,0 +1,19 @@ +{{- if and .Values.prometheus.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus-psp + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus-psp +subjects: + - kind: ServiceAccount + name: {{ template "project-prometheus-stack.prometheus.serviceAccountName" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp.yaml new file mode 100644 index 00000000..a53bb0e8 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/psp.yaml @@ -0,0 +1,58 @@ +{{- if and .Values.prometheus.enabled (or .Values.global.cattle.psp.enabled (and .Values.global.rbac.create .Values.global.rbac.pspEnabled)) }} +{{- if .Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy" }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{- if .Values.global.rbac.pspAnnotations }} + annotations: +{{ toYaml .Values.global.rbac.pspAnnotations | indent 4 }} +{{- end }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +spec: + privileged: false + # Allow core volume types. + volumes: + - 'configMap' + - 'emptyDir' + - 'projected' + - 'secret' + - 'downwardAPI' + - 'persistentVolumeClaim' +{{- if .Values.prometheus.podSecurityPolicy.volumes }} +{{ toYaml .Values.prometheus.podSecurityPolicy.volumes | indent 4 }} +{{- end }} + hostNetwork: false + hostIPC: false + hostPID: false + runAsUser: + # Permits the container to run with root privileges as well. + rule: 'RunAsAny' + seLinux: + # This policy assumes the nodes are using AppArmor rather than SELinux. + rule: 'RunAsAny' + supplementalGroups: + rule: 'MustRunAs' + ranges: + # Allow adding the root group. + - min: 0 + max: 65535 + fsGroup: + rule: 'MustRunAs' + ranges: + # Allow adding the root group. + - min: 0 + max: 65535 + readOnlyRootFilesystem: false +{{- if .Values.prometheus.podSecurityPolicy.allowedCapabilities }} + allowedCapabilities: +{{ toYaml .Values.prometheus.podSecurityPolicy.allowedCapabilities | indent 4 }} +{{- end }} +{{- if .Values.prometheus.podSecurityPolicy.allowedHostPaths }} + allowedHostPaths: +{{ toYaml .Values.prometheus.podSecurityPolicy.allowedHostPaths | indent 4 }} +{{- end }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/alertmanager.rules.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/alertmanager.rules.yaml new file mode 100644 index 00000000..2fdb9a77 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/alertmanager.rules.yaml @@ -0,0 +1,305 @@ +{{- /* +Generated from 'alertmanager.rules' group from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/alertmanager-prometheusRule.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.alertmanager }} +{{- $alertmanagerJob := printf "%s-%s" (include "project-prometheus-stack.fullname" .) "alertmanager" }} +{{- $namespace := printf "%s" (include "project-prometheus-stack.namespace" .) }} +{{- if and .Values.alertmanager.enabled .Values.alertmanager.serviceMonitor.selfMonitor }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" .) "alertmanager.rules" | trunc 63 | trimSuffix "-" }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.defaultRules.labels }} +{{ toYaml .Values.defaultRules.labels | indent 4 }} +{{- end }} +{{- if .Values.defaultRules.annotations }} + annotations: +{{ toYaml .Values.defaultRules.annotations | indent 4 }} +{{- end }} +spec: + groups: + - name: alertmanager.rules + rules: +{{- if not (.Values.defaultRules.disabled.AlertmanagerFailedReload | default false) }} + - alert: AlertmanagerFailedReload + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: Configuration has failed to load for {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod{{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerfailedreload + summary: Reloading an Alertmanager configuration has failed. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + max_over_time(alertmanager_config_last_reload_successful{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m]) == 0 + for: {{ dig "AlertmanagerFailedReload" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerFailedReload" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerMembersInconsistent | default false) }} + - alert: AlertmanagerMembersInconsistent + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: Alertmanager {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod{{`}}`}} has only found {{`{{`}} $value {{`}}`}} members of the {{`{{`}}$labels.job{{`}}`}} cluster. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagermembersinconsistent + summary: A member of an Alertmanager cluster has not found all other cluster members. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + max_over_time(alertmanager_cluster_members{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m]) + < on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) group_left + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) (max_over_time(alertmanager_cluster_members{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m])) + for: {{ dig "AlertmanagerMembersInconsistent" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerMembersInconsistent" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerFailedToSendAlerts | default false) }} + - alert: AlertmanagerFailedToSendAlerts + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: Alertmanager {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod{{`}}`}} failed to send {{`{{`}} $value | humanizePercentage {{`}}`}} of notifications to {{`{{`}} $labels.integration {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerfailedtosendalerts + summary: An Alertmanager instance failed to send notifications. + expr: |- + ( + rate(alertmanager_notifications_failed_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m]) + / + ignoring (reason) group_left rate(alertmanager_notifications_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m]) + ) + > 0.01 + for: {{ dig "AlertmanagerFailedToSendAlerts" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerFailedToSendAlerts" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerClusterFailedToSendAlerts | default false) }} + - alert: AlertmanagerClusterFailedToSendAlerts + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: The minimum notification failure rate to {{`{{`}} $labels.integration {{`}}`}} sent from any instance in the {{`{{`}}$labels.job{{`}}`}} cluster is {{`{{`}} $value | humanizePercentage {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerclusterfailedtosendalerts + summary: All Alertmanager instances in a cluster failed to send notifications to a critical integration. + expr: |- + min by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service, integration) ( + rate(alertmanager_notifications_failed_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}", integration=~`.*`}[5m]) + / + ignoring (reason) group_left rate(alertmanager_notifications_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}", integration=~`.*`}[5m]) + ) + > 0.01 + for: {{ dig "AlertmanagerClusterFailedToSendAlerts" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerClusterFailedToSendAlerts" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerClusterFailedToSendAlerts | default false) }} + - alert: AlertmanagerClusterFailedToSendAlerts + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: The minimum notification failure rate to {{`{{`}} $labels.integration {{`}}`}} sent from any instance in the {{`{{`}}$labels.job{{`}}`}} cluster is {{`{{`}} $value | humanizePercentage {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerclusterfailedtosendalerts + summary: All Alertmanager instances in a cluster failed to send notifications to a non-critical integration. + expr: |- + min by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service, integration) ( + rate(alertmanager_notifications_failed_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}", integration!~`.*`}[5m]) + / + ignoring (reason) group_left rate(alertmanager_notifications_total{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}", integration!~`.*`}[5m]) + ) + > 0.01 + for: {{ dig "AlertmanagerClusterFailedToSendAlerts" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerClusterFailedToSendAlerts" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerConfigInconsistent | default false) }} + - alert: AlertmanagerConfigInconsistent + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: Alertmanager instances within the {{`{{`}}$labels.job{{`}}`}} cluster have different configurations. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerconfiginconsistent + summary: Alertmanager instances within the same cluster have different configurations. + expr: |- + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service) ( + count_values by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service) ("config_hash", alertmanager_config_hash{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}) + ) + != 1 + for: {{ dig "AlertmanagerConfigInconsistent" "for" "20m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerConfigInconsistent" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerClusterDown | default false) }} + - alert: AlertmanagerClusterDown + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: '{{`{{`}} $value | humanizePercentage {{`}}`}} of Alertmanager instances within the {{`{{`}}$labels.job{{`}}`}} cluster have been up for less than half of the last 5m.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerclusterdown + summary: Half or more of the Alertmanager instances within the same cluster are down. + expr: |- + ( + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) ( + avg_over_time(up{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[5m]) < 0.5 + ) + / + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) ( + up{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"} + ) + ) + >= 0.5 + for: {{ dig "AlertmanagerClusterDown" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerClusterDown" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.AlertmanagerClusterCrashlooping | default false) }} + - alert: AlertmanagerClusterCrashlooping + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.alertmanager | indent 8 }} +{{- end }} + description: '{{`{{`}} $value | humanizePercentage {{`}}`}} of Alertmanager instances within the {{`{{`}}$labels.job{{`}}`}} cluster have restarted at least 5 times in the last 10m.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/alertmanager/alertmanagerclustercrashlooping + summary: Half or more of the Alertmanager instances within the same cluster are crashlooping. + expr: |- + ( + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) ( + changes(process_start_time_seconds{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"}[10m]) > 4 + ) + / + count by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace,service,cluster) ( + up{job="{{ $alertmanagerJob }}",namespace="{{ $namespace }}"} + ) + ) + >= 0.5 + for: {{ dig "AlertmanagerClusterCrashlooping" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "AlertmanagerClusterCrashlooping" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.alertmanager }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/general.rules.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/general.rules.yaml new file mode 100644 index 00000000..22d6a22e --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/general.rules.yaml @@ -0,0 +1,125 @@ +{{- /* +Generated from 'general.rules' group from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/kubePrometheus-prometheusRule.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.general }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" .) "general.rules" | trunc 63 | trimSuffix "-" }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.defaultRules.labels }} +{{ toYaml .Values.defaultRules.labels | indent 4 }} +{{- end }} +{{- if .Values.defaultRules.annotations }} + annotations: +{{ toYaml .Values.defaultRules.annotations | indent 4 }} +{{- end }} +spec: + groups: + - name: general.rules + rules: +{{- if not (.Values.defaultRules.disabled.TargetDown | default false) }} + - alert: TargetDown + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.general }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.general | indent 8 }} +{{- end }} + description: '{{`{{`}} printf "%.4g" $value {{`}}`}}% of the {{`{{`}} $labels.job {{`}}`}}/{{`{{`}} $labels.service {{`}}`}} targets in {{`{{`}} $labels.namespace {{`}}`}} namespace are down.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/general/targetdown + summary: One or more targets are unreachable. + expr: 100 * (count(up == 0) BY (cluster, job, namespace, service) / count(up) BY (cluster, job, namespace, service)) > 10 + for: {{ dig "TargetDown" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "TargetDown" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.Watchdog | default false) }} + - alert: Watchdog + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.general }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.general | indent 8 }} +{{- end }} + description: 'This is an alert meant to ensure that the entire alerting pipeline is functional. + + This alert is always firing, therefore it should always be firing in Alertmanager + + and always fire against a receiver. There are integrations with various notification + + mechanisms that send a notification when this alert is not firing. For example the + + "DeadMansSnitch" integration in PagerDuty. + + ' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/general/watchdog + summary: An alert that should always be firing to certify that Alertmanager is working properly. + expr: vector(1) + labels: + severity: {{ dig "Watchdog" "severity" "none" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.InfoInhibitor | default false) }} + - alert: InfoInhibitor + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.general }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.general | indent 8 }} +{{- end }} + description: 'This is an alert that is used to inhibit info alerts. + + By themselves, the info-level alerts are sometimes very noisy, but they are relevant when combined with + + other alerts. + + This alert fires whenever there''s a severity="info" alert, and stops firing when another alert with a + + severity of ''warning'' or ''critical'' starts firing on the same namespace. + + This alert should be routed to a null receiver and configured to inhibit alerts with severity="info". + + ' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/general/infoinhibitor + summary: Info-level alert inhibition. + expr: ALERTS{severity = "info"} == 1 unless on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace) ALERTS{alertname != "InfoInhibitor", severity =~ "warning|critical", alertstate="firing"} == 1 + labels: + severity: {{ dig "InfoInhibitor" "severity" "none" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.general }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-apps.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-apps.yaml new file mode 100644 index 00000000..1337b954 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-apps.yaml @@ -0,0 +1,567 @@ +{{- /* +Generated from 'kubernetes-apps' group from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/kubernetesControlPlane-prometheusRule.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.kubernetesApps }} +{{- $targetNamespace := .Values.defaultRules.appNamespacesTarget }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" .) "kubernetes-apps" | trunc 63 | trimSuffix "-" }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.defaultRules.labels }} +{{ toYaml .Values.defaultRules.labels | indent 4 }} +{{- end }} +{{- if .Values.defaultRules.annotations }} + annotations: +{{ toYaml .Values.defaultRules.annotations | indent 4 }} +{{- end }} +spec: + groups: + - name: kubernetes-apps + rules: +{{- if not (.Values.defaultRules.disabled.KubePodCrashLooping | default false) }} + - alert: KubePodCrashLooping + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: 'Pod {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod {{`}}`}} ({{`{{`}} $labels.container {{`}}`}}) is in waiting state (reason: "CrashLoopBackOff").' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepodcrashlooping + summary: Pod is crash looping. + expr: max_over_time(kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff", namespace=~"{{ $targetNamespace }}"}[5m]) >= 1 + for: {{ dig "KubePodCrashLooping" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePodCrashLooping" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubePodNotReady | default false) }} + - alert: KubePodNotReady + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Pod {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.pod {{`}}`}} has been in a non-ready state for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepodnotready + summary: Pod has been in a non-ready state for more than 15 minutes. + expr: |- + sum by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, cluster) ( + max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, cluster) ( + kube_pod_status_phase{namespace=~"{{ $targetNamespace }}", phase=~"Pending|Unknown|Failed"} + ) * on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, cluster) group_left(owner_kind) topk by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, cluster) ( + 1, max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, owner_kind, cluster) (kube_pod_owner{owner_kind!="Job"}) + ) + ) > 0 + for: {{ dig "KubePodNotReady" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePodNotReady" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDeploymentGenerationMismatch | default false) }} + - alert: KubeDeploymentGenerationMismatch + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Deployment generation for {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.deployment {{`}}`}} does not match, this indicates that the Deployment has failed but has not been rolled back. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedeploymentgenerationmismatch + summary: Deployment generation mismatch due to possible roll-back + expr: |- + kube_deployment_status_observed_generation{namespace=~"{{ $targetNamespace }}"} + != + kube_deployment_metadata_generation{namespace=~"{{ $targetNamespace }}"} + for: {{ dig "KubeDeploymentGenerationMismatch" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDeploymentGenerationMismatch" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDeploymentReplicasMismatch | default false) }} + - alert: KubeDeploymentReplicasMismatch + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Deployment {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.deployment {{`}}`}} has not matched the expected number of replicas for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedeploymentreplicasmismatch + summary: Deployment has not matched the expected number of replicas. + expr: |- + ( + kube_deployment_spec_replicas{namespace=~"{{ $targetNamespace }}"} + > + kube_deployment_status_replicas_available{namespace=~"{{ $targetNamespace }}"} + ) and ( + changes(kube_deployment_status_replicas_updated{namespace=~"{{ $targetNamespace }}"}[10m]) + == + 0 + ) + for: {{ dig "KubeDeploymentReplicasMismatch" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDeploymentReplicasMismatch" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDeploymentRolloutStuck | default false) }} + - alert: KubeDeploymentRolloutStuck + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Rollout of deployment {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.deployment {{`}}`}} is not progressing for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedeploymentrolloutstuck + summary: Deployment rollout is not progressing. + expr: |- + kube_deployment_status_condition{condition="Progressing", status="false",namespace=~"{{ $targetNamespace }}"} + != 0 + for: {{ dig "KubeDeploymentRolloutStuck" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDeploymentRolloutStuck" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeStatefulSetReplicasMismatch | default false) }} + - alert: KubeStatefulSetReplicasMismatch + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: StatefulSet {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.statefulset {{`}}`}} has not matched the expected number of replicas for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubestatefulsetreplicasmismatch + summary: StatefulSet has not matched the expected number of replicas. + expr: |- + ( + kube_statefulset_status_replicas_ready{namespace=~"{{ $targetNamespace }}"} + != + kube_statefulset_status_replicas{namespace=~"{{ $targetNamespace }}"} + ) and ( + changes(kube_statefulset_status_replicas_updated{namespace=~"{{ $targetNamespace }}"}[10m]) + == + 0 + ) + for: {{ dig "KubeStatefulSetReplicasMismatch" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeStatefulSetReplicasMismatch" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeStatefulSetGenerationMismatch | default false) }} + - alert: KubeStatefulSetGenerationMismatch + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: StatefulSet generation for {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.statefulset {{`}}`}} does not match, this indicates that the StatefulSet has failed but has not been rolled back. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubestatefulsetgenerationmismatch + summary: StatefulSet generation mismatch due to possible roll-back + expr: |- + kube_statefulset_status_observed_generation{namespace=~"{{ $targetNamespace }}"} + != + kube_statefulset_metadata_generation{namespace=~"{{ $targetNamespace }}"} + for: {{ dig "KubeStatefulSetGenerationMismatch" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeStatefulSetGenerationMismatch" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeStatefulSetUpdateNotRolledOut | default false) }} + - alert: KubeStatefulSetUpdateNotRolledOut + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: StatefulSet {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.statefulset {{`}}`}} update has not been rolled out. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubestatefulsetupdatenotrolledout + summary: StatefulSet update has not been rolled out. + expr: |- + ( + max without (revision) ( + kube_statefulset_status_current_revision{namespace=~"{{ $targetNamespace }}"} + unless + kube_statefulset_status_update_revision{namespace=~"{{ $targetNamespace }}"} + ) + * + ( + kube_statefulset_replicas{namespace=~"{{ $targetNamespace }}"} + != + kube_statefulset_status_replicas_updated{namespace=~"{{ $targetNamespace }}"} + ) + ) and ( + changes(kube_statefulset_status_replicas_updated{namespace=~"{{ $targetNamespace }}"}[5m]) + == + 0 + ) + for: {{ dig "KubeStatefulSetUpdateNotRolledOut" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeStatefulSetUpdateNotRolledOut" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDaemonSetRolloutStuck | default false) }} + - alert: KubeDaemonSetRolloutStuck + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: DaemonSet {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.daemonset {{`}}`}} has not finished or progressed for at least 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedaemonsetrolloutstuck + summary: DaemonSet rollout is stuck. + expr: |- + ( + ( + kube_daemonset_status_current_number_scheduled{namespace=~"{{ $targetNamespace }}"} + != + kube_daemonset_status_desired_number_scheduled{namespace=~"{{ $targetNamespace }}"} + ) or ( + kube_daemonset_status_number_misscheduled{namespace=~"{{ $targetNamespace }}"} + != + 0 + ) or ( + kube_daemonset_status_updated_number_scheduled{namespace=~"{{ $targetNamespace }}"} + != + kube_daemonset_status_desired_number_scheduled{namespace=~"{{ $targetNamespace }}"} + ) or ( + kube_daemonset_status_number_available{namespace=~"{{ $targetNamespace }}"} + != + kube_daemonset_status_desired_number_scheduled{namespace=~"{{ $targetNamespace }}"} + ) + ) and ( + changes(kube_daemonset_status_updated_number_scheduled{namespace=~"{{ $targetNamespace }}"}[5m]) + == + 0 + ) + for: {{ dig "KubeDaemonSetRolloutStuck" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDaemonSetRolloutStuck" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeContainerWaiting | default false) }} + - alert: KubeContainerWaiting + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: pod/{{`{{`}} $labels.pod {{`}}`}} in namespace {{`{{`}} $labels.namespace {{`}}`}} on container {{`{{`}} $labels.container{{`}}`}} has been in waiting state for longer than 1 hour. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubecontainerwaiting + summary: Pod container waiting longer than 1 hour + expr: sum by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, pod, container, cluster) (kube_pod_container_status_waiting_reason{namespace=~"{{ $targetNamespace }}"}) > 0 + for: {{ dig "KubeContainerWaiting" "for" "1h" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeContainerWaiting" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDaemonSetNotScheduled | default false) }} + - alert: KubeDaemonSetNotScheduled + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: '{{`{{`}} $value {{`}}`}} Pods of DaemonSet {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.daemonset {{`}}`}} are not scheduled.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedaemonsetnotscheduled + summary: DaemonSet pods are not scheduled. + expr: |- + kube_daemonset_status_desired_number_scheduled{namespace=~"{{ $targetNamespace }}"} + - + kube_daemonset_status_current_number_scheduled{namespace=~"{{ $targetNamespace }}"} > 0 + for: {{ dig "KubeDaemonSetNotScheduled" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDaemonSetNotScheduled" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeDaemonSetMisScheduled | default false) }} + - alert: KubeDaemonSetMisScheduled + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: '{{`{{`}} $value {{`}}`}} Pods of DaemonSet {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.daemonset {{`}}`}} are running where they are not supposed to run.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubedaemonsetmisscheduled + summary: DaemonSet pods are misscheduled. + expr: kube_daemonset_status_number_misscheduled{namespace=~"{{ $targetNamespace }}"} > 0 + for: {{ dig "KubeDaemonSetMisScheduled" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeDaemonSetMisScheduled" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeJobNotCompleted | default false) }} + - alert: KubeJobNotCompleted + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Job {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.job_name {{`}}`}} is taking more than {{`{{`}} "43200" | humanizeDuration {{`}}`}} to complete. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubejobnotcompleted + summary: Job did not complete in time + expr: |- + time() - max by ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}namespace, job_name, cluster) (kube_job_status_start_time{namespace=~"{{ $targetNamespace }}"} + and + kube_job_status_active{namespace=~"{{ $targetNamespace }}"} > 0) > 43200 + labels: + severity: {{ dig "KubeJobNotCompleted" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeJobFailed | default false) }} + - alert: KubeJobFailed + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: Job {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.job_name {{`}}`}} failed to complete. Removing failed job after investigation should clear this alert. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubejobfailed + summary: Job failed to complete. + expr: kube_job_failed{namespace=~"{{ $targetNamespace }}"} > 0 + for: {{ dig "KubeJobFailed" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeJobFailed" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeHpaReplicasMismatch | default false) }} + - alert: KubeHpaReplicasMismatch + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: HPA {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.horizontalpodautoscaler {{`}}`}} has not matched the desired number of replicas for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubehpareplicasmismatch + summary: HPA has not matched desired number of replicas. + expr: |- + (kube_horizontalpodautoscaler_status_desired_replicas{namespace=~"{{ $targetNamespace }}"} + != + kube_horizontalpodautoscaler_status_current_replicas{namespace=~"{{ $targetNamespace }}"}) + and + (kube_horizontalpodautoscaler_status_current_replicas{namespace=~"{{ $targetNamespace }}"} + > + kube_horizontalpodautoscaler_spec_min_replicas{namespace=~"{{ $targetNamespace }}"}) + and + (kube_horizontalpodautoscaler_status_current_replicas{namespace=~"{{ $targetNamespace }}"} + < + kube_horizontalpodautoscaler_spec_max_replicas{namespace=~"{{ $targetNamespace }}"}) + and + changes(kube_horizontalpodautoscaler_status_current_replicas{namespace=~"{{ $targetNamespace }}"}[15m]) == 0 + for: {{ dig "KubeHpaReplicasMismatch" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeHpaReplicasMismatch" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubeHpaMaxedOut | default false) }} + - alert: KubeHpaMaxedOut + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesApps | indent 8 }} +{{- end }} + description: HPA {{`{{`}} $labels.namespace {{`}}`}}/{{`{{`}} $labels.horizontalpodautoscaler {{`}}`}} has been running at max replicas for longer than 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubehpamaxedout + summary: HPA is running at max replicas + expr: |- + kube_horizontalpodautoscaler_status_current_replicas{namespace=~"{{ $targetNamespace }}"} + == + kube_horizontalpodautoscaler_spec_max_replicas{namespace=~"{{ $targetNamespace }}"} + for: {{ dig "KubeHpaMaxedOut" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubeHpaMaxedOut" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesApps }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-storage.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-storage.yaml new file mode 100644 index 00000000..7cd7a87f --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/kubernetes-storage.yaml @@ -0,0 +1,206 @@ +{{- /* +Generated from 'kubernetes-storage' group from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/kubernetesControlPlane-prometheusRule.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.kubernetesStorage }} +{{- $targetNamespace := .Values.defaultRules.appNamespacesTarget }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" .) "kubernetes-storage" | trunc 63 | trimSuffix "-" }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.defaultRules.labels }} +{{ toYaml .Values.defaultRules.labels | indent 4 }} +{{- end }} +{{- if .Values.defaultRules.annotations }} + annotations: +{{ toYaml .Values.defaultRules.annotations | indent 4 }} +{{- end }} +spec: + groups: + - name: kubernetes-storage + rules: +{{- if not (.Values.defaultRules.disabled.KubePersistentVolumeFillingUp | default false) }} + - alert: KubePersistentVolumeFillingUp + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesStorage }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesStorage | indent 8 }} +{{- end }} + description: The PersistentVolume claimed by {{`{{`}} $labels.persistentvolumeclaim {{`}}`}} in Namespace {{`{{`}} $labels.namespace {{`}}`}} is only {{`{{`}} $value | humanizePercentage {{`}}`}} free. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepersistentvolumefillingup + summary: PersistentVolume is filling up. + expr: |- + kubelet_volume_stats_available_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + / + kubelet_volume_stats_capacity_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + < 0.03 + and + kubelet_volume_stats_used_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} > 0 + unless on(namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1 + unless on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1 + for: {{ dig "KubePersistentVolumeFillingUp" "for" "1m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePersistentVolumeFillingUp" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubePersistentVolumeFillingUp | default false) }} + - alert: KubePersistentVolumeFillingUp + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} + description: Based on recent sampling, the PersistentVolume claimed by {{`{{`}} $labels.persistentvolumeclaim {{`}}`}} in Namespace {{`{{`}} $labels.namespace {{`}}`}} is expected to fill up within four days. Currently {{`{{`}} $value | humanizePercentage {{`}}`}} is available. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepersistentvolumefillingup + summary: PersistentVolume is filling up. + expr: |- + ( + kubelet_volume_stats_available_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + / + kubelet_volume_stats_capacity_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + ) < 0.15 + and + kubelet_volume_stats_used_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} > 0 + and + predict_linear(kubelet_volume_stats_available_bytes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"}[6h], 4 * 24 * 3600) < 0 + unless on(namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1 + unless on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1 + for: {{ dig "KubePersistentVolumeFillingUp" "for" "1h" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePersistentVolumeFillingUp" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubePersistentVolumeInodesFillingUp | default false) }} + - alert: KubePersistentVolumeInodesFillingUp + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} + description: The PersistentVolume claimed by {{`{{`}} $labels.persistentvolumeclaim {{`}}`}} in Namespace {{`{{`}} $labels.namespace {{`}}`}} only has {{`{{`}} $value | humanizePercentage {{`}}`}} free inodes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepersistentvolumeinodesfillingup + summary: PersistentVolumeInodes are filling up. + expr: |- + ( + kubelet_volume_stats_inodes_free{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + / + kubelet_volume_stats_inodes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + ) < 0.03 + and + kubelet_volume_stats_inodes_used{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} > 0 + unless on(namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1 + unless on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1 + for: {{ dig "KubePersistentVolumeInodesFillingUp" "for" "1m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePersistentVolumeInodesFillingUp" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubePersistentVolumeInodesFillingUp | default false) }} + - alert: KubePersistentVolumeInodesFillingUp + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} + description: Based on recent sampling, the PersistentVolume claimed by {{`{{`}} $labels.persistentvolumeclaim {{`}}`}} in Namespace {{`{{`}} $labels.namespace {{`}}`}} is expected to run out of inodes within four days. Currently {{`{{`}} $value | humanizePercentage {{`}}`}} of its inodes are free. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepersistentvolumeinodesfillingup + summary: PersistentVolumeInodes are filling up. + expr: |- + ( + kubelet_volume_stats_inodes_free{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + / + kubelet_volume_stats_inodes{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} + ) < 0.15 + and + kubelet_volume_stats_inodes_used{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"} > 0 + and + predict_linear(kubelet_volume_stats_inodes_free{namespace=~"{{ $targetNamespace }}", metrics_path="/metrics"}[6h], 4 * 24 * 3600) < 0 + unless on(namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_access_mode{ access_mode="ReadOnlyMany"} == 1 + unless on ({{ range $.Values.defaultRules.additionalAggregationLabels }}{{ . }},{{ end }}cluster, namespace, persistentvolumeclaim) + kube_persistentvolumeclaim_labels{label_excluded_from_alerts="true"} == 1 + for: {{ dig "KubePersistentVolumeInodesFillingUp" "for" "1h" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePersistentVolumeInodesFillingUp" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.KubePersistentVolumeErrors | default false) }} + - alert: KubePersistentVolumeErrors + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesStorage }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.kubernetesStorage | indent 8 }} +{{- end }} + description: The persistent volume {{`{{`}} $labels.persistentvolume {{`}}`}} has status {{`{{`}} $labels.phase {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/kubernetes/kubepersistentvolumeerrors + summary: PersistentVolume is having issues with provisioning. + expr: kube_persistentvolume_status_phase{phase=~"Failed|Pending"} > 0 + for: {{ dig "KubePersistentVolumeErrors" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "KubePersistentVolumeErrors" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.kubernetesStorage }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/prometheus.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/prometheus.yaml new file mode 100644 index 00000000..ddd46ae8 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/rules-1.14/prometheus.yaml @@ -0,0 +1,707 @@ +{{- /* +Generated from 'prometheus' group from https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/prometheus-prometheusRule.yaml +Do not change in-place! In order to change this file first read following link: +https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack/hack +*/ -}} +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if and (semverCompare ">=1.14.0-0" $kubeTargetVersion) (semverCompare "<9.9.9-9" $kubeTargetVersion) .Values.defaultRules.create .Values.defaultRules.rules.prometheus }} +{{- $prometheusJob := printf "%s-%s" (include "project-prometheus-stack.fullname" .) "prometheus" }} +{{- $namespace := printf "%s" (include "project-prometheus-stack.namespace" .) }} +apiVersion: monitoring.coreos.com/v1 +kind: PrometheusRule +metadata: + name: {{ printf "%s-%s" (include "project-prometheus-stack.fullname" .) "prometheus" | trunc 63 | trimSuffix "-" }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.defaultRules.labels }} +{{ toYaml .Values.defaultRules.labels | indent 4 }} +{{- end }} +{{- if .Values.defaultRules.annotations }} + annotations: +{{ toYaml .Values.defaultRules.annotations | indent 4 }} +{{- end }} +spec: + groups: + - name: prometheus + rules: +{{- if not (.Values.defaultRules.disabled.PrometheusBadConfig | default false) }} + - alert: PrometheusBadConfig + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has failed to reload its configuration. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusbadconfig + summary: Failed Prometheus configuration reload. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + max_over_time(prometheus_config_last_reload_successful{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) == 0 + for: {{ dig "PrometheusBadConfig" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusBadConfig" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusSDRefreshFailure | default false) }} + - alert: PrometheusSDRefreshFailure + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has failed to refresh SD with mechanism {{`{{`}}$labels.mechanism{{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheussdrefreshfailure + summary: Failed Prometheus SD refresh. + expr: increase(prometheus_sd_refresh_failures_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[10m]) > 0 + for: {{ dig "PrometheusSDRefreshFailure" "for" "20m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusSDRefreshFailure" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusNotificationQueueRunningFull | default false) }} + - alert: PrometheusNotificationQueueRunningFull + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Alert notification queue of Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} is running full. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusnotificationqueuerunningfull + summary: Prometheus alert notification queue predicted to run full in less than 30m. + expr: |- + # Without min_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + ( + predict_linear(prometheus_notifications_queue_length{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m], 60 * 30) + > + min_over_time(prometheus_notifications_queue_capacity{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + ) + for: {{ dig "PrometheusNotificationQueueRunningFull" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusNotificationQueueRunningFull" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusErrorSendingAlertsToSomeAlertmanagers | default false) }} + - alert: PrometheusErrorSendingAlertsToSomeAlertmanagers + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: '{{`{{`}} printf "%.1f" $value {{`}}`}}% errors while sending alerts from Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} to Alertmanager {{`{{`}}$labels.alertmanager{{`}}`}}.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheuserrorsendingalertstosomealertmanagers + summary: Prometheus has encountered more than 1% errors sending alerts to a specific Alertmanager. + expr: |- + ( + rate(prometheus_notifications_errors_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + / + rate(prometheus_notifications_sent_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + ) + * 100 + > 1 + for: {{ dig "PrometheusErrorSendingAlertsToSomeAlertmanagers" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusErrorSendingAlertsToSomeAlertmanagers" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusNotConnectedToAlertmanagers | default false) }} + - alert: PrometheusNotConnectedToAlertmanagers + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} is not connected to any Alertmanagers. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusnotconnectedtoalertmanagers + summary: Prometheus is not connected to any Alertmanagers. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + max_over_time(prometheus_notifications_alertmanagers_discovered{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) < 1 + for: {{ dig "PrometheusNotConnectedToAlertmanagers" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusNotConnectedToAlertmanagers" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusTSDBReloadsFailing | default false) }} + - alert: PrometheusTSDBReloadsFailing + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has detected {{`{{`}}$value | humanize{{`}}`}} reload failures over the last 3h. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheustsdbreloadsfailing + summary: Prometheus has issues reloading blocks from disk. + expr: increase(prometheus_tsdb_reloads_failures_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[3h]) > 0 + for: {{ dig "PrometheusTSDBReloadsFailing" "for" "4h" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusTSDBReloadsFailing" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusTSDBCompactionsFailing | default false) }} + - alert: PrometheusTSDBCompactionsFailing + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has detected {{`{{`}}$value | humanize{{`}}`}} compaction failures over the last 3h. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheustsdbcompactionsfailing + summary: Prometheus has issues compacting blocks. + expr: increase(prometheus_tsdb_compactions_failed_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[3h]) > 0 + for: {{ dig "PrometheusTSDBCompactionsFailing" "for" "4h" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusTSDBCompactionsFailing" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusNotIngestingSamples | default false) }} + - alert: PrometheusNotIngestingSamples + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} is not ingesting samples. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusnotingestingsamples + summary: Prometheus is not ingesting samples. + expr: |- + ( + sum without(type) (rate(prometheus_tsdb_head_samples_appended_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m])) <= 0 + and + ( + sum without(scrape_job) (prometheus_target_metadata_cache_entries{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}) > 0 + or + sum without(rule_group) (prometheus_rule_group_rules{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}) > 0 + ) + ) + for: {{ dig "PrometheusNotIngestingSamples" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusNotIngestingSamples" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusDuplicateTimestamps | default false) }} + - alert: PrometheusDuplicateTimestamps + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} is dropping {{`{{`}} printf "%.4g" $value {{`}}`}} samples/s with different values but duplicated timestamp. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusduplicatetimestamps + summary: Prometheus is dropping samples with duplicate timestamps. + expr: rate(prometheus_target_scrapes_sample_duplicate_timestamp_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusDuplicateTimestamps" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusDuplicateTimestamps" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusOutOfOrderTimestamps | default false) }} + - alert: PrometheusOutOfOrderTimestamps + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} is dropping {{`{{`}} printf "%.4g" $value {{`}}`}} samples/s with timestamps arriving out of order. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusoutofordertimestamps + summary: Prometheus drops samples with out-of-order timestamps. + expr: rate(prometheus_target_scrapes_sample_out_of_order_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusOutOfOrderTimestamps" "for" "10m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusOutOfOrderTimestamps" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusRemoteStorageFailures | default false) }} + - alert: PrometheusRemoteStorageFailures + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} failed to send {{`{{`}} printf "%.1f" $value {{`}}`}}% of the samples to {{`{{`}} $labels.remote_name{{`}}`}}:{{`{{`}} $labels.url {{`}}`}} + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusremotestoragefailures + summary: Prometheus fails to send samples to remote storage. + expr: |- + ( + (rate(prometheus_remote_storage_failed_samples_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m])) + / + ( + (rate(prometheus_remote_storage_failed_samples_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) or rate(prometheus_remote_storage_samples_failed_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m])) + + + (rate(prometheus_remote_storage_succeeded_samples_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) or rate(prometheus_remote_storage_samples_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m])) + ) + ) + * 100 + > 1 + for: {{ dig "PrometheusRemoteStorageFailures" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusRemoteStorageFailures" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusRemoteWriteBehind | default false) }} + - alert: PrometheusRemoteWriteBehind + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} remote write is {{`{{`}} printf "%.1f" $value {{`}}`}}s behind for {{`{{`}} $labels.remote_name{{`}}`}}:{{`{{`}} $labels.url {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusremotewritebehind + summary: Prometheus remote write is behind. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + ( + max_over_time(prometheus_remote_storage_highest_timestamp_in_seconds{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + - ignoring(remote_name, url) group_right + max_over_time(prometheus_remote_storage_queue_highest_sent_timestamp_seconds{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + ) + > 120 + for: {{ dig "PrometheusRemoteWriteBehind" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusRemoteWriteBehind" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusRemoteWriteDesiredShards | default false) }} + - alert: PrometheusRemoteWriteDesiredShards + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} remote write desired shards calculation wants to run {{`{{`}} $value {{`}}`}} shards for queue {{`{{`}} $labels.remote_name{{`}}`}}:{{`{{`}} $labels.url {{`}}`}}, which is more than the max of {{`{{`}} printf `prometheus_remote_storage_shards_max{instance="%s",job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}` $labels.instance | query | first | value {{`}}`}}. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusremotewritedesiredshards + summary: Prometheus remote write desired shards calculation wants to run more than configured max shards. + expr: |- + # Without max_over_time, failed scrapes could create false negatives, see + # https://www.robustperception.io/alerting-on-gauges-in-prometheus-2-0 for details. + ( + max_over_time(prometheus_remote_storage_shards_desired{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + > + max_over_time(prometheus_remote_storage_shards_max{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) + ) + for: {{ dig "PrometheusRemoteWriteDesiredShards" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusRemoteWriteDesiredShards" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusRuleFailures | default false) }} + - alert: PrometheusRuleFailures + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has failed to evaluate {{`{{`}} printf "%.0f" $value {{`}}`}} rules in the last 5m. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusrulefailures + summary: Prometheus is failing rule evaluations. + expr: increase(prometheus_rule_evaluation_failures_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusRuleFailures" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusRuleFailures" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusMissingRuleEvaluations | default false) }} + - alert: PrometheusMissingRuleEvaluations + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has missed {{`{{`}} printf "%.0f" $value {{`}}`}} rule group evaluations in the last 5m. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusmissingruleevaluations + summary: Prometheus is missing rule evaluations due to slow rule group evaluation. + expr: increase(prometheus_rule_group_iterations_missed_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusMissingRuleEvaluations" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusMissingRuleEvaluations" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusTargetLimitHit | default false) }} + - alert: PrometheusTargetLimitHit + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has dropped {{`{{`}} printf "%.0f" $value {{`}}`}} targets because the number of targets exceeded the configured target_limit. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheustargetlimithit + summary: Prometheus has dropped targets because some scrape configs have exceeded the targets limit. + expr: increase(prometheus_target_scrape_pool_exceeded_target_limit_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusTargetLimitHit" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusTargetLimitHit" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusLabelLimitHit | default false) }} + - alert: PrometheusLabelLimitHit + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has dropped {{`{{`}} printf "%.0f" $value {{`}}`}} targets because some samples exceeded the configured label_limit, label_name_length_limit or label_value_length_limit. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheuslabellimithit + summary: Prometheus has dropped targets because some scrape configs have exceeded the labels limit. + expr: increase(prometheus_target_scrape_pool_exceeded_label_limits_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusLabelLimitHit" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusLabelLimitHit" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusScrapeBodySizeLimitHit | default false) }} + - alert: PrometheusScrapeBodySizeLimitHit + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has failed {{`{{`}} printf "%.0f" $value {{`}}`}} scrapes in the last 5m because some targets exceeded the configured body_size_limit. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusscrapebodysizelimithit + summary: Prometheus has dropped some targets that exceeded body size limit. + expr: increase(prometheus_target_scrapes_exceeded_body_size_limit_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusScrapeBodySizeLimitHit" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusScrapeBodySizeLimitHit" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusScrapeSampleLimitHit | default false) }} + - alert: PrometheusScrapeSampleLimitHit + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} has failed {{`{{`}} printf "%.0f" $value {{`}}`}} scrapes in the last 5m because some targets exceeded the configured sample_limit. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheusscrapesamplelimithit + summary: Prometheus has failed scrapes that have exceeded the configured sample limit. + expr: increase(prometheus_target_scrapes_exceeded_sample_limit_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0 + for: {{ dig "PrometheusScrapeSampleLimitHit" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusScrapeSampleLimitHit" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusTargetSyncFailure | default false) }} + - alert: PrometheusTargetSyncFailure + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: '{{`{{`}} printf "%.0f" $value {{`}}`}} targets in Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} have failed to sync because invalid configuration was supplied.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheustargetsyncfailure + summary: Prometheus has failed to sync targets. + expr: increase(prometheus_target_sync_failed_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[30m]) > 0 + for: {{ dig "PrometheusTargetSyncFailure" "for" "5m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusTargetSyncFailure" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusHighQueryLoad | default false) }} + - alert: PrometheusHighQueryLoad + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} query API has less than 20% available capacity in its query engine for the last 15 minutes. + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheushighqueryload + summary: Prometheus is reaching its maximum capacity serving concurrent requests. + expr: avg_over_time(prometheus_engine_queries{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) / max_over_time(prometheus_engine_queries_concurrent_max{job="{{ $prometheusJob }}",namespace="{{ $namespace }}"}[5m]) > 0.8 + for: {{ dig "PrometheusHighQueryLoad" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusHighQueryLoad" "severity" "warning" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- if not (.Values.defaultRules.disabled.PrometheusErrorSendingAlertsToAnyAlertmanager | default false) }} + - alert: PrometheusErrorSendingAlertsToAnyAlertmanager + annotations: +{{- if .Values.defaultRules.additionalRuleAnnotations }} +{{ toYaml .Values.defaultRules.additionalRuleAnnotations | indent 8 }} +{{- end }} +{{- if .Values.defaultRules.additionalRuleGroupAnnotations.prometheus }} +{{ toYaml .Values.defaultRules.additionalRuleGroupAnnotations.prometheus | indent 8 }} +{{- end }} + description: '{{`{{`}} printf "%.1f" $value {{`}}`}}% minimum errors while sending alerts from Prometheus {{`{{`}}$labels.namespace{{`}}`}}/{{`{{`}}$labels.pod{{`}}`}} to any Alertmanager.' + runbook_url: {{ .Values.defaultRules.runbookUrl }}/prometheus/prometheuserrorsendingalertstoanyalertmanager + summary: Prometheus encounters more than 3% errors sending alerts to any Alertmanager. + expr: |- + min without (alertmanager) ( + rate(prometheus_notifications_errors_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}",alertmanager!~``}[5m]) + / + rate(prometheus_notifications_sent_total{job="{{ $prometheusJob }}",namespace="{{ $namespace }}",alertmanager!~``}[5m]) + ) + * 100 + > 3 + for: {{ dig "PrometheusErrorSendingAlertsToAnyAlertmanager" "for" "15m" .Values.customRules }} + {{- with .Values.defaultRules.keepFiringFor }} + keep_firing_for: "{{ . }}" + {{- end }} + labels: + severity: {{ dig "PrometheusErrorSendingAlertsToAnyAlertmanager" "severity" "critical" .Values.customRules }} + {{- if or .Values.defaultRules.additionalRuleLabels .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- with .Values.defaultRules.additionalRuleLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.defaultRules.additionalRuleGroupLabels.prometheus }} + {{- toYaml . | nindent 8 }} + {{- end }} + {{- end }} +{{- end }} +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/secret.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/secret.yaml new file mode 100644 index 00000000..1cbb2b2c --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/secret.yaml @@ -0,0 +1,15 @@ +{{- if and .Values.prometheus.enabled .Values.prometheus.prometheusSpec.thanos .Values.prometheus.prometheusSpec.thanos.objectStorageConfig}} +{{- if and .Values.prometheus.prometheusSpec.thanos.objectStorageConfig.secret (not .Values.prometheus.prometheusSpec.thanos.objectStorageConfig.existingSecret) }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus + app.kubernetes.io/component: prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +data: + object-storage-configs.yaml: {{ toYaml .Values.prometheus.prometheusSpec.thanos.objectStorageConfig.secret | b64enc | quote }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/service.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/service.yaml new file mode 100644 index 00000000..058d6645 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/service.yaml @@ -0,0 +1,72 @@ +{{- $kubeTargetVersion := default .Capabilities.KubeVersion.GitVersion .Values.kubeTargetVersionOverride }} +{{- if .Values.prometheus.enabled }} +apiVersion: v1 +kind: Service +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus1 + self-monitor: {{ .Values.prometheus.serviceMonitor.selfMonitor | quote }} +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.prometheus.service.labels }} +{{ toYaml .Values.prometheus.service.labels | indent 4 }} +{{- end }} +{{- if .Values.prometheus.service.annotations }} + annotations: +{{ toYaml .Values.prometheus.service.annotations | indent 4 }} +{{- end }} +spec: +{{- if .Values.prometheus.service.clusterIP }} + clusterIP: {{ .Values.prometheus.service.clusterIP }} +{{- end }} +{{- if .Values.prometheus.service.externalIPs }} + externalIPs: +{{ toYaml .Values.prometheus.service.externalIPs | indent 4 }} +{{- end }} +{{- if .Values.prometheus.service.loadBalancerIP }} + loadBalancerIP: {{ .Values.prometheus.service.loadBalancerIP }} +{{- end }} +{{- if .Values.prometheus.service.loadBalancerSourceRanges }} + loadBalancerSourceRanges: + {{- range $cidr := .Values.prometheus.service.loadBalancerSourceRanges }} + - {{ $cidr }} + {{- end }} +{{- end }} +{{- if ne .Values.prometheus.service.type "ClusterIP" }} + externalTrafficPolicy: {{ .Values.prometheus.service.externalTrafficPolicy }} +{{- end }} + ports: + - name: {{ .Values.prometheus.prometheusSpec.portName }} + {{- if eq .Values.prometheus.service.type "NodePort" }} + nodePort: {{ .Values.prometheus.service.nodePort }} + {{- end }} + port: {{ .Values.prometheus.service.port }} + targetPort: {{ .Values.prometheus.service.targetPort }} + - name: reloader-web + {{- if semverCompare "> 1.20.0-0" $kubeTargetVersion }} + appProtocol: http + {{- end }} + port: {{ .Values.prometheus.service.reloaderWebPort }} + targetPort: reloader-web +{{- if .Values.prometheus.service.additionalPorts }} +{{ toYaml .Values.prometheus.service.additionalPorts | indent 2 }} +{{- end }} + publishNotReadyAddresses: {{ .Values.prometheus.service.publishNotReadyAddresses }} + selector: + {{- if .Values.prometheus.agentMode }} + app.kubernetes.io/name: prometheus-agent + {{- else }} + app.kubernetes.io/name: prometheus + {{- end }} + operator.prometheus.io/name: {{ template "project-prometheus-stack.prometheus.crname" . }} +{{- if .Values.prometheus.service.sessionAffinity }} + sessionAffinity: {{ .Values.prometheus.service.sessionAffinity }} +{{- end }} +{{- if eq .Values.prometheus.service.sessionAffinity "ClientIP" }} + sessionAffinityConfig: + clientIP: + timeoutSeconds: {{ .Values.prometheus.service.sessionAffinityConfig.clientIP.timeoutSeconds }} +{{- end }} + type: "{{ .Values.prometheus.service.type }}" +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/serviceaccount.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/serviceaccount.yaml new file mode 100644 index 00000000..b4d44164 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/serviceaccount.yaml @@ -0,0 +1,20 @@ +{{- if and .Values.prometheus.enabled .Values.prometheus.serviceAccount.create }} +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ template "project-prometheus-stack.prometheus.serviceAccountName" . }} + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus + app.kubernetes.io/name: {{ template "project-prometheus-stack.name" . }}-prometheus + app.kubernetes.io/component: prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- if .Values.prometheus.serviceAccount.annotations }} + annotations: +{{ toYaml .Values.prometheus.serviceAccount.annotations | indent 4 }} +{{- end }} +{{- if .Values.global.imagePullSecrets }} +imagePullSecrets: +{{ include "project-prometheus-stack.imagePullSecrets" . | trim | indent 2 }} +{{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/prometheus/servicemonitor.yaml b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/servicemonitor.yaml new file mode 100644 index 00000000..6a36678b --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/prometheus/servicemonitor.yaml @@ -0,0 +1,97 @@ +{{- if and .Values.prometheus.enabled .Values.prometheus.serviceMonitor.selfMonitor }} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ template "project-prometheus-stack.fullname" . }}-prometheus + namespace: {{ template "project-prometheus-stack.namespace" . }} + labels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus +{{ include "project-prometheus-stack.labels" . | indent 4 }} +{{- with .Values.prometheus.serviceMonitor.additionalLabels }} +{{- toYaml . | nindent 4 }} +{{- end }} +spec: + {{- include "servicemonitor.scrapeLimits" .Values.prometheus.serviceMonitor | nindent 2 }} + selector: + matchLabels: + app: {{ template "project-prometheus-stack.name" . }}-prometheus + release: {{ $.Release.Name | quote }} + self-monitor: "true" + namespaceSelector: + matchNames: + - {{ printf "%s" (include "project-prometheus-stack.namespace" .) | quote }} + endpoints: + - port: {{ .Values.prometheus.prometheusSpec.portName }} + {{- if .Values.prometheus.serviceMonitor.interval }} + interval: {{ .Values.prometheus.serviceMonitor.interval }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.scheme }} + scheme: {{ .Values.prometheus.serviceMonitor.scheme }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.tlsConfig }} + tlsConfig: {{- toYaml .Values.prometheus.serviceMonitor.tlsConfig | nindent 6 }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.bearerTokenFile }} + bearerTokenFile: {{ .Values.prometheus.serviceMonitor.bearerTokenFile }} + {{- end }} + path: "{{ trimSuffix "/" .Values.prometheus.prometheusSpec.routePrefix }}/metrics" + metricRelabelings: + {{- if .Values.prometheus.serviceMonitor.metricRelabelings }} + {{- tpl (toYaml .Values.prometheus.serviceMonitor.metricRelabelings | nindent 6) . }} + {{- end }} + {{ if .Values.global.cattle.clusterId }} + - sourceLabels: [__address__] + targetLabel: cluster_id + replacement: {{ .Values.global.cattle.clusterId }} + {{- end }} + {{ if .Values.global.cattle.clusterName}} + - sourceLabels: [__address__] + targetLabel: cluster_name + replacement: {{ .Values.global.cattle.clusterName }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.relabelings }} + relabelings: {{- toYaml .Values.prometheus.serviceMonitor.relabelings | nindent 6 }} + {{- end }} + - port: reloader-web + {{- if .Values.prometheus.serviceMonitor.interval }} + interval: {{ .Values.prometheus.serviceMonitor.interval }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.scheme }} + scheme: {{ .Values.prometheus.serviceMonitor.scheme }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.tlsConfig }} + tlsConfig: {{- toYaml .Values.prometheus.serviceMonitor.tlsConfig | nindent 6 }} + {{- end }} + path: "/metrics" + {{- if .Values.prometheus.serviceMonitor.metricRelabelings }} + metricRelabelings: {{- tpl (toYaml .Values.prometheus.serviceMonitor.metricRelabelings | nindent 6) . }} + {{- end }} + {{- if .Values.prometheus.serviceMonitor.relabelings }} + relabelings: {{- toYaml .Values.prometheus.serviceMonitor.relabelings | nindent 6 }} + {{- end }} + {{- range .Values.prometheus.serviceMonitor.additionalEndpoints }} + - port: {{ .port }} + {{- if or $.Values.prometheus.serviceMonitor.interval .interval }} + interval: {{ default $.Values.prometheus.serviceMonitor.interval .interval }} + {{- end }} + {{- if or $.Values.prometheus.serviceMonitor.proxyUrl .proxyUrl }} + proxyUrl: {{ default $.Values.prometheus.serviceMonitor.proxyUrl .proxyUrl }} + {{- end }} + {{- if or $.Values.prometheus.serviceMonitor.scheme .scheme }} + scheme: {{ default $.Values.prometheus.serviceMonitor.scheme .scheme }} + {{- end }} + {{- if or $.Values.prometheus.serviceMonitor.bearerTokenFile .bearerTokenFile }} + bearerTokenFile: {{ default $.Values.prometheus.serviceMonitor.bearerTokenFile .bearerTokenFile }} + {{- end }} + {{- if or $.Values.prometheus.serviceMonitor.tlsConfig .tlsConfig }} + tlsConfig: {{- default $.Values.prometheus.serviceMonitor.tlsConfig .tlsConfig | toYaml | nindent 6 }} + {{- end }} + path: {{ .path }} + {{- if or $.Values.prometheus.serviceMonitor.metricRelabelings .metricRelabelings }} + metricRelabelings: {{- tpl (default $.Values.prometheus.serviceMonitor.metricRelabelings .metricRelabelings | toYaml | nindent 6) . }} + {{- end }} + {{- if or $.Values.prometheus.serviceMonitor.relabelings .relabelings }} + relabelings: {{- default $.Values.prometheus.serviceMonitor.relabelings .relabelings | toYaml | nindent 6 }} + {{- end }} + {{- end }} +{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/templates/rancher-monitoring/hardened.yaml b/charts/rancher-project-monitoring/0.4.2/templates/rancher-monitoring/hardened.yaml new file mode 100644 index 00000000..d87b7535 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/rancher-monitoring/hardened.yaml @@ -0,0 +1,132 @@ +{{- $namespaces := dict "_0" .Release.Namespace -}} +{{- if and .Values.grafana.enabled .Values.grafana.defaultDashboardsEnabled (not .Values.grafana.defaultDashboards.useExistingNamespace) -}} +{{- $_ := set $namespaces "_1" .Values.grafana.defaultDashboards.namespace -}} +{{- end -}} +apiVersion: batch/v1 +kind: Job +metadata: + name: {{ .Chart.Name }}-patch-sa + namespace: {{ .Release.Namespace }} + labels: + app: {{ .Chart.Name }}-patch-sa + annotations: + "helm.sh/hook": post-install, post-upgrade + "helm.sh/hook-delete-policy": hook-succeeded, before-hook-creation +spec: + template: + metadata: + name: {{ .Chart.Name }}-patch-sa + labels: + app: {{ .Chart.Name }}-patch-sa + spec: + serviceAccountName: {{ .Chart.Name }}-patch-sa + securityContext: + runAsNonRoot: true + runAsUser: 1000 + restartPolicy: Never + nodeSelector: {{ include "linux-node-selector" . | nindent 8 }} + tolerations: {{ include "linux-node-tolerations" . | nindent 8 }} + containers: + {{- range $_, $ns := $namespaces }} + - name: patch-sa-{{ $ns }} + image: {{ template "system_default_registry" $ }}{{ $.Values.global.kubectl.repository }}:{{ $.Values.global.kubectl.tag }} + imagePullPolicy: {{ $.Values.global.kubectl.pullPolicy }} + command: ["kubectl", "patch", "serviceaccount", "default", "-p", "{\"automountServiceAccountToken\": false}"] + args: ["-n", "{{ $ns }}"] +{{- if $.Values.global.kubectl.resources }} + resources: {{ toYaml $.Values.global.kubectl.resources | nindent 10 }} +{{- end }} +{{- if $.Values.global.kubectl.containerSecurityContext }} + securityContext: {{ toYaml $.Values.global.kubectl.containerSecurityContext | nindent 10 }} +{{- end }} + {{- end }} +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: {{ .Chart.Name }}-patch-sa + labels: + app: {{ .Chart.Name }}-patch-sa +rules: +- apiGroups: + - "" + resources: + - serviceaccounts + verbs: ['get', 'patch'] +{{- if .Values.global.cattle.psp.enabled }} +- apiGroups: ['policy'] + resources: ['podsecuritypolicies'] + verbs: ['use'] + resourceNames: + - {{ .Chart.Name }}-patch-sa +{{- end }} +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: {{ .Chart.Name }}-patch-sa + labels: + app: {{ .Chart.Name }}-patch-sa +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: {{ .Chart.Name }}-patch-sa +subjects: +- kind: ServiceAccount + name: {{ .Chart.Name }}-patch-sa + namespace: {{ .Release.Namespace }} +--- +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ .Chart.Name }}-patch-sa + namespace: {{ .Release.Namespace }} + labels: + app: {{ .Chart.Name }}-patch-sa +--- +{{- if .Values.global.cattle.psp.enabled }} +apiVersion: policy/v1beta1 +kind: PodSecurityPolicy +metadata: + name: {{ .Chart.Name }}-patch-sa + namespace: {{ .Release.Namespace }} + labels: + app: {{ .Chart.Name }}-patch-sa +spec: + privileged: false + hostNetwork: false + hostIPC: false + hostPID: false + runAsUser: + rule: 'MustRunAsNonRoot' + seLinux: + rule: 'RunAsAny' + supplementalGroups: + rule: 'MustRunAs' + ranges: + - min: 1 + max: 65535 + fsGroup: + rule: 'MustRunAs' + ranges: + - min: 1 + max: 65535 + readOnlyRootFilesystem: false + volumes: + - 'secret' +{{- end }} +{{- range $_, $ns := $namespaces }} +--- +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: project-monitoring-policy + namespace: {{ $ns }} +spec: + podSelector: {} + ingress: {{- default (list dict | toYaml) (include "project-prometheus-stack.hardened.networkPolicy.ingress" (list $ $ns)) | nindent 4 }} + egress: {{- $.Values.global.networkPolicy.egress | toYaml | nindent 4 }} + policyTypes: + - Ingress + - Egress +{{- end }} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/validate-install-crd.yaml b/charts/rancher-project-monitoring/0.4.2/templates/validate-install-crd.yaml new file mode 100644 index 00000000..6fcb8b3a --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/validate-install-crd.yaml @@ -0,0 +1,23 @@ +#{{- if gt (len (lookup "rbac.authorization.k8s.io/v1" "ClusterRole" "" "")) 0 -}} +# {{- $found := dict -}} +# {{- set $found "monitoring.coreos.com/v1alpha1/AlertmanagerConfig" false -}} +# {{- set $found "monitoring.coreos.com/v1/Alertmanager" false -}} +# {{- set $found "monitoring.coreos.com/v1/PodMonitor" false -}} +# {{- set $found "monitoring.coreos.com/v1/Probe" false -}} +# {{- set $found "monitoring.coreos.com/v1alpha1/PrometheusAgent" false -}} +# {{- set $found "monitoring.coreos.com/v1/Prometheus" false -}} +# {{- set $found "monitoring.coreos.com/v1/PrometheusRule" false -}} +# {{- set $found "monitoring.coreos.com/v1alpha1/ScrapeConfig" false -}} +# {{- set $found "monitoring.coreos.com/v1/ServiceMonitor" false -}} +# {{- set $found "monitoring.coreos.com/v1/ThanosRuler" false -}} +# {{- range .Capabilities.APIVersions -}} +# {{- if hasKey $found (toString .) -}} +# {{- set $found (toString .) true -}} +# {{- end -}} +# {{- end -}} +# {{- range $_, $exists := $found -}} +# {{- if (eq $exists false) -}} +# {{- required "Required CRDs are missing. Please install the corresponding CRD chart before installing this chart." "" -}} +# {{- end -}} +# {{- end -}} +#{{- end -}} \ No newline at end of file diff --git a/charts/rancher-project-monitoring/0.4.2/templates/validate-psp-install.yaml b/charts/rancher-project-monitoring/0.4.2/templates/validate-psp-install.yaml new file mode 100644 index 00000000..a30c59d3 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/templates/validate-psp-install.yaml @@ -0,0 +1,7 @@ +#{{- if gt (len (lookup "rbac.authorization.k8s.io/v1" "ClusterRole" "" "")) 0 -}} +#{{- if .Values.global.cattle.psp.enabled }} +#{{- if not (.Capabilities.APIVersions.Has "policy/v1beta1/PodSecurityPolicy") }} +#{{- fail "The target cluster does not have the PodSecurityPolicy API resource. Please disable PSPs in this chart before proceeding." -}} +#{{- end }} +#{{- end }} +#{{- end }} diff --git a/charts/rancher-project-monitoring/0.4.2/values.yaml b/charts/rancher-project-monitoring/0.4.2/values.yaml new file mode 100644 index 00000000..19498a47 --- /dev/null +++ b/charts/rancher-project-monitoring/0.4.2/values.yaml @@ -0,0 +1,2487 @@ +# Default values for project-prometheus-stack. +# This is a YAML-formatted file. +# Declare variables to be passed into your templates. + +# Rancher Project Monitoring Configuration +## Rancher Monitoring +## + +rancherMonitoring: + enabled: true + + ## A namespaceSelector to identify the namespace to find the Rancher deployment + ## + namespaceSelector: + matchNames: + - cattle-system + + ## A selector to identify the Rancher deployment + ## If not set, the chart will try to search for the Rancher deployment in the cattle-system namespace and infer the selector values from it + ## If the Rancher deployment does not exist, no resources will be deployed. + ## + selector: {} + +## Provide a name in place of project-prometheus-stack for `app:` labels +## NOTE: If you change this value, you must update the prometheus-adapter.prometheus.url +## +nameOverride: "rancher-project-monitoring" + +## Override the deployment namespace +## NOTE: If you change this value, you must update the prometheus-adapter.prometheus.url +## +namespaceOverride: "" + +## Provide a k8s version to auto dashboard import script example: kubeTargetVersionOverride: 1.16.6 +## +kubeTargetVersionOverride: "" + +## Allow kubeVersion to be overridden while creating the ingress +## +kubeVersionOverride: "" + +## Provide a name to substitute for the full names of resources +## +fullnameOverride: "" + +## Labels to apply to all resources +## +commonLabels: {} +# scmhash: abc123 +# myLabel: aakkmd + +## custom Rules to override "for" and "severity" in defaultRules +## +customRules: {} + +## Create default rules for monitoring the cluster +## +defaultRules: + create: true + rules: + general: true + prometheus: true + alertmanager: true + kubernetesApps: true + kubernetesStorage: true + + ## Reduce app namespace alert scope + appNamespacesTarget: ".*" + + ## Labels for default rules + labels: {} + ## Annotations for default rules + annotations: {} + + ## Additional labels for PrometheusRule alerts + additionalRuleLabels: {} + + ## Additional annotations for PrometheusRule alerts + additionalRuleAnnotations: {} + + ## Additional labels for specific PrometheusRule alert groups + additionalRuleGroupLabels: + alertmanager: {} + etcd: {} + configReloaders: {} + general: {} + k8sContainerCpuUsageSecondsTotal: {} + k8sContainerMemoryCache: {} + k8sContainerMemoryRss: {} + k8sContainerMemorySwap: {} + k8sContainerResource: {} + k8sPodOwner: {} + kubeApiserverAvailability: {} + kubeApiserverBurnrate: {} + kubeApiserverHistogram: {} + kubeApiserverSlos: {} + kubeControllerManager: {} + kubelet: {} + kubeProxy: {} + kubePrometheusGeneral: {} + kubePrometheusNodeRecording: {} + kubernetesApps: {} + kubernetesResources: {} + kubernetesStorage: {} + kubernetesSystem: {} + kubeSchedulerAlerting: {} + kubeSchedulerRecording: {} + kubeStateMetrics: {} + network: {} + node: {} + nodeExporterAlerting: {} + nodeExporterRecording: {} + prometheus: {} + prometheusOperator: {} + + ## Additional annotations for specific PrometheusRule alerts groups + additionalRuleGroupAnnotations: + alertmanager: {} + etcd: {} + configReloaders: {} + general: {} + k8sContainerCpuUsageSecondsTotal: {} + k8sContainerMemoryCache: {} + k8sContainerMemoryRss: {} + k8sContainerMemorySwap: {} + k8sContainerResource: {} + k8sPodOwner: {} + kubeApiserverAvailability: {} + kubeApiserverBurnrate: {} + kubeApiserverHistogram: {} + kubeApiserverSlos: {} + kubeControllerManager: {} + kubelet: {} + kubeProxy: {} + kubePrometheusGeneral: {} + kubePrometheusNodeRecording: {} + kubernetesApps: {} + kubernetesResources: {} + kubernetesStorage: {} + kubernetesSystem: {} + kubeSchedulerAlerting: {} + kubeSchedulerRecording: {} + kubeStateMetrics: {} + network: {} + node: {} + nodeExporterAlerting: {} + nodeExporterRecording: {} + prometheus: {} + prometheusOperator: {} + + ## Prefix for runbook URLs. Use this to override the first part of the runbookURLs that is common to all rules. + runbookUrl: "https://runbooks.prometheus-operator.dev/runbooks" + + ## Disabled PrometheusRule alerts + disabled: {} + +## +global: + cattle: + psp: + enabled: false + systemDefaultRegistry: "" + projectNamespaceSelector: {} + projectNamespaces: [] + kubectl: + repository: rancher/kubectl + tag: v1.20.2 + pullPolicy: IfNotPresent + securityContext: + runAsNonRoot: true + runAsUser: 1000 + networkPolicy: + # If activated, creates ingress network policies to only allow ingress traffic from workloads within the project. + # This only works correctly, if Project Network Isolation is activated for the cluster in Rancher. Otherwise, + # Ingress traffic from the nodes and thus from the Kubernetes API will be blocked, which breaks accessing the UIs + # through the Rancher/Kubernetes API Proxy in the Rancher UI. + limitIngressToProject: false + # Custom ingress restrictions. If null and limitIngressToProject=false, all ingress traffic will be allowed. + ingress: null + # By default, all egress traffic is allowed. + egress: + - {} + rbac: + ## Create RBAC resources for ServiceAccounts and users + ## + create: true + + userRoles: + ## Create default user ClusterRoles to allow users to interact with Prometheus CRs, ConfigMaps, and Secrets + ## + ## How does this work? + ## + ## The operator will watch for all subjects bound to each Kubernetes default ClusterRole in the project registration namespace + ## where the ProjectHelmChart that deployed this chart belongs to; if it observes a subject bound to a particular role in + ## the project registration namespace (e.g. edit) and if a Role exists that is deployed by this chart with the label + ## 'helm.cattle.io/project-helm-chart-role-aggregate-from': '', it will automaticaly create a RoleBinding + ## in the release namespace binding all such subjects to that Role. + ## + ## Note: while the default behavior is to use the Kubernetes default ClusterRole, the operator deployment (prometheus-federator) + ## can be configured to use a different set of ClusterRoles as the source of truth for admin, edit, and view permissions. + ## + create: true + ## Aggregate default user ClusterRoles into default k8s ClusterRoles + aggregateToDefaultRoles: true + + pspAnnotations: {} + ## Specify pod annotations + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp + ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl + ## + # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*' + # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default' + # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default' + + ## Reference to one or more secrets to be used when pulling images + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ + ## + imagePullSecrets: [] + # - name: "image-pull-secret" + # or + # - "image-pull-secret" + +windowsMonitoring: + ## Deploys the windows-exporter and Windows-specific dashboards and rules (job name must be 'windows-exporter') + enabled: false + +federate: + ## enabled indicates whether to add federation to any Project Prometheus Stacks by default + ## If not enabled, no federation will be turned on + enabled: true + + # Change this to point at all Prometheuses you want all your Project Prometheus Stacks to federate from + # By default, this matches the default deployment of Rancher Monitoring + targets: + - rancher-monitoring-prometheus.cattle-monitoring-system.svc:9090 + + ## Scrape interval + interval: "15s" + +## Configuration for alertmanager +## ref: https://prometheus.io/docs/alerting/alertmanager/ +## +alertmanager: + + ## Deploy alertmanager + ## + enabled: true + + ## Annotations for Alertmanager + ## + annotations: {} + + ## Api that prometheus will use to communicate with alertmanager. Possible values are v1, v2 + ## + apiVersion: v2 + + ## Service account for Alertmanager to use. + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/ + ## + serviceAccount: + create: true + name: "" + annotations: {} + automountServiceAccountToken: true + + ## Configure pod disruption budgets for Alertmanager + ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget + ## This configuration is immutable once created and will require the PDB to be deleted to be changed + ## https://github.com/kubernetes/kubernetes/issues/45398 + ## + podDisruptionBudget: + enabled: false + minAvailable: 1 + maxUnavailable: "" + + ## Alertmanager configuration directives + ## ref: https://prometheus.io/docs/alerting/configuration/#configuration-file + ## https://prometheus.io/webtools/alerting/routing-tree-editor/ + ## + config: + global: + resolve_timeout: 5m + inhibit_rules: + - source_matchers: + - 'severity = critical' + target_matchers: + - 'severity =~ warning|info' + equal: + - 'namespace' + - 'alertname' + - source_matchers: + - 'severity = warning' + target_matchers: + - 'severity = info' + equal: + - 'namespace' + - 'alertname' + - source_matchers: + - 'alertname = InfoInhibitor' + target_matchers: + - 'severity = info' + equal: + - 'namespace' + - target_matchers: + - 'alertname = InfoInhibitor' + route: + group_by: ['namespace'] + group_wait: 30s + group_interval: 5m + repeat_interval: 12h + receiver: 'null' + routes: + - receiver: 'null' + matchers: + - alertname =~ "InfoInhibitor|Watchdog" + receivers: + - name: 'null' + templates: + - '/etc/alertmanager/config/*.tmpl' + + ## Pass the Alertmanager configuration directives through Helm's templating + ## engine. If the Alertmanager configuration contains Alertmanager templates, + ## they'll need to be properly escaped so that they are not interpreted by + ## Helm + ## ref: https://helm.sh/docs/developing_charts/#using-the-tpl-function + ## https://prometheus.io/docs/alerting/configuration/#tmpl_string + ## https://prometheus.io/docs/alerting/notifications/ + ## https://prometheus.io/docs/alerting/notification_examples/ + tplConfig: false + + ## Alertmanager template files to format alerts + ## By default, templateFiles are placed in /etc/alertmanager/config/ and if + ## they have a .tmpl file suffix will be loaded. See config.templates above + ## to change, add other suffixes. If adding other suffixes, be sure to update + ## config.templates above to include those suffixes. + ## ref: https://prometheus.io/docs/alerting/notifications/ + ## https://prometheus.io/docs/alerting/notification_examples/ + ## + + templateFiles: + rancher_defaults.tmpl: |- + {{- define "slack.rancher.text" -}} + {{ template "rancher.text_multiple" . }} + {{- end -}} + + {{- define "rancher.text_multiple" -}} + *[GROUP - Details]* + One or more alarms in this group have triggered a notification. + + {{- if gt (len .GroupLabels.Values) 0 }} + *Group Labels:* + {{- range .GroupLabels.SortedPairs }} + • *{{ .Name }}:* `{{ .Value }}` + {{- end }} + {{- end }} + {{- if .ExternalURL }} + *Link to AlertManager:* {{ .ExternalURL }} + {{- end }} + + {{- range .Alerts }} + {{ template "rancher.text_single" . }} + {{- end }} + {{- end -}} + + {{- define "rancher.text_single" -}} + {{- if .Labels.alertname }} + *[ALERT - {{ .Labels.alertname }}]* + {{- else }} + *[ALERT]* + {{- end }} + {{- if .Labels.severity }} + *Severity:* `{{ .Labels.severity }}` + {{- end }} + {{- if .Labels.cluster }} + *Cluster:* {{ .Labels.cluster }} + {{- end }} + {{- if .Annotations.summary }} + *Summary:* {{ .Annotations.summary }} + {{- end }} + {{- if .Annotations.message }} + *Message:* {{ .Annotations.message }} + {{- end }} + {{- if .Annotations.description }} + *Description:* {{ .Annotations.description }} + {{- end }} + {{- if .Annotations.runbook_url }} + *Runbook URL:* <{{ .Annotations.runbook_url }}|:spiral_note_pad:> + {{- end }} + {{- with .Labels }} + {{- with .Remove (stringSlice "alertname" "severity" "cluster") }} + {{- if gt (len .) 0 }} + *Additional Labels:* + {{- range .SortedPairs }} + • *{{ .Name }}:* `{{ .Value }}` + {{- end }} + {{- end }} + {{- end }} + {{- end }} + {{- with .Annotations }} + {{- with .Remove (stringSlice "summary" "message" "description" "runbook_url") }} + {{- if gt (len .) 0 }} + *Additional Annotations:* + {{- range .SortedPairs }} + • *{{ .Name }}:* `{{ .Value }}` + {{- end }} + {{- end }} + {{- end }} + {{- end }} + {{- end -}} + + ingress: + enabled: false + + # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName + # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress + # ingressClassName: nginx + + annotations: {} + + labels: {} + + ## Override ingress to a different defined port on the service + # servicePort: 8081 + ## Override ingress to a different service then the default, this is useful if you need to + ## point to a specific instance of the alertmanager (eg kube-prometheus-stack-alertmanager-0) + # serviceName: kube-prometheus-stack-alertmanager-0 + + ## Hosts must be provided if Ingress is enabled. + ## + hosts: [] + # - alertmanager.domain.com + + ## Paths to use for ingress rules - one path should match the alertmanagerSpec.routePrefix + ## + paths: [] + # - / + + ## For Kubernetes >= 1.18 you should specify the pathType (determines how Ingress paths should be matched) + ## See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#better-path-matching-with-path-types + # pathType: ImplementationSpecific + + ## TLS configuration for Alertmanager Ingress + ## Secret must be manually created in the namespace + ## + tls: [] + # - secretName: alertmanager-general-tls + # hosts: + # - alertmanager.example.com + + ## Configuration for Alertmanager secret + ## + secret: + annotations: {} + + ## Configuration for Alertmanager service + ## + service: + annotations: {} + labels: {} + clusterIP: "" + + ## Port for Alertmanager Service to listen on + ## + port: 9093 + ## To be used with a proxy extraContainer port + ## + targetPort: 9093 + ## Port to expose on each node + ## Only used if service.type is 'NodePort' + ## + nodePort: 30903 + ## List of IP addresses at which the Prometheus server service is available + ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips + ## + + ## Additional ports to open for Alertmanager service + ## + additionalPorts: [] + # - name: oauth-proxy + # port: 8081 + # targetPort: 8081 + # - name: oauth-metrics + # port: 8082 + # targetPort: 8082 + + externalIPs: [] + loadBalancerIP: "" + loadBalancerSourceRanges: [] + + ## Denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints + ## + externalTrafficPolicy: Cluster + + ## Service type + ## + type: ClusterIP + + ## If true, create a serviceMonitor for alertmanager + ## + serviceMonitor: + ## Scrape interval. If not set, the Prometheus default scrape interval is used. + ## + interval: "" + selfMonitor: true + + ## proxyUrl: URL of a proxy that should be used for scraping. + ## + proxyUrl: "" + + ## scheme: HTTP scheme to use for scraping. Can be used with `tlsConfig` for example if using istio mTLS. + scheme: "" + + ## tlsConfig: TLS configuration to use when scraping the endpoint. For example if using istio mTLS. + ## Of type: https://github.com/coreos/prometheus-operator/blob/main/Documentation/api.md#tlsconfig + tlsConfig: {} + + bearerTokenFile: + + ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig + ## + metricRelabelings: [] + # - action: keep + # regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+' + # sourceLabels: [__name__] + + ## RelabelConfigs to apply to samples before scraping + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig + ## + relabelings: [] + # - sourceLabels: [__meta_kubernetes_pod_node_name] + # separator: ; + # regex: ^(.*)$ + # targetLabel: nodename + # replacement: $1 + # action: replace + + ## Settings affecting alertmanagerSpec + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#alertmanagerspec + ## + alertmanagerSpec: + ## Standard object's metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata + ## Metadata Labels and Annotations gets propagated to the Alertmanager pods. + ## + podMetadata: {} + + ## Image of Alertmanager + ## + image: + repository: rancher/mirrored-prometheus-alertmanager + tag: v0.27.0 + sha: "" + + ## If true then the user will be responsible to provide a secret with alertmanager configuration + ## So when true the config part will be ignored (including templateFiles) and the one in the secret will be used + ## + useExistingSecret: false + + ## Secrets is a list of Secrets in the same namespace as the Alertmanager object, which shall be mounted into the + ## Alertmanager Pods. The Secrets are mounted into /etc/alertmanager/secrets/. + ## + secrets: [] + + ## If false then the user will opt out of automounting API credentials. + ## + automountServiceAccountToken: true + + ## ConfigMaps is a list of ConfigMaps in the same namespace as the Alertmanager object, which shall be mounted into the Alertmanager Pods. + ## The ConfigMaps are mounted into /etc/alertmanager/configmaps/. + ## + configMaps: [] + + ## ConfigSecret is the name of a Kubernetes Secret in the same namespace as the Alertmanager object, which contains configuration for + ## this Alertmanager instance. Defaults to 'alertmanager-' The secret is mounted into /etc/alertmanager/config. + ## + # configSecret: + + ## WebTLSConfig defines the TLS parameters for HTTPS + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#alertmanagerwebspec + web: {} + + ## AlertmanagerConfigs to be selected to merge and configure Alertmanager with. + ## + alertmanagerConfigSelector: + # default ignores resources created by Rancher Monitoring + matchExpressions: + - key: release + operator: NotIn + values: + - rancher-monitoring + + ## AlermanagerConfig to be used as top level configuration + ## + alertmanagerConfiguration: {} + ## Example with select a global alertmanagerconfig + # alertmanagerConfiguration: + # name: global-alertmanager-Configuration + + ## Defines the strategy used by AlertmanagerConfig objects to match alerts. eg: + ## + alertmanagerConfigMatcherStrategy: {} + ## Example with use OnNamespace strategy + # alertmanagerConfigMatcherStrategy: + # type: OnNamespace + + ## Define Log Format + # Use logfmt (default) or json logging + logFormat: logfmt + + ## Log level for Alertmanager to be configured with. + ## + logLevel: info + + ## Size is the expected size of the alertmanager cluster. The controller will eventually make the size of the + ## running cluster equal to the expected size. + replicas: 1 + + ## Time duration Alertmanager shall retain data for. Default is '120h', and must match the regular expression + ## [0-9]+(ms|s|m|h) (milliseconds seconds minutes hours). + ## + retention: 120h + + ## Storage is the definition of how storage will be used by the Alertmanager instances. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md + ## + storage: {} + # volumeClaimTemplate: + # spec: + # storageClassName: gluster + # accessModes: ["ReadWriteOnce"] + # resources: + # requests: + # storage: 50Gi + # selector: {} + + + ## The external URL the Alertmanager instances will be available under. This is necessary to generate correct URLs. This is necessary if Alertmanager is not served from root of a DNS name. string false + ## + externalUrl: + + ## The route prefix Alertmanager registers HTTP handlers for. This is useful, if using ExternalURL and a proxy is rewriting HTTP routes of a request, and the actual ExternalURL is still true, + ## but the server serves requests under a different route prefix. For example for use with kubectl proxy. + ## + routePrefix: / + + ## scheme: HTTP scheme to use. Can be used with `tlsConfig` for example if using istio mTLS. + scheme: "" + + ## tlsConfig: TLS configuration to use when connect to the endpoint. For example if using istio mTLS. + ## Of type: https://github.com/coreos/prometheus-operator/blob/main/Documentation/api.md#tlsconfig + tlsConfig: {} + + ## If set to true all actions on the underlying managed objects are not going to be performed, except for delete actions. + ## + paused: false + + ## Define which Nodes the Pods are scheduled on. + ## ref: https://kubernetes.io/docs/user-guide/node-selection/ + ## + nodeSelector: {} + + ## Define resources requests and limits for single Pods. + ## ref: https://kubernetes.io/docs/user-guide/compute-resources/ + ## + resources: + limits: + memory: 500Mi + cpu: 1000m + requests: + memory: 100Mi + cpu: 100m + + ## Pod anti-affinity can prevent the scheduler from placing Prometheus replicas on the same node. + ## The default value "soft" means that the scheduler should *prefer* to not schedule two replica pods onto the same node but no guarantee is provided. + ## The value "hard" means that the scheduler is *required* to not schedule two replica pods onto the same node. + ## The value "" will disable pod anti-affinity so that no anti-affinity rules will be configured. + ## + podAntiAffinity: "" + + ## If anti-affinity is enabled sets the topologyKey to use for anti-affinity. + ## This can be changed to, for example, failure-domain.beta.kubernetes.io/zone + ## + podAntiAffinityTopologyKey: kubernetes.io/hostname + + ## Assign custom affinity rules to the alertmanager instance + ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ + ## + affinity: {} + # nodeAffinity: + # requiredDuringSchedulingIgnoredDuringExecution: + # nodeSelectorTerms: + # - matchExpressions: + # - key: kubernetes.io/e2e-az-name + # operator: In + # values: + # - e2e-az1 + # - e2e-az2 + + ## If specified, the pod's tolerations. + ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ + ## + tolerations: [] + # - key: "key" + # operator: "Equal" + # value: "value" + # effect: "NoSchedule" + + ## If specified, the pod's topology spread constraints. + ## ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ + ## + topologySpreadConstraints: [] + # - maxSkew: 1 + # topologyKey: topology.kubernetes.io/zone + # whenUnsatisfiable: DoNotSchedule + # labelSelector: + # matchLabels: + # app: alertmanager + + ## SecurityContext holds pod-level security attributes and common container settings. + ## This defaults to non root user with uid 1000 and gid 2000. *v1.PodSecurityContext false + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ + ## + securityContext: + runAsGroup: 2000 + runAsNonRoot: true + runAsUser: 1000 + fsGroup: 2000 + seccompProfile: + type: RuntimeDefault + + ## ListenLocal makes the Alertmanager server listen on loopback, so that it does not bind against the Pod IP. + ## Note this is only for the Alertmanager UI, not the gossip communication. + ## + listenLocal: false + + ## Containers allows injecting additional containers. This is meant to allow adding an authentication proxy to an Alertmanager pod. + ## + containers: [] + # containers: + # - name: oauth-proxy + # image: quay.io/oauth2-proxy/oauth2-proxy:v7.5.1 + # args: + # - --upstream=http://127.0.0.1:9093 + # - --http-address=0.0.0.0:8081 + # - --metrics-address=0.0.0.0:8082 + # - ... + # ports: + # - containerPort: 8081 + # name: oauth-proxy + # protocol: TCP + # - containerPort: 8082 + # name: oauth-metrics + # protocol: TCP + # resources: {} + + # Additional volumes on the output StatefulSet definition. + volumes: [] + + # Additional VolumeMounts on the output StatefulSet definition. + volumeMounts: [] + + ## InitContainers allows injecting additional initContainers. This is meant to allow doing some changes + ## (permissions, dir tree) on mounted volumes before starting prometheus + initContainers: [] + + ## Priority class assigned to the Pods + ## + priorityClassName: "" + + ## AdditionalPeers allows injecting a set of additional Alertmanagers to peer with to form a highly available cluster. + ## + additionalPeers: [] + + ## PortName to use for Alert Manager. + ## + portName: "http-web" + + ## ClusterAdvertiseAddress is the explicit address to advertise in cluster. Needs to be provided for non RFC1918 [1] (public) addresses. [1] RFC1918: https://tools.ietf.org/html/rfc1918 + ## + clusterAdvertiseAddress: false + + ## clusterGossipInterval determines interval between gossip attempts. + ## Needs to be specified as GoDuration, a time duration that can be parsed by Go’s time.ParseDuration() (e.g. 45ms, 30s, 1m, 1h20m15s) + clusterGossipInterval: "" + + ## clusterPeerTimeout determines timeout for cluster peering. + ## Needs to be specified as GoDuration, a time duration that can be parsed by Go’s time.ParseDuration() (e.g. 45ms, 30s, 1m, 1h20m15s) + clusterPeerTimeout: "" + + ## clusterPushpullInterval determines interval between pushpull attempts. + ## Needs to be specified as GoDuration, a time duration that can be parsed by Go’s time.ParseDuration() (e.g. 45ms, 30s, 1m, 1h20m15s) + clusterPushpullInterval: "" + + ## ForceEnableClusterMode ensures Alertmanager does not deactivate the cluster mode when running with a single replica. + ## Use case is e.g. spanning an Alertmanager cluster across Kubernetes clusters with a single replica in each. + forceEnableClusterMode: false + + ## Minimum number of seconds for which a newly created pod should be ready without any of its container crashing for it to + ## be considered available. Defaults to 0 (pod will be considered available as soon as it is ready). + minReadySeconds: 0 + + ## Additional configuration which is not covered by the properties above. (passed through tpl) + additionalConfig: {} + + ## Additional configuration which is not covered by the properties above. + ## Useful, if you need advanced templating inside alertmanagerSpec. + ## Otherwise, use alertmanager.alertmanagerSpec.additionalConfig (passed through tpl) + additionalConfigString: "" + + ## ExtraSecret can be used to store various data in an extra secret + ## (use it for example to store hashed basic auth credentials) + extraSecret: + ## if not set, name will be auto generated + # name: "" + annotations: {} + data: {} + # auth: | + # foo:$apr1$OFG3Xybp$ckL0FHDAkoXYIlH9.cysT0 + # someoneelse:$apr1$DMZX2Z4q$6SbQIfyuLQd.xmo/P0m2c. + +## Using default values from https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml +## +grafana: + enabled: true + namespaceOverride: "" + + ## Grafana's primary configuration + ## NOTE: values in map will be converted to ini format + ## ref: http://docs.grafana.org/installation/configuration/ + ## + grafana.ini: + users: + auto_assign_org_role: Viewer + auth: + disable_login_form: false + auth.anonymous: + enabled: true + org_role: Viewer + auth.basic: + enabled: false + dashboards: + # Modify this value to change the default dashboard shown on the main Grafana page + default_home_dashboard_path: /tmp/dashboards/rancher-default-home.json + security: + # Required to embed dashboards in Rancher Cluster Overview Dashboard on Cluster Explorer + allow_embedding: true + + deploymentStrategy: + type: Recreate + + ## ForceDeployDatasources Create datasource configmap even if grafana deployment has been disabled + ## + forceDeployDatasources: false + + ## ForceDeployDashboard Create dashboard configmap even if grafana deployment has been disabled + ## + forceDeployDashboards: false + + ## Deploy default dashboards + ## + defaultDashboardsEnabled: true + + # Additional options for defaultDashboards + defaultDashboards: + # The default namespace to place defaultDashboards within + namespace: cattle-dashboards + # Whether to create the default namespace as a Helm managed namespace or use an existing namespace + # If false, the defaultDashboards.namespace will be created as a Helm managed namespace + useExistingNamespace: false + # Whether the Helm managed namespace created by this chart should be left behind on a Helm uninstall + # If you place other dashboards in this namespace, then they will be deleted on a helm uninstall + # Ignore if useExistingNamespace is true + cleanupOnUninstall: false + + ## Timezone for the default dashboards + ## Other options are: browser or a specific timezone, i.e. Europe/Luxembourg + ## + defaultDashboardsTimezone: utc + + ## Editable flag for the default dashboards + ## + defaultDashboardsEditable: true + + adminPassword: prom-operator + + ingress: + ## If true, Grafana Ingress will be created + ## + enabled: false + + ## IngressClassName for Grafana Ingress. + ## Should be provided if Ingress is enable. + ## + # ingressClassName: nginx + + ## Annotations for Grafana Ingress + ## + annotations: {} + # kubernetes.io/ingress.class: nginx + # kubernetes.io/tls-acme: "true" + + ## Labels to be added to the Ingress + ## + labels: {} + + ## Hostnames. + ## Must be provided if Ingress is enable. + ## + # hosts: + # - grafana.domain.com + hosts: [] + + ## Path for grafana ingress + path: / + + ## TLS configuration for grafana Ingress + ## Secret must be manually created in the namespace + ## + tls: [] + # - secretName: grafana-general-tls + # hosts: + # - grafana.example.com + + # # To make Grafana persistent (Using Statefulset) + # # + # persistence: + # enabled: true + # type: sts + # storageClassName: "storageClassName" + # accessModes: + # - ReadWriteOnce + # size: 20Gi + # finalizers: + # - kubernetes.io/pvc-protection + + serviceAccount: + create: true + autoMount: true + + sidecar: + dashboards: + enabled: true + label: grafana_dashboard + searchNamespace: cattle-dashboards + labelValue: "1" + + # Support for new table panels, when enabled grafana auto migrates the old table panels to newer table panels + enableNewTablePanelSyntax: false + + ## Annotations for Grafana dashboard configmaps + ## + annotations: {} + multicluster: + global: + enabled: false + etcd: + enabled: false + provider: + allowUiUpdates: false + datasources: + enabled: true + defaultDatasourceEnabled: true + isDefaultDatasource: true + + uid: prometheus + + ## URL of prometheus datasource + ## + # url: http://prometheus-stack-prometheus:9090/ + + ## Prometheus request timeout in seconds + # timeout: 30 + + # If not defined, will use prometheus.prometheusSpec.scrapeInterval or its default + # defaultDatasourceScrapeInterval: 15s + + ## Annotations for Grafana datasource configmaps + ## + annotations: {} + + ## Set method for HTTP to send query to datasource + httpMethod: POST + + ## Create datasource for each Pod of Prometheus StatefulSet; + ## this uses headless service `prometheus-operated` which is + ## created by Prometheus Operator + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/0fee93e12dc7c2ea1218f19ae25ec6b893460590/pkg/prometheus/statefulset.go#L255-L286 + createPrometheusReplicasDatasources: false + label: grafana_datasource + labelValue: "1" + + ## Field with internal link pointing to existing data source in Grafana. + ## Can be provisioned via additionalDataSources + exemplarTraceIdDestinations: {} + # datasourceUid: Jaeger + # traceIdLabelName: trace_id + alertmanager: + enabled: true + uid: alertmanager + handleGrafanaManagedAlerts: false + implementation: prometheus + + extraConfigmapMounts: [] + # - name: certs-configmap + # mountPath: /etc/grafana/ssl/ + # configMap: certs-configmap + # readOnly: true + + deleteDatasources: [] + # - name: example-datasource + # orgId: 1 + + ## Configure additional grafana datasources (passed through tpl) + ## ref: http://docs.grafana.org/administration/provisioning/#datasources + additionalDataSources: [] + # - name: prometheus-sample + # access: proxy + # basicAuth: true + # basicAuthPassword: pass + # basicAuthUser: daco + # editable: false + # jsonData: + # tlsSkipVerify: true + # orgId: 1 + # type: prometheus + # url: https://{{ printf "%s-prometheus.svc" .Release.Name }}:9090 + # version: 1 + + ## Passed to grafana subchart and used by servicemonitor below + ## + service: + portName: nginx-http + ## Port for Grafana Service to listen on + ## + port: 80 + ## To be used with a proxy extraContainer port + ## + targetPort: 8080 + ## Port to expose on each node + ## Only used if service.type is 'NodePort' + ## + nodePort: 30950 + ## Service type + ## + type: ClusterIP + + proxy: + image: + repository: rancher/mirrored-library-nginx + tag: 1.24.0-alpine + + ## Enable an Specify container in extraContainers. This is meant to allow adding an authentication proxy to a grafana pod + extraContainers: | + - name: grafana-proxy + args: + - nginx + - -g + - daemon off; + - -c + - /nginx/nginx.conf + image: "{{ template "system_default_registry" . }}{{ .Values.proxy.image.repository }}:{{ .Values.proxy.image.tag }}" + ports: + - containerPort: 8080 + name: nginx-http + protocol: TCP + volumeMounts: + - mountPath: /nginx + name: grafana-nginx + - mountPath: /var/cache/nginx + name: nginx-home + securityContext: + runAsUser: 101 + runAsGroup: 101 + + ## Volumes that can be used in containers + extraContainerVolumes: + - name: nginx-home + emptyDir: {} + - name: grafana-nginx + configMap: + name: grafana-nginx-proxy-config + items: + - key: nginx.conf + mode: 438 + path: nginx.conf + + ## If true, create a serviceMonitor for grafana + ## + serviceMonitor: + # If true, a ServiceMonitor CRD is created for a prometheus operator + # https://github.com/coreos/prometheus-operator + # + enabled: true + + # Path to use for scraping metrics. Might be different if server.root_url is set + # in grafana.ini + path: "/metrics" + + # namespace: monitoring (defaults to use the namespace this chart is deployed to) + + # labels for the ServiceMonitor + labels: {} + + # Scrape interval. If not set, the Prometheus default scrape interval is used. + # + interval: "" + scheme: http + tlsConfig: {} + scrapeTimeout: 30s + + ## RelabelConfigs to apply to samples before scraping + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig + ## + relabelings: [] + # - sourceLabels: [__meta_kubernetes_pod_node_name] + # separator: ; + # regex: ^(.*)$ + # targetLabel: nodename + # replacement: $1 + # action: replace + + resources: + limits: + memory: 200Mi + cpu: 200m + requests: + memory: 100Mi + cpu: 100m + + testFramework: + enabled: false + +## Deploy a Prometheus instance +## +prometheus: + + enabled: true + + ## Annotations for Prometheus + ## + annotations: {} + + ## Service account for Prometheuses to use. + ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/ + ## + serviceAccount: + create: true + name: "" + annotations: {} + + ## Configuration for Prometheus service + ## + service: + annotations: {} + labels: {} + clusterIP: "" + + ## Port for Prometheus Service to listen on + ## + port: 9090 + + ## To be used with a proxy extraContainer port + targetPort: 8081 + + ## Port for Prometheus Reloader to listen on + ## + reloaderWebPort: 8080 + + ## List of IP addresses at which the Prometheus server service is available + ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips + ## + externalIPs: [] + + ## Port to expose on each node + ## Only used if service.type is 'NodePort' + ## + nodePort: 30090 + + ## Loadbalancer IP + ## Only use if service.type is "LoadBalancer" + loadBalancerIP: "" + loadBalancerSourceRanges: [] + + ## Denotes if this Service desires to route external traffic to node-local or cluster-wide endpoints + ## + externalTrafficPolicy: Cluster + + ## Service type + ## + type: ClusterIP + + ## Additional ports to open for Prometheus service + ## + additionalPorts: [] + # additionalPorts: + # - name: oauth-proxy + # port: 8081 + # targetPort: 8081 + # - name: oauth-metrics + # port: 8082 + # targetPort: 8082 + + ## Consider that all endpoints are considered "ready" even if the Pods themselves are not + ## Ref: https://kubernetes.io/docs/reference/kubernetes-api/service-resources/service-v1/#ServiceSpec + publishNotReadyAddresses: false + + sessionAffinity: "" + + ## Configure pod disruption budgets for Prometheus + ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/#specifying-a-poddisruptionbudget + ## This configuration is immutable once created and will require the PDB to be deleted to be changed + ## https://github.com/kubernetes/kubernetes/issues/45398 + ## + podDisruptionBudget: + enabled: false + minAvailable: 1 + maxUnavailable: "" + + ## ExtraSecret can be used to store various data in an extra secret + ## (use it for example to store hashed basic auth credentials) + extraSecret: + ## if not set, name will be auto generated + # name: "" + annotations: {} + data: {} + # auth: | + # foo:$apr1$OFG3Xybp$ckL0FHDAkoXYIlH9.cysT0 + # someoneelse:$apr1$DMZX2Z4q$6SbQIfyuLQd.xmo/P0m2c. + + ingress: + enabled: false + + # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName + # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress + # ingressClassName: nginx + + annotations: {} + labels: {} + + ## Redirect ingress to an additional defined port on the service + # servicePort: 8081 + + ## Hostnames. + ## Must be provided if Ingress is enabled. + ## + # hosts: + # - prometheus.domain.com + hosts: [] + + ## Paths to use for ingress rules - one path should match the prometheusSpec.routePrefix + ## + paths: [] + # - / + + ## For Kubernetes >= 1.18 you should specify the pathType (determines how Ingress paths should be matched) + ## See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#better-path-matching-with-path-types + # pathType: ImplementationSpecific + + ## TLS configuration for Prometheus Ingress + ## Secret must be manually created in the namespace + ## + tls: [] + # - secretName: prometheus-general-tls + # hosts: + # - prometheus.example.com + + ## Configure additional options for default pod security policy for Prometheus + ## ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/ + podSecurityPolicy: + allowedCapabilities: [] + volumes: [] + + serviceMonitor: + ## Scrape interval. If not set, the Prometheus default scrape interval is used. + ## + interval: "" + selfMonitor: true + + ## scheme: HTTP scheme to use for scraping. Can be used with `tlsConfig` for example if using istio mTLS. + scheme: "" + + ## tlsConfig: TLS configuration to use when scraping the endpoint. For example if using istio mTLS. + ## Of type: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#tlsconfig + tlsConfig: {} + + bearerTokenFile: + + ## Metric relabel configs to apply to samples before ingestion. + ## + metricRelabelings: [] + # - action: keep + # regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+' + # sourceLabels: [__name__] + + # relabel configs to apply to samples before ingestion. + ## + relabelings: [] + # - sourceLabels: [__meta_kubernetes_pod_node_name] + # separator: ; + # regex: ^(.*)$ + # targetLabel: nodename + # replacement: $1 + # action: replace + + ## Settings affecting prometheusSpec + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#prometheusspec + ## + prometheusSpec: + ## If true, pass --storage.tsdb.max-block-duration=2h to prometheus. This is already done if using Thanos + ## + disableCompaction: false + ## APIServerConfig + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#apiserverconfig + ## + apiserverConfig: {} + + ## Interval between consecutive scrapes. + ## Defaults to 30s. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/release-0.44/pkg/prometheus/promcfg.go#L180-L183 + ## + scrapeInterval: "30s" + + ## Number of seconds to wait for target to respond before erroring + ## + scrapeTimeout: "" + + ## Interval between consecutive evaluations. + ## + evaluationInterval: "1m" + + ## ListenLocal makes the Prometheus server listen on loopback, so that it does not bind against the Pod IP. + ## + listenLocal: false + + ## EnableAdminAPI enables Prometheus the administrative HTTP API which includes functionality such as deleting time series. + ## This is disabled by default. + ## ref: https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-admin-apis + ## + enableAdminAPI: false + + ## WebTLSConfig defines the TLS parameters for HTTPS + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#webtlsconfig + web: {} + + ## Exemplars related settings that are runtime reloadable. + ## It requires to enable the exemplar storage feature to be effective. + exemplars: "" + ## Maximum number of exemplars stored in memory for all series. + ## If not set, Prometheus uses its default value. + ## A value of zero or less than zero disables the storage. + # maxSize: 100000 + + # EnableFeatures API enables access to Prometheus disabled features. + # ref: https://prometheus.io/docs/prometheus/latest/disabled_features/ + enableFeatures: [] + # - exemplar-storage + + ## Image of Prometheus. + ## + image: + repository: rancher/mirrored-prometheus-prometheus + tag: v2.50.1 + sha: "" + + ## Tolerations for use with node taints + ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/ + ## + tolerations: [] + # - key: "key" + # operator: "Equal" + # value: "value" + # effect: "NoSchedule" + + ## If specified, the pod's topology spread constraints. + ## ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ + ## + topologySpreadConstraints: [] + # - maxSkew: 1 + # topologyKey: topology.kubernetes.io/zone + # whenUnsatisfiable: DoNotSchedule + # labelSelector: + # matchLabels: + # app: prometheus + + ## Alertmanagers to which alerts will be sent + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#alertmanagerendpoints + ## + ## Default configuration will connect to the alertmanager deployed as part of this release + ## + alertingEndpoints: [] + # - name: "" + # namespace: "" + # port: http + # scheme: http + # pathPrefix: "" + # tlsConfig: {} + # bearerTokenFile: "" + # apiVersion: v2 + + ## External labels to add to any time series or alerts when communicating with external systems + ## + externalLabels: {} + + ## enable --web.enable-remote-write-receiver flag on prometheus-server + ## + enableRemoteWriteReceiver: false + + ## Name of the external label used to denote Prometheus instance name + ## + prometheusExternalLabelName: "" + + ## If true, the Operator won't add the external label used to denote Prometheus instance name + ## + prometheusExternalLabelNameClear: false + + ## External URL at which Prometheus will be reachable. + ## + externalUrl: "" + + ## Define which Nodes the Pods are scheduled on. + ## ref: https://kubernetes.io/docs/user-guide/node-selection/ + ## + nodeSelector: {} + + ## Secrets is a list of Secrets in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. + ## The Secrets are mounted into /etc/prometheus/secrets/. Secrets changes after initial creation of a Prometheus object are not + ## reflected in the running Pods. To change the secrets mounted into the Prometheus Pods, the object must be deleted and recreated + ## with the new list of secrets. + ## + secrets: [] + + ## ConfigMaps is a list of ConfigMaps in the same namespace as the Prometheus object, which shall be mounted into the Prometheus Pods. + ## The ConfigMaps are mounted into /etc/prometheus/configmaps/. + ## + configMaps: [] + + ## QuerySpec defines the query command line flags when starting Prometheus. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#queryspec + ## + query: {} + + ## If true, a nil or {} value for prometheus.prometheusSpec.ruleSelector will cause the + ## prometheus resource to be created with selectors based on values in the helm deployment, + ## which will also match the PrometheusRule resources created + ## + ruleSelectorNilUsesHelmValues: false + + ## PrometheusRules to be selected for target discovery. + ## If {}, select all PrometheusRules + ## + ruleSelector: + # default ignores resources created by Rancher Monitoring + matchExpressions: + - key: release + operator: NotIn + values: + - rancher-monitoring + + ## If true, a nil or {} value for prometheus.prometheusSpec.serviceMonitorSelector will cause the + ## prometheus resource to be created with selectors based on values in the helm deployment, + ## which will also match the servicemonitors created + ## + serviceMonitorSelectorNilUsesHelmValues: false + + ## ServiceMonitors to be selected for target discovery. + ## If {}, select all ServiceMonitors + ## + serviceMonitorSelector: + # default ignores resources created by Rancher Monitoring + matchExpressions: + - key: release + operator: NotIn + values: + - rancher-monitoring + + ## If true, a nil or {} value for prometheus.prometheusSpec.podMonitorSelector will cause the + ## prometheus resource to be created with selectors based on values in the helm deployment, + ## which will also match the podmonitors created + ## + podMonitorSelectorNilUsesHelmValues: false + + ## PodMonitors to be selected for target discovery. + ## If {}, select all PodMonitors + ## + podMonitorSelector: + # default ignores resources created by Rancher Monitoring + matchExpressions: + - key: release + operator: NotIn + values: + - rancher-monitoring + + ## If true, a nil or {} value for prometheus.prometheusSpec.probeSelector will cause the + ## prometheus resource to be created with selectors based on values in the helm deployment, + ## which will also match the probes created + ## + probeSelectorNilUsesHelmValues: true + + ## Probes to be selected for target discovery. + ## If {}, select all Probes + ## + probeSelector: + # default ignores resources created by Rancher Monitoring + matchExpressions: + - key: release + operator: NotIn + values: + - rancher-monitoring + + ## How long to retain metrics + ## + retention: 10d + + ## Maximum size of metrics + ## + retentionSize: "50GiB" + + ## Enable compression of the write-ahead log using Snappy. + ## + walCompression: true + + ## If true, the Operator won't process any Prometheus configuration changes + ## + paused: false + + ## Number of replicas of each shard to deploy for a Prometheus deployment. + ## Number of replicas multiplied by shards is the total number of Pods created. + ## + replicas: 1 + + ## EXPERIMENTAL: Number of shards to distribute targets onto. + ## Number of replicas multiplied by shards is the total number of Pods created. + ## Note that scaling down shards will not reshard data onto remaining instances, it must be manually moved. + ## Increasing shards will not reshard data either but it will continue to be available from the same instances. + ## To query globally use Thanos sidecar and Thanos querier or remote write data to a central location. + ## Sharding is done on the content of the `__address__` target meta-label. + ## + shards: 1 + + ## Log level for Prometheus be configured in + ## + logLevel: info + + ## Log format for Prometheus be configured in + ## + logFormat: logfmt + + ## Prefix used to register routes, overriding externalUrl route. + ## Useful for proxies that rewrite URLs. + ## + routePrefix: / + + ## Standard object's metadata. More info: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#metadata + ## Metadata Labels and Annotations gets propagated to the prometheus pods. + ## + podMetadata: {} + # labels: + # app: prometheus + # k8s-app: prometheus + + ## Pod anti-affinity can prevent the scheduler from placing Prometheus replicas on the same node. + ## The default value "soft" means that the scheduler should *prefer* to not schedule two replica pods onto the same node but no guarantee is provided. + ## The value "hard" means that the scheduler is *required* to not schedule two replica pods onto the same node. + ## The value "" will disable pod anti-affinity so that no anti-affinity rules will be configured. + podAntiAffinity: "" + + ## If anti-affinity is enabled sets the topologyKey to use for anti-affinity. + ## This can be changed to, for example, failure-domain.beta.kubernetes.io/zone + ## + podAntiAffinityTopologyKey: kubernetes.io/hostname + + ## Assign custom affinity rules to the prometheus instance + ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/ + ## + affinity: {} + # nodeAffinity: + # requiredDuringSchedulingIgnoredDuringExecution: + # nodeSelectorTerms: + # - matchExpressions: + # - key: kubernetes.io/e2e-az-name + # operator: In + # values: + # - e2e-az1 + # - e2e-az2 + + ## The remote_write spec configuration for Prometheus. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#remotewritespec + remoteWrite: [] + # - url: http://remote1/push + ## additionalRemoteWrite is appended to remoteWrite + additionalRemoteWrite: [] + + ## Enable/Disable Grafana dashboards provisioning for prometheus remote write feature + remoteWriteDashboards: false + + ## Resource limits & requests + ## + resources: + limits: + memory: 3000Mi + cpu: 1000m + requests: + memory: 750Mi + cpu: 750m + + storage: + enabled: false + ## Prometheus StorageSpec for persistent data + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/storage.md + ## + storageSpec: {} + ## Using PersistentVolumeClaim + ## + # volumeClaimTemplate: + # spec: + # storageClassName: gluster + # accessModes: ["ReadWriteOnce"] + # resources: + # requests: + # storage: 50Gi + # selector: {} + + ## Using tmpfs volume + ## + # emptyDir: + # medium: Memory + + # Additional volumes on the output StatefulSet definition. + volumes: + - name: nginx-home + emptyDir: {} + - name: prometheus-nginx + configMap: + name: prometheus-nginx-proxy-config + defaultMode: 438 + + # Additional VolumeMounts on the output StatefulSet definition. + volumeMounts: [] + + ## AdditionalScrapeConfigs allows specifying additional Prometheus scrape configurations. Scrape configurations + ## are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form + ## as specified in the official Prometheus documentation: + ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are + ## appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility + ## to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible + ## scrape configs are going to break Prometheus after the upgrade. + ## AdditionalScrapeConfigs can be defined as a list or as a templated string. + ## + ## The scrape configuration example below will find master nodes, provided they have the name .*mst.*, relabel the + ## port to 2379 and allow etcd scraping provided it is running on all Kubernetes master nodes + ## + additionalScrapeConfigs: [] + # - job_name: kube-etcd + # kubernetes_sd_configs: + # - role: node + # scheme: https + # tls_config: + # ca_file: /etc/prometheus/secrets/etcd-client-cert/etcd-ca + # cert_file: /etc/prometheus/secrets/etcd-client-cert/etcd-client + # key_file: /etc/prometheus/secrets/etcd-client-cert/etcd-client-key + # relabel_configs: + # - action: labelmap + # regex: __meta_kubernetes_node_label_(.+) + # - source_labels: [__address__] + # action: replace + # targetLabel: __address__ + # regex: ([^:;]+):(\d+) + # replacement: ${1}:2379 + # - source_labels: [__meta_kubernetes_node_name] + # action: keep + # regex: .*mst.* + # - source_labels: [__meta_kubernetes_node_name] + # action: replace + # targetLabel: node + # regex: (.*) + # replacement: ${1} + # metric_relabel_configs: + # - regex: (kubernetes_io_hostname|failure_domain_beta_kubernetes_io_region|beta_kubernetes_io_os|beta_kubernetes_io_arch|beta_kubernetes_io_instance_type|failure_domain_beta_kubernetes_io_zone) + # action: labeldrop + # + ## If scrape config contains a repetitive section, you may want to use a template. + ## In the following example, you can see how to define `gce_sd_configs` for multiple zones + # additionalScrapeConfigs: | + # - job_name: "node-exporter" + # gce_sd_configs: + # {{range $zone := .Values.gcp_zones}} + # - project: "project1" + # zone: "{{$zone}}" + # port: 9100 + # {{end}} + # relabel_configs: + # ... + + + ## If additional scrape configurations are already deployed in a single secret file you can use this section. + ## Expected values are the secret name and key + ## Cannot be used with additionalScrapeConfigs + additionalScrapeConfigsSecret: {} + # enabled: false + # name: + # key: + + ## additionalPrometheusSecretsAnnotations allows to add annotations to the kubernetes secret. This can be useful + ## when deploying via spinnaker to disable versioning on the secret, strategy.spinnaker.io/versioned: 'false' + additionalPrometheusSecretsAnnotations: {} + + ## AdditionalAlertManagerConfigs allows for manual configuration of alertmanager jobs in the form as specified + ## in the official Prometheus documentation https://prometheus.io/docs/prometheus/latest/configuration/configuration/#. + ## AlertManager configurations specified are appended to the configurations generated by the Prometheus Operator. + ## As AlertManager configs are appended, the user is responsible to make sure it is valid. Note that using this + ## feature may expose the possibility to break upgrades of Prometheus. It is advised to review Prometheus release + ## notes to ensure that no incompatible AlertManager configs are going to break Prometheus after the upgrade. + ## + additionalAlertManagerConfigs: [] + # - consul_sd_configs: + # - server: consul.dev.test:8500 + # scheme: http + # datacenter: dev + # tag_separator: ',' + # services: + # - metrics-prometheus-alertmanager + + ## If additional alertmanager configurations are already deployed in a single secret, or you want to manage + ## them separately from the helm deployment, you can use this section. + ## Expected values are the secret name and key + ## Cannot be used with additionalAlertManagerConfigs + additionalAlertManagerConfigsSecret: {} + # name: + # key: + # optional: false + + ## AdditionalAlertRelabelConfigs allows specifying Prometheus alert relabel configurations. Alert relabel configurations specified are appended + ## to the configurations generated by the Prometheus Operator. Alert relabel configurations specified must have the form as specified in the + ## official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alert_relabel_configs. + ## As alert relabel configs are appended, the user is responsible to make sure it is valid. Note that using this feature may expose the + ## possibility to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible alert relabel + ## configs are going to break Prometheus after the upgrade. + ## + additionalAlertRelabelConfigs: [] + # - separator: ; + # regex: prometheus_replica + # replacement: $1 + # action: labeldrop + + ## If additional alert relabel configurations are already deployed in a single secret, or you want to manage + ## them separately from the helm deployment, you can use this section. + ## Expected values are the secret name and key + ## Cannot be used with additionalAlertRelabelConfigs + additionalAlertRelabelConfigsSecret: {} + # name: + # key: + + ## SecurityContext holds pod-level security attributes and common container settings. + ## This defaults to non root user with uid 1000 and gid 2000. + ## https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md + ## + securityContext: + runAsGroup: 2000 + runAsNonRoot: true + runAsUser: 1000 + fsGroup: 2000 + seccompProfile: + type: RuntimeDefault + + ## Priority class assigned to the Pods + ## + priorityClassName: "" + + ## Thanos configuration allows configuring various aspects of a Prometheus server in a Thanos environment. + ## This section is experimental, it may change significantly without deprecation notice in any release. + ## This is experimental and may change significantly without backward compatibility in any release. + ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#thanosspec + ## + thanos: {} + # secretProviderClass: + # provider: gcp + # parameters: + # secrets: | + # - resourceName: "projects/$PROJECT_ID/secrets/testsecret/versions/latest" + # fileName: "objstore.yaml" + ## ObjectStorageConfig configures object storage in Thanos. + # objectStorageConfig: + # # use existing secret, if configured, objectStorageConfig.secret will not be used + # existingSecret: {} + # # name: "" + # # key: "" + # # will render objectStorageConfig secret data and configure it to be used by Thanos custom resource, + # # ignored when prometheusspec.thanos.objectStorageConfig.existingSecret is set + # # https://thanos.io/tip/thanos/storage.md/#s3 + # secret: {} + # # type: S3 + # # config: + # # bucket: "" + # # endpoint: "" + # # region: "" + # # access_key: "" + # # secret_key: "" + + proxy: + image: + repository: rancher/mirrored-library-nginx + tag: 1.24.0-alpine + + ## Containers allows injecting additional containers. This is meant to allow adding an authentication proxy to a Prometheus pod. + ## if using proxy extraContainer update targetPort with proxy container port + containers: | + - name: prometheus-proxy + args: + - nginx + - -g + - daemon off; + - -c + - /nginx/nginx.conf + image: "{{ template "system_default_registry" . }}{{ .Values.prometheus.prometheusSpec.proxy.image.repository }}:{{ .Values.prometheus.prometheusSpec.proxy.image.tag }}" + ports: + - containerPort: 8081 + name: nginx-http + protocol: TCP + volumeMounts: + - mountPath: /nginx + name: prometheus-nginx + - mountPath: /var/cache/nginx + name: nginx-home + securityContext: + runAsUser: 101 + runAsGroup: 101 + + ## InitContainers allows injecting additional initContainers. This is meant to allow doing some changes + ## (permissions, dir tree) on mounted volumes before starting prometheus + initContainers: [] + + ## PortName to use for Prometheus. + ## + portName: "http-web" + + ## ArbitraryFSAccessThroughSMs configures whether configuration based on a service monitor can access arbitrary files + ## on the file system of the Prometheus container e.g. bearer token files. + arbitraryFSAccessThroughSMs: false + + ## OverrideHonorLabels if set to true overrides all user configured honor_labels. If HonorLabels is set in ServiceMonitor + ## or PodMonitor to true, this overrides honor_labels to false. + overrideHonorLabels: false + + ## OverrideHonorTimestamps allows to globally enforce honoring timestamps in all scrape configs. + overrideHonorTimestamps: false + + ## When ignoreNamespaceSelectors is set to true, namespaceSelector from all PodMonitor, ServiceMonitor and Probe objects will be ignored, + ## they will only discover targets within the namespace of the PodMonitor, ServiceMonitor and Probe object, + ## and servicemonitors will be installed in the default service namespace. + ## Defaults to false. + ignoreNamespaceSelectors: false + + ## EnforcedNamespaceLabel enforces adding a namespace label of origin for each alert and metric that is user created. + ## The label value will always be the namespace of the object that is being created. + ## Disabled by default + enforcedNamespaceLabel: "" + + ## PrometheusRulesExcludedFromEnforce - list of prometheus rules to be excluded from enforcing of adding namespace labels. + ## Works only if enforcedNamespaceLabel set to true. Make sure both ruleNamespace and ruleName are set for each pair + ## Deprecated, use `excludedFromEnforcement` instead + prometheusRulesExcludedFromEnforce: [] + + ## ExcludedFromEnforcement - list of object references to PodMonitor, ServiceMonitor, Probe and PrometheusRule objects + ## to be excluded from enforcing a namespace label of origin. + ## Works only if enforcedNamespaceLabel set to true. + ## See https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#objectreference + excludedFromEnforcement: [] + + ## QueryLogFile specifies the file to which PromQL queries are logged. Note that this location must be writable, + ## and can be persisted using an attached volume. Alternatively, the location can be set to a stdout location such + ## as /dev/stdout to log querie information to the default Prometheus log stream. This is only available in versions + ## of Prometheus >= 2.16.0. For more details, see the Prometheus docs (https://prometheus.io/docs/guides/query-log/) + queryLogFile: false + + ## EnforcedSampleLimit defines global limit on number of scraped samples that will be accepted. This overrides any SampleLimit + ## set per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the SampleLimit to keep overall + ## number of samples/series under the desired limit. Note that if SampleLimit is lower that value will be taken instead. + enforcedSampleLimit: false + + ## EnforcedTargetLimit defines a global limit on the number of scraped targets. This overrides any TargetLimit set + ## per ServiceMonitor or/and PodMonitor. It is meant to be used by admins to enforce the TargetLimit to keep the overall + ## number of targets under the desired limit. Note that if TargetLimit is lower, that value will be taken instead, except + ## if either value is zero, in which case the non-zero value will be used. If both values are zero, no limit is enforced. + enforcedTargetLimit: false + + ## Per-scrape limit on number of labels that will be accepted for a sample. If more than this number of labels are present + ## post metric-relabeling, the entire scrape will be treated as failed. 0 means no limit. Only valid in Prometheus versions + ## 2.27.0 and newer. + enforcedLabelLimit: false + + ## Per-scrape limit on length of labels name that will be accepted for a sample. If a label name is longer than this number + ## post metric-relabeling, the entire scrape will be treated as failed. 0 means no limit. Only valid in Prometheus versions + ## 2.27.0 and newer. + enforcedLabelNameLengthLimit: false + + ## Per-scrape limit on length of labels value that will be accepted for a sample. If a label value is longer than this + ## number post metric-relabeling, the entire scrape will be treated as failed. 0 means no limit. Only valid in Prometheus + ## versions 2.27.0 and newer. + enforcedLabelValueLengthLimit: false + + ## AllowOverlappingBlocks enables vertical compaction and vertical query merge in Prometheus. This is still experimental + ## in Prometheus so it may change in any upcoming release. + allowOverlappingBlocks: false + + ## Minimum number of seconds for which a newly created pod should be ready without any of its container crashing for it to + ## be considered available. Defaults to 0 (pod will be considered available as soon as it is ready). + minReadySeconds: 0 + + # Required for use in managed kubernetes clusters (such as AWS EKS) with custom CNI (such as calico), + # because control-plane managed by AWS cannot communicate with pods' IP CIDR and admission webhooks are not working + # Use the host's network namespace if true. Make sure to understand the security implications if you want to enable it. + # When hostNetwork is enabled, this will set dnsPolicy to ClusterFirstWithHostNet automatically. + hostNetwork: false + + # HostAlias holds the mapping between IP and hostnames that will be injected + # as an entry in the pod’s hosts file. + hostAliases: [] + # - ip: 10.10.0.100 + # hostnames: + # - a1.app.local + # - b1.app.local + + additionalRulesForClusterRole: [] + # - apiGroups: [ "" ] + # resources: + # - nodes/proxy + # verbs: [ "get", "list", "watch" ] + + additionalServiceMonitors: [] + ## Name of the ServiceMonitor to create + ## + # - name: "" + + ## Additional labels to set used for the ServiceMonitorSelector. Together with standard labels from + ## the chart + ## + # additionalLabels: {} + + ## Service label for use in assembling a job name of the form