Commit 535cfb6

Merge pull request #83 from SamYuan1990/zhkepler-helm

update install for zh

3 files changed: +251 −0 lines changed
# Kepler Community Operator on OpenShift

## Requirements

Make sure you have:

- An OCP 4.13 cluster
- A user with `kubeadmin` or `cluster-admin` privileges
- The `oc` CLI
- A clone of the `kepler-operator` [repository](https://github.com/sustainable-computing-io/kepler-operator):

```sh
git clone https://github.com/sustainable-computing-io/kepler-operator.git
cd kepler-operator
```
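Before moving on, you can spot-check the prerequisites from the command line (a sketch; it assumes you are already logged in to the cluster):

```shell
# Sketch: confirm prerequisites (assumes an active oc login to the OCP cluster).
oc version --client                       # oc CLI is installed
oc whoami                                 # current user
oc auth can-i '*' '*' --all-namespaces    # prints "yes" for cluster-admin
```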
---

## Install the operator from OperatorHub

1. Select Operators > OperatorHub and search for `Kepler`. Click `Install`.
![](../fig/ocp_installation/operator_installation_ocp_1.png)

2. Approve the installation.
![](../fig/ocp_installation/operator_installation_ocp_7.png)

3. Create the Kepler Custom Resource.
![](../fig/ocp_installation/operator_installation_ocp_2.png)

> Note: the current OCP console may show a JavaScript error (expected to be fixed in 4.13.5), but it does not affect the remaining steps. The fix is already available in OCP console build 4.13.0-0.nightly-2023-07-08-165124.

---
## Install the Grafana operator

### Deploy the Grafana operator

The current API bearer token needs to be updated in the `GrafanaDataSource` manifest so that the `Grafana DataSource` can authenticate against Prometheus. The commands below update the manifest and deploy the Grafana operator in the `kepler-operator-system` namespace:

```sh
BEARER_TOKEN=$(oc whoami --show-token)
hack/dashboard/openshift/deploy-grafana.sh
```

> Note: the script expects to be run from the top-level directory, so make sure you are in the `kepler-operator` root. You can get there with `cd $(git rev-parse --show-toplevel)`.
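Once the script completes, you can locate the Grafana route that was created (a sketch; the exact route name may differ in your deployment):

```shell
# Sketch: list the routes in kepler-operator-system to find the Grafana URL
# (namespace taken from the text above; route name may vary).
oc get routes -n kepler-operator-system
```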
### Access the Grafana console

Check Networking > Routes.
![](../fig/ocp_installation/operator_installation_ocp_5.png)

### Grafana dashboard

Log in to the Grafana dashboard with the credentials `kepler:kepler`.
![](../fig/ocp_installation/operator_installation_ocp_6.png)

---

## Troubleshooting

> Note: if the data source runs into problems, check whether the API token was updated correctly.

![](../fig/ocp_installation/operator_installation_ocp_3.png)

docs/installation/kepler-helm.zh.md

# Deploy Kepler with the Helm chart

Kepler's Helm chart is available on [GitHub](https://github.com/sustainable-computing-io/kepler-helm-chart/tree/main) and [ArtifactHub](https://artifacthub.io/packages/helm/kepler/kepler).

## Install Helm

As a prerequisite, you must install [Helm](https://helm.sh) before you can install Kepler from the Helm chart.
See the Helm [documentation](https://helm.sh/docs/) for installation instructions.

## Add the Kepler Helm repository

Run:

```bash
helm repo add kepler https://sustainable-computing-io.github.io/kepler-helm-chart
```

You can find the latest release with:

```bash
helm search repo kepler
```

To test the installation and inspect the generated configuration, run a dry run:

```bash
helm install kepler kepler/kepler --namespace kepler --create-namespace --dry-run --devel
```
## Install Kepler

Run:

```bash
helm install kepler kepler/kepler --namespace kepler --create-namespace
```

> You may need to override values in [values.yaml](https://github.com/sustainable-computing-io/kepler-helm-chart/blob/main/chart/kepler/values.yaml) to match your environment.

Apply your overrides with:

```bash
helm install kepler kepler/kepler --values values.yaml --namespace kepler --create-namespace
```

The table below lists the configuration parameters and their defaults.

Parameter|Description|Default
---|---|---
global.namespace|Kubernetes namespace for Kepler|kepler
image.repository|Repository for the Kepler image|quay.io/sustainable\_computing\_io/kepler
image.pullPolicy|Pull policy for Kepler|Always
image.tag|Image tag for the Kepler image|latest
serviceAccount.name|Service account name for Kepler|kepler-sa
service.type|Kepler service type|ClusterIP
service.port|Kepler service exposed port|9102
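As an illustration, a `values.yaml` overriding a few of these parameters might look like this (a sketch; keys are taken from the parameter table, the chosen values are examples only):

```yaml
# Sketch: example overrides for the Kepler chart; adjust to your environment.
image:
  repository: quay.io/sustainable_computing_io/kepler
  tag: latest
  pullPolicy: IfNotPresent   # default is Always
service:
  type: NodePort             # default is ClusterIP
  port: 9102
```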
## Uninstall Kepler

To uninstall, run:

```bash
helm uninstall kepler --namespace kepler
```
# Install on Kind with the Kepler Operator

## Requirements

Before you start, make sure you have:

- `kubectl`
- A clone of the `kepler-operator` [repository](https://github.com/sustainable-computing-io/kepler-operator)
- A target k8s cluster. You can use Kind to spin up a simple [local cluster for testing](#run-a-kind-cluster-locally) and follow this tutorial, or run against your remote cluster. Note that the controller automatically uses the current kubeconfig context; you can check it with `kubectl cluster-info`.
- A user with `kubeadmin` or `cluster-admin` privileges

### Run a Kind cluster locally

```sh
cd kepler-operator
make cluster-up CLUSTER_PROVIDER='kind' CI_DEPLOY=true GRAFANA_ENABLE=true

kubectl get pods -n monitoring

grafana-b88df6989-km7c6                1/1   Running   0   48m
prometheus-k8s-0                       2/2   Running   0   46m
prometheus-operator-6bd88c8bdf-9f69h   2/2   Running   0   48m
```
## Start kepler-operator

- You can deploy kepler-operator using the image on quay.io:

```sh
make deploy IMG=quay.io/sustainable_computing_io/kepler-operator:latest
kubectl config set-context --current --namespace=monitoring
kubectl apply -k config/samples/
```

- Verify that the `kepler-exporter` pod is deployed with `kubectl get pods -n monitoring`.
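To spot-check that the exporter is actually serving metrics, you can port-forward its service and fetch the metrics endpoint (a sketch; the service name and port are assumptions based on the defaults elsewhere in these docs and may differ in your deployment):

```shell
# Sketch: fetch raw Kepler metrics (service name/port are assumptions).
kubectl port-forward -n monitoring svc/kepler-exporter 9102:9102 &
sleep 3
curl -s http://localhost:9102/metrics | head
kill %1   # stop the background port-forward
```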
## Set up the Grafana dashboard

Setting `GRAFANA_ENABLE=true` deploys `kube-prometheus` in the `monitoring` namespace.
Use the following command to expose the Grafana UI on port 3000:

```sh
kubectl port-forward svc/grafana 3000:3000 -n monitoring
```

> Then open [http://localhost:3000](http://localhost:3000).
### Service Monitor

To have `kube-prometheus` scrape the `kepler-exporter` service endpoint, you need to configure a service monitor.

> Note: by default `kube-prometheus` does not scrape services outside the `monitoring` namespace. If Kepler is deployed outside the `monitoring` namespace, [see the steps below](#scrape-all-namespaces).

```sh
kubectl apply -n monitoring -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kepler-exporter
    sustainable-computing.io/app: kepler
  name: monitor-kepler-exporter
spec:
  endpoints:
  - interval: 3s
    port: http
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: http
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: kepler-exporter
EOF
```
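Once the ServiceMonitor is applied, you can check that Prometheus has picked up the target through its HTTP API (a sketch; it assumes the `prometheus-k8s` service from kube-prometheus is running in `monitoring`):

```shell
# Sketch: confirm the kepler-exporter target is active in Prometheus.
kubectl port-forward -n monitoring svc/prometheus-k8s 9090:9090 &
sleep 3
# Count how many active scrape targets mention kepler-exporter (expect >= 1).
curl -s 'http://localhost:9090/api/v1/targets?state=active' | grep -c kepler-exporter
kill %1   # stop the background port-forward
```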
### Grafana dashboard

Configure Grafana with the following steps:

- Log in at [localhost:3000](http://localhost:3000); the default username/password is `admin:admin`.
- Import the default [dashboard](https://raw.githubusercontent.com/sustainable-computing-io/kepler/main/grafana-dashboards/Kepler-Exporter.json).

![](../fig/ocp_installation/kind_grafana.png)

### Uninstall the operator

To uninstall, run:

```sh
make undeploy
```

[See here](https://github.com/sustainable-computing-io/kepler-operator#getting-started) for more on running the kepler operator on a Kind cluster.
## Troubleshooting

### Scrape all namespaces

By default, kube-prometheus does not monitor all namespaces; this is restricted by RBAC.
The clusterrole `prometheus-k8s` configured as below allows kube-prometheus to monitor all namespaces:

```sh
oc describe clusterrole prometheus-k8s
Name:         prometheus-k8s
Labels:       app.kubernetes.io/component=prometheus
              app.kubernetes.io/instance=k8s
              app.kubernetes.io/name=prometheus
              app.kubernetes.io/part-of=kube-prometheus
              app.kubernetes.io/version=2.45.0
Annotations:  <none>
PolicyRule:
  Resources                    Non-Resource URLs  Resource Names  Verbs
  ---------                    -----------------  --------------  -----
  endpoints                    []                 []              [get list watch]
  pods                         []                 []              [get list watch]
  services                     []                 []              [get list watch]
  ingresses.networking.k8s.io  []                 []              [get list watch]
                               [/metrics]         []              [get]
  nodes/metrics                []                 []              [get]
```

- To customize Prometheus when creating the [local cluster](#run-a-kind-cluster-locally), refer to the kube-prometheus documentation: [Customizing Kube-Prometheus](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizing.md).
- Make sure you apply [this jsonnet](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizations/monitoring-all-namespaces.md) so that Prometheus monitors all namespaces.
