update install for zh #83

Merged
merged 1 commit into from
Aug 14, 2023
55 changes: 55 additions & 0 deletions docs/installation/community-operator.zh.md
@@ -0,0 +1,55 @@
# Kepler Community Operator on OpenShift

## Requirements

Before you begin, make sure you have:

- An OCP 4.13 cluster.
- A user with `kubeadmin` or `cluster-admin` privileges.
- The `oc` CLI.
- A clone of the `kepler-operator` [repository](https://github.com/sustainable-computing-io/kepler-operator).
```sh
git clone https://github.com/sustainable-computing-io/kepler-operator.git
cd kepler-operator
```
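
The requirements above can be checked from a terminal before installing. The following is a minimal sketch, assuming an active `oc` login; `check_ocp_prereqs` is a hypothetical helper name and is not part of the `kepler-operator` repository.

```sh
# Sketch: verify the CLI and privilege requirements listed above.
# check_ocp_prereqs is a hypothetical helper, not shipped with kepler-operator.
check_ocp_prereqs() {
  # The oc CLI must be available on PATH.
  if ! command -v oc >/dev/null 2>&1; then
    echo "oc not found" >&2
    return 1
  fi
  # A cluster-admin user may perform any verb on any resource in all namespaces.
  if [ "$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null)" != "yes" ]; then
    echo "current user lacks cluster-admin privileges" >&2
    return 1
  fi
  echo "prerequisites OK"
}

# Usage (needs a logged-in oc session):
#   check_ocp_prereqs
```

The function only reads cluster state, so it is safe to run repeatedly.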
---
## Install the operator from OperatorHub

1. Go to Operators > OperatorHub. Search for `Kepler`. Click `Install`.
![](../fig/ocp_installation/operator_installation_ocp_1.png)

2. Approve the installation.
![](../fig/ocp_installation/operator_installation_ocp_7.png)

3. Create the Kepler Custom Resource.
![](../fig/ocp_installation/operator_installation_ocp_2.png)
> Note: The current OCP console may display a JavaScript error (expected to be fixed in 4.13.5), but it does not affect the remaining steps. The fix is currently available in the OCP console of version 4.13.0-0.nightly-2023-07-08-165124.
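
The console steps above can be cross-checked from the command line. This is a rough sketch that assumes the installed ClusterServiceVersion has "kepler" in its name; `find_kepler_csv` is a hypothetical helper name.

```sh
# Sketch: list ClusterServiceVersions (CSVs) whose row mentions kepler.
# Assumption: the operator's CSV name contains "kepler".
find_kepler_csv() {
  oc get csv --all-namespaces --no-headers 2>/dev/null | grep -i kepler
}

# Usage:
#   find_kepler_csv   # a phase of "Succeeded" indicates a finished install
```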

---
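## Install the Grafana operator

### Deploy the Grafana Operator

The current API bearer token must be updated in the `GrafanaDataSource` manifest so that the `Grafana DataSource` can authenticate to Prometheus. The following commands update the manifest and deploy the Grafana Operator in the `kepler-operator-system` namespace.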

```sh
BEARER_TOKEN=$(oc whoami --show-token)
hack/dashboard/openshift/deploy-grafana.sh
```
> Note: The script must be run from the top-level directory, so make sure you are in the `kepler-operator` root directory. You can get there with `cd $(git rev-parse --show-toplevel)`.
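
The two points above (run from the repository root, use a valid token) can be combined in a small wrapper. This is only a sketch: `deploy_grafana` is a hypothetical name, and exporting `BEARER_TOKEN` is an assumption about how the script reads the token.

```sh
# Sketch: run the Grafana deployment script from the repository root.
# deploy_grafana is a hypothetical wrapper, not part of kepler-operator.
deploy_grafana() {
  # Move to the top level of the kepler-operator checkout.
  cd "$(git rev-parse --show-toplevel)" || return 1
  # Fail early when there is no active login to take a token from.
  BEARER_TOKEN=$(oc whoami --show-token) || return 1
  [ -n "$BEARER_TOKEN" ] || { echo "empty bearer token" >&2; return 1; }
  # Assumption: exporting makes the token visible to the child script.
  export BEARER_TOKEN
  hack/dashboard/openshift/deploy-grafana.sh
}

# Usage (from anywhere inside the checkout):
#   deploy_grafana
```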

### Access the Grafana Console
Configure Networking > Routes.
![](../fig/ocp_installation/operator_installation_ocp_5.png)
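
The route hostname can also be read from the CLI. A minimal sketch, assuming the route lives in the `kepler-operator-system` namespace used by the deployment above; `grafana_route_hosts` is a hypothetical helper name.

```sh
# Sketch: print the name and host of every route in the namespace.
# Assumption: the Grafana route is created in kepler-operator-system.
grafana_route_hosts() {
  oc get routes -n kepler-operator-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.host}{"\n"}{end}'
}

# Usage:
#   grafana_route_hosts
```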

### Grafana Dashboard
Log in to the Grafana Dashboard with the credentials `kepler:kepler`.
![](../fig/ocp_installation/operator_installation_ocp_6.png)

---

## Troubleshooting

> Note: If the data source has problems, check whether the API token has been updated correctly.

![](../fig/ocp_installation/operator_installation_ocp_3.png)
62 changes: 62 additions & 0 deletions docs/installation/kepler-helm.zh.md
@@ -0,0 +1,62 @@
# Deploy Kepler with the Helm Chart

The Kepler Helm Chart is available on [GitHub](https://github.com/sustainable-computing-io/kepler-helm-chart/tree/main) and [ArtifactHub](https://artifacthub.io/packages/helm/kepler/kepler).

## Install Helm
[Helm](https://helm.sh) must be installed before you can use the Helm Chart to install Kepler.
Refer to the Helm [documentation](https://helm.sh/docs/) to install it.


## Add the Kepler Helm repository

Run:

```bash
helm repo add kepler https://sustainable-computing-io.github.io/kepler-helm-chart
```

You can find the latest version with the following command:

```bash
helm search repo kepler
```
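
If the repository was added some time ago, the local index may be stale. The sketch below refreshes it and lists every published chart version instead of only the latest; `list_kepler_chart_versions` is a hypothetical helper name.

```bash
# Sketch: refresh the local chart index, then list all chart versions.
list_kepler_chart_versions() {
  helm repo update >/dev/null || return 1
  helm search repo kepler --versions
}

# Usage:
#   list_kepler_chart_versions
```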

You can run the following command to test the installation and inspect the generated manifests without applying them.

```bash
helm install kepler kepler/kepler --namespace kepler --create-namespace --dry-run --devel
```
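
To keep the rendered manifests for review, `helm template` can write them to a file. This is a sketch; the output file name and the helper name `render_kepler_manifests` are illustrative assumptions.

```bash
# Sketch: render the chart locally and summarize the resource kinds.
# The default file name is an arbitrary choice for illustration.
render_kepler_manifests() {
  local out_file="${1:-kepler-rendered.yaml}"
  helm template kepler kepler/kepler --namespace kepler > "$out_file" || return 1
  # Count how many resources of each kind the chart would create.
  grep -E '^kind:' "$out_file" | sort | uniq -c
}

# Usage:
#   render_kepler_manifests kepler-rendered.yaml
```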

## Install Kepler

Run:

```bash
helm install kepler kepler/kepler --namespace kepler --create-namespace
```

> You may need to override values in [values.yaml](https://github.com/sustainable-computing-io/kepler-helm-chart/blob/main/chart/kepler/values.yaml) to suit your environment.

Then apply your changes with the following command:

```bash
helm install kepler kepler/kepler --values values.yaml --namespace kepler --create-namespace
```

The following table lists the configuration parameters with their descriptions and default values.

Parameter|Description| Default
---|---|---
global.namespace| Kubernetes namespace for kepler |kepler
image.repository|Repository for Kepler Image| quay.io/sustainable\_computing\_io/kepler
image.pullPolicy|Pull policy for Kepler|Always
image.tag|Image tag for Kepler Image |latest
serviceAccount.name|Service account name for Kepler|kepler-sa
service.type|Kepler service type|ClusterIP
service.port|Kepler service exposed port|9102
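
As an illustration of how the parameters in the table map onto a values file, the sketch below writes a small override file. The chosen values (for example `IfNotPresent`) and the helper name `write_kepler_values` are assumptions for demonstration only.

```bash
# Sketch: write a values override file using parameters from the table.
# Dotted parameter names (image.pullPolicy) become nested YAML keys.
write_kepler_values() {
  cat > "${1:-my-values.yaml}" <<'EOF'
image:
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 9102
EOF
}

# Usage:
#   write_kepler_values my-values.yaml
#   helm upgrade --install kepler kepler/kepler --values my-values.yaml \
#     --namespace kepler --create-namespace
```

`helm upgrade --install` is used in the usage note because it works whether or not the release already exists.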

## Uninstall Kepler
You can uninstall Kepler with the following command:
```bash
helm uninstall kepler --namespace kepler
```
134 changes: 134 additions & 0 deletions docs/installation/kepler-operator.zh.md
@@ -0,0 +1,134 @@
# Install on Kind with the Kepler Operator

## Requirements

Before you begin, make sure you have:

- `kubectl`
- A clone of the `kepler-operator` [repository](https://github.com/sustainable-computing-io/kepler-operator)
- A target k8s cluster. You can use Kind to quickly create a [local cluster for testing](#run-a-kind-cluster-locally), or run against a remote k8s cluster. Note that your controller automatically uses the current context in your kubeconfig file. You can check it with `kubectl cluster-info`.
- A user with `kubeadmin` or `cluster-admin` privileges.

### Run a kind cluster locally

```sh
cd kepler-operator
make cluster-up CLUSTER_PROVIDER='kind' CI_DEPLOY=true GRAFANA_ENABLE=true

# List the monitoring stack pods once the cluster is up
kubectl get pods -n monitoring

# Example output:
# grafana-b88df6989-km7c6                1/1   Running   0   48m
# prometheus-k8s-0                       2/2   Running   0   46m
# prometheus-operator-6bd88c8bdf-9f69h   2/2   Running   0   48m
```
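
Rather than polling `kubectl get pods` by hand, you can block until the monitoring pods are ready. A minimal sketch; the default timeout and the helper name `wait_for_monitoring` are illustrative assumptions.

```sh
# Sketch: wait until every pod in the monitoring namespace is Ready.
# The 300s default timeout is an arbitrary choice.
wait_for_monitoring() {
  kubectl wait --for=condition=Ready pods --all -n monitoring \
    --timeout="${1:-300s}"
}

# Usage:
#   wait_for_monitoring 120s
```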

## Run kepler-operator
- You can deploy kepler-operator using the image on quay.io.

```sh
make deploy IMG=quay.io/sustainable_computing_io/kepler-operator:latest
kubectl config set-context --current --namespace=monitoring
kubectl apply -k config/samples/
```
- Verify the deployment of the `kepler-exporter` pod with `kubectl get pods -n monitoring`.
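
The verification step can be scripted by counting exporter pods in the `Running` state. This sketch assumes the pod names contain `kepler-exporter` and that the third column of the default output is the status; `count_running_exporters` is a hypothetical helper name.

```sh
# Sketch: count kepler-exporter pods whose STATUS column is Running.
# Assumption: default column order NAME READY STATUS RESTARTS AGE.
count_running_exporters() {
  kubectl get pods -n monitoring --no-headers \
    | awk '$1 ~ /kepler-exporter/ && $3 == "Running" {n++} END {print n+0}'
}

# Usage:
#   count_running_exporters   # 0 means no exporter pod is running yet
```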


## Set up the Grafana Dashboard

Setting `GRAFANA_ENABLE=true` deploys `kube-prometheus` in the `monitoring` namespace.
Use the following command to access the Grafana UI on port 3000.

```sh
kubectl port-forward svc/grafana 3000:3000 -n monitoring
```

> Then open [http://localhost:3000](http://localhost:3000) in your browser.
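
While the port-forward is running, Grafana's health endpoint offers a quick check from a second terminal. A small sketch; `grafana_is_up` is a hypothetical helper name and the port defaults to the 3000 used above.

```sh
# Sketch: succeed only when Grafana answers on its health endpoint.
grafana_is_up() {
  curl -fsS "http://localhost:${1:-3000}/api/health" >/dev/null
}

# Usage:
#   grafana_is_up && echo "Grafana reachable"
```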

### Service Monitor

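For `kube-prometheus` to scrape the `kepler-exporter` service endpoint, you need to configure a service monitor.

> Note: By default, `kube-prometheus` does not scrape services outside the `monitoring` namespace. If Kepler is deployed outside `monitoring`, [see the steps below](#scrape-all-namespaces).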

```sh
# The quoted 'EOF' keeps the shell from expanding $1 in the relabeling rule.
kubectl apply -n monitoring -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kepler-exporter
    sustainable-computing.io/app: kepler
  name: monitor-kepler-exporter
spec:
  endpoints:
  - interval: 3s
    port: http
    relabelings:
    # Copy the node name into the instance label
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: instance
    scheme: http
  jobLabel: app.kubernetes.io/name
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      app.kubernetes.io/component: exporter
      app.kubernetes.io/name: kepler-exporter
EOF
```
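
Once the service monitor exists, you can ask Prometheus whether it sees the exporter. The sketch below assumes Prometheus is forwarded to `localhost:9090` (for example with `kubectl port-forward svc/prometheus-k8s 9090:9090 -n monitoring`) and that the resulting job label is `kepler-exporter`; both are assumptions, as is the helper name `query_kepler_up`.

```sh
# Sketch: query the Prometheus HTTP API for the exporter's "up" series.
# Assumption: jobLabel resolves to job="kepler-exporter".
query_kepler_up() {
  curl -fsS -G "http://localhost:${1:-9090}/api/v1/query" \
    --data-urlencode 'query=up{job="kepler-exporter"}'
}

# Usage:
#   query_kepler_up   # a non-empty "result" array means targets were found
```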

### Grafana Dashboard

Configure Grafana with the following steps:

- Log in to [localhost:3000](http://localhost:3000). The default username/password is `admin:admin`.
- Import the default [dashboard](https://raw.githubusercontent.com/sustainable-computing-io/kepler/main/grafana-dashboards/Kepler-Exporter.json)

![](../fig/ocp_installation/kind_grafana.png)
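
To import the dashboard through the Grafana UI you first need the JSON file locally. A minimal sketch that downloads it from the URL linked above; `fetch_kepler_dashboard` and the default file name are illustrative assumptions.

```sh
# Sketch: download the default dashboard JSON for upload in Grafana.
fetch_kepler_dashboard() {
  curl -fsSL -o "${1:-Kepler-Exporter.json}" \
    "https://raw.githubusercontent.com/sustainable-computing-io/kepler/main/grafana-dashboards/Kepler-Exporter.json"
}

# Usage:
#   fetch_kepler_dashboard Kepler-Exporter.json
```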

### Uninstall the operator
Uninstall with the following command:
```sh
make undeploy
```

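[See here](https://github.com/sustainable-computing-io/kepler-operator#getting-started) for getting the kepler operator running on a kind cluster.

## Troubleshooting

### Scrape all namespaces

By default, kube-prometheus does not scrape all namespaces; this is controlled by RBAC.
The following `prometheus-k8s` clusterrole configuration allows kube-prometheus to scrape all namespaces.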

```sh
kubectl describe clusterrole prometheus-k8s
Name:         prometheus-k8s
Labels:       app.kubernetes.io/component=prometheus
              app.kubernetes.io/instance=k8s
              app.kubernetes.io/name=prometheus
              app.kubernetes.io/part-of=kube-prometheus
              app.kubernetes.io/version=2.45.0
Annotations:  <none>
PolicyRule:
  Resources                    Non-Resource URLs  Resource Names  Verbs
  ---------                    -----------------  --------------  -----
  endpoints                    []                 []              [get list watch]
  pods                         []                 []              [get list watch]
  services                     []                 []              [get list watch]
  ingresses.networking.k8s.io  []                 []              [get list watch]
                               [/metrics]         []              [get]
  nodes/metrics                []                 []              [get]
```

- To customize Prometheus when creating the [local cluster](#run-a-kind-cluster-locally), refer to the
kube-prometheus documentation [Customizing Kube-Prometheus](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizing.md).

- Make sure you apply [this jsonnet](https://github.com/prometheus-operator/kube-prometheus/blob/main/docs/customizations/monitoring-all-namespaces.md) so that Prometheus scrapes all namespaces.
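
To confirm that the RBAC change took effect, you can impersonate the Prometheus service account. This sketch assumes the kube-prometheus default service account `prometheus-k8s` in the `monitoring` namespace; `prometheus_can_scrape` is a hypothetical helper name.

```sh
# Sketch: check whether Prometheus may list pods in a given namespace.
# Assumption: Prometheus runs as serviceaccount monitoring/prometheus-k8s.
prometheus_can_scrape() {
  kubectl auth can-i list pods -n "$1" \
    --as=system:serviceaccount:monitoring:prometheus-k8s
}

# Usage:
#   prometheus_can_scrape kepler   # prints "yes" or "no"
```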