You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+37-2
Original file line number
Diff line number
Diff line change
@@ -9,16 +9,19 @@
9
9
DevOps-Eval is a comprehensive evaluation suite specifically designed for foundation models in the DevOps field. We hope DevOps-Eval could help developers, especially in the DevOps field, track the progress and analyze the important strengths/shortcomings of their models.
10
10
11
11
12
-
📚 This repo contains questions and exercises related to DevOps, including the AIOps.
12
+
📚 This repo contains questions and exercises related to DevOps, including the AIOps, ToolLearning;
13
13
14
-
💥️ There are currently **5977** multiple-choice questions spanning 8 diverse general categories, as shown [below](images/data_info.png).
14
+
💥️ There are currently **7486** multiple-choice questions spanning 8 diverse general categories, as shown [below](images/data_info.png).
15
15
16
16
🔥 There are a total of **2840** samples in the AIOps subcategory, covering scenarios such as **log parsing**, **time series anomaly detection**, **time series classification**, **time series forecasting**, and **root cause analysis**.
17
17
18
+
🔧 There are a total of **1509** samples in the ToolLearning subcategory, covering 239 tool scenes across 59 fields.
explanation: According to the analysis, the value 265in the given time series at 12 o'clock is significantly larger than the surrounding data, indicating a sudden increase phenomenon. Therefore, selecting option D is correct.
218
249
```
250
+
#### 🔧 ToolLearning Sample Example
251
+
252
+
👀 👀The data formatis compatible with OpenAI's Function Calling. Please refer to [category_mapping.json](resources/categroy_mapping.json) for details.
219
253
220
254
221
255
## 🚀 How to Evaluate
@@ -289,6 +323,7 @@ python src/run_eval.py \
289
323
## 🧭 TODO
290
324
-[x] add AIOps samples.
291
325
-[x] add AIOps scenario **time series forecasting**.
326
+
-[x] add **ToolLearning** samples.
292
327
-[ ] increase in sample size.
293
328
-[ ] add samples with the difficulty level set to hard.
0 commit comments