Deployed d2029a9 with MkDocs version: 1.6.1

marvin233 · marvin233 · commit b94a10d14b78 · 2024-12-16T20:19:20.000-08:00
diff --git a/pages/leaderboard/index.html b/pages/leaderboard/index.html
@@ -376,7 +376,7 @@ <h1 style="color: #4A90E2;">Leaderboard</h>
     <td>46.15</td>
     <td>36.36</td>
     <td><b>54.55</b></td>
-    <td>216.41</td>
+    <td>102.57</td>
     <td>AIOpsLab</td>
     <td>GPT 4</td>
     <td><a href="">🔗</a></td>
@@ -388,7 +388,7 @@ <h1 style="color: #4A90E2;">Leaderboard</h>
     <td>53.85</td>
     <td><b>45.45</b></td>
     <td>36.36</td>
-    <td>67.18</td>
+    <td>44.25</td>
     <td>AIOpsLab</td>
     <td>GPT 4</td>
     <td><a href="">🔗</a></td>
@@ -400,7 +400,7 @@ <h1 style="color: #4A90E2;">Leaderboard</h>
     <td><b>61.54<b></td>
     <td>40.9</td>
     <td>27.27</td>
-    <td>99.47</td>
+    <td>30.57</td>
     <td>AIOpsLab</td>
     <td>GPT 4</td>
     <td><a href="">🔗</a></td>
@@ -412,7 +412,7 @@ <h1 style="color: #4A90E2;">Leaderboard</h>
     <td>30.77</td>
     <td>9.09</td>
     <td>0</td>
-    <td>23.78</td>
+    <td>12.79</td>
     <td>AIOpsLab</td>
     <td>GPT 3.5</td>
     <td><a href="">🔗</a></td>
diff --git a/search/search_index.json b/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"AIOpsLab A Holistic Framework to Design, Benchmark, and Evaluate AI agents for Automating Service Operations at Scale M365 Research - AIOps Team \u00a0Leaderboard \u00a0Paper \u00a0Code News <p>\ud83c\udd95 [11/2024] Checkout our arxiv paper \"AIOpsLab: A Holistic Framework for Evaluating AI Agents for Enabling Autonomous Cloud\" \ud83d\udc40         [Link] </p> <p>\ud83c\udd95  [10/2024] Our vision paper \"Building AI Agents for Autonomous Clouds: Challenges and Design Principles\" was accepted by SoCC'24 \ud83d\udc40         [Link] </p> About <p>AIOpsLab is a holistic framework to enable the design, development, and evaluation of autonomous AIOps agents that, additionally, serves the purpose of building reproducible, standardized, interoperable and scalable benchmarks. AIOpsLab can deploy microservice cloud environments, inject faults, generate workloads, and export telemetry data, while orchestrating these components and providing interfaces for interacting with and evaluating agents. Moreover, AIOpsLab provides a built-in benchmark suite with a set of problems to evaluate AIOps agents in an interactive environment. This suite can be easily extended to meet user-specific needs. </p> <p>The Orchestrator coordinates interactions between various system components and serves as the Agent-Cloud-Interface (ACI). Agents engage with the Orchestrator to solve tasks, receiving a problem description, instructions, and relevant APIs. The Orchestrator generates diverse problems using the Workload and Fault Generators, injecting these into applications it can deploy. The deployed service has observability, providing telemetry such as metrics, traces, and logs. Agents act via the Orchestrator, which executes them and updates the service's state. 
The Orchestrator evaluates the final solution using predefined metrics for the task.</p> BibTeX <pre><code>\n    @inproceedings{shetty2024building,\n        title = {Building AI Agents for Autonomous Clouds: Challenges and Design Principles},\n        author = {Shetty, Manish and Chen, Yinfang and Somashekar, Gagan and Ma, Minghua and Simmhan, Yogesh and Zhang, Xuchao and Mace, Jonathan and Vandevoorde, Dax and Las-Casas, Pedro and Gupta, Shachee Mishra and Nath, Suman and Bansal, Chetan and Rajmohan, Saravan},\n        year = {2024},\n        booktitle = {Proceedings of 15th ACM Symposium on Cloud Computing (SoCC'24)},\n    }\n    @misc{chen2024aiopslab,\n        title = {AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds},\n        author = {Chen, Yinfang and Shetty, Manish and Somashekar, Gagan and Ma, Minghua and Simmhan, Yogesh and Mace, Jonathan and Bansal, Chetan and Wang, Rujia and Rajmohan, Saravan},\n        year = {2024},\n        booktitle = {Arxiv}\n    }\n    </code>\n    </pre>"},{"location":"pages/leaderboard/","title":"Leaderboard","text":"AIOpsLab A Holistic Framework to Design, Develop, and Evaluate AI agents for Automating Service Operations at Scale M365 Research - AIOps Team \u00a0Home \u00a0Paper \u00a0Code Leaderboard <p>   We showcase the key results on the leaderboard. If you'd like your results to appear, please email us at AIOpsLab@microsoft.com.       In the table, AVG represents the average accuracy across all tasks, while TASK1 to TASK4 correspond to the accuracy of Detection, Localization, Diagnosis, and Mitigation tasks, respectively. Time indicates the average runtime for the agents.    
Agent Name Avg \u21c5 Task1 \u21c5 Task2 \u21c5 Task3 \u21c5 Task4 \u21c5 Time \u21c5 Organization Model Family Link \ud83e\udd47FLASH 59.27 100 46.15 36.36 54.55 216.41 AIOpsLab GPT 4 \ud83d\udd17 \ud83e\udd48REACT 53.15 76.92 53.85 45.45 36.36 67.18 AIOpsLab GPT 4 \ud83d\udd17 \ud83e\udd49GPT-4 49.74 69.23 61.54 40.9 27.27 99.47 AIOpsLab GPT 4 \ud83d\udd17 GPT-3.5 15.73 23.07 30.77 9.09 0 23.78 AIOpsLab GPT 3.5 \ud83d\udd17"}]}
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":"AIOpsLab A Holistic Framework to Design, Benchmark, and Evaluate AI agents for Automating Service Operations at Scale M365 Research - AIOps Team \u00a0Leaderboard \u00a0Paper \u00a0Code News <p>\ud83c\udd95 [11/2024] Checkout our arxiv paper \"AIOpsLab: A Holistic Framework for Evaluating AI Agents for Enabling Autonomous Cloud\" \ud83d\udc40         [Link] </p> <p>\ud83c\udd95  [10/2024] Our vision paper \"Building AI Agents for Autonomous Clouds: Challenges and Design Principles\" was accepted by SoCC'24 \ud83d\udc40         [Link] </p> About <p>AIOpsLab is a holistic framework to enable the design, development, and evaluation of autonomous AIOps agents that, additionally, serves the purpose of building reproducible, standardized, interoperable and scalable benchmarks. AIOpsLab can deploy microservice cloud environments, inject faults, generate workloads, and export telemetry data, while orchestrating these components and providing interfaces for interacting with and evaluating agents. Moreover, AIOpsLab provides a built-in benchmark suite with a set of problems to evaluate AIOps agents in an interactive environment. This suite can be easily extended to meet user-specific needs. </p> <p>The Orchestrator coordinates interactions between various system components and serves as the Agent-Cloud-Interface (ACI). Agents engage with the Orchestrator to solve tasks, receiving a problem description, instructions, and relevant APIs. The Orchestrator generates diverse problems using the Workload and Fault Generators, injecting these into applications it can deploy. The deployed service has observability, providing telemetry such as metrics, traces, and logs. Agents act via the Orchestrator, which executes them and updates the service's state. 
The Orchestrator evaluates the final solution using predefined metrics for the task.</p> BibTeX <pre><code>\n    @inproceedings{shetty2024building,\n        title = {Building AI Agents for Autonomous Clouds: Challenges and Design Principles},\n        author = {Shetty, Manish and Chen, Yinfang and Somashekar, Gagan and Ma, Minghua and Simmhan, Yogesh and Zhang, Xuchao and Mace, Jonathan and Vandevoorde, Dax and Las-Casas, Pedro and Gupta, Shachee Mishra and Nath, Suman and Bansal, Chetan and Rajmohan, Saravan},\n        year = {2024},\n        booktitle = {Proceedings of 15th ACM Symposium on Cloud Computing (SoCC'24)},\n    }\n    @misc{chen2024aiopslab,\n        title = {AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds},\n        author = {Chen, Yinfang and Shetty, Manish and Somashekar, Gagan and Ma, Minghua and Simmhan, Yogesh and Mace, Jonathan and Bansal, Chetan and Wang, Rujia and Rajmohan, Saravan},\n        year = {2024},\n        booktitle = {Arxiv}\n    }\n    </code>\n    </pre>"},{"location":"pages/leaderboard/","title":"Leaderboard","text":"AIOpsLab A Holistic Framework to Design, Develop, and Evaluate AI agents for Automating Service Operations at Scale M365 Research - AIOps Team \u00a0Home \u00a0Paper \u00a0Code Leaderboard <p>   We showcase the key results on the leaderboard. If you'd like your results to appear, please email us at AIOpsLab@microsoft.com.       In the table, AVG represents the average accuracy across all tasks, while TASK1 to TASK4 correspond to the accuracy of Detection, Localization, Diagnosis, and Mitigation tasks, respectively. Time indicates the average runtime for the agents.    
Agent Name Avg \u21c5 Task1 \u21c5 Task2 \u21c5 Task3 \u21c5 Task4 \u21c5 Time \u21c5 Organization Model Family Link \ud83e\udd47FLASH 59.27 100 46.15 36.36 54.55 102.57 AIOpsLab GPT 4 \ud83d\udd17 \ud83e\udd48REACT 53.15 76.92 53.85 45.45 36.36 44.25 AIOpsLab GPT 4 \ud83d\udd17 \ud83e\udd49GPT-4 49.74 69.23 61.54 40.9 27.27 30.57 AIOpsLab GPT 4 \ud83d\udd17 GPT-3.5 15.73 23.07 30.77 9.09 0 12.79 AIOpsLab GPT 3.5 \ud83d\udd17"}]}
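
Review note: this commit changes only the Time column. The leaderboard text says AVG is the average accuracy across all tasks, so the unchanged Avg values can be cross-checked against Task1 to Task4. The sketch below does this with the numbers copied from the rows above. It assumes an unweighted mean rounded half-up to two decimals, which is a guess about how the page was produced, not something the source states. The names `ROWS` and `mean_accuracy` are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Task1..Task4 accuracies and the listed Avg, copied from the leaderboard rows in this diff.
ROWS = {
    "FLASH": (["100", "46.15", "36.36", "54.55"], "59.27"),
    "REACT": (["76.92", "53.85", "45.45", "36.36"], "53.15"),
    "GPT-4": (["69.23", "61.54", "40.9", "27.27"], "49.74"),
    "GPT-3.5": (["23.07", "30.77", "9.09", "0"], "15.73"),
}


def mean_accuracy(scores):
    """Unweighted mean of the task accuracies.

    Assumption: values are rounded half-up to two decimals. Decimal is used
    so that midpoints such as 53.145 are not distorted by binary floats.
    """
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for name, (scores, listed) in ROWS.items():
        computed = mean_accuracy(scores)
        status = "ok" if computed == Decimal(listed) else "MISMATCH"
        print(f"{name}: computed {computed}, listed {listed} ({status})")
```

Under these assumptions all four listed Avg values agree with the task columns, which is consistent with the Time edits leaving the accuracy figures untouched.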
