mkdocs.yml
site_name: WebArena-Verified
site_url: https://servicenow.github.io/webarena-verified/
site_description: A verified version of the WebArena Benchmark with version-controlled dataset management
site_author: Amine Elhattami
repo_url: https://github.com/ServiceNow/webarena-verified
repo_name: ServiceNow/webarena-verified
docs_dir: docs
site_dir: site
theme:
  name: material
  logo: assets/logo-light.svg
  favicon: assets/logo-dark.svg
  palette:
    # Light mode
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    # Dark mode
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-4
        name: Switch to light mode
  features:
    - navigation.tabs
    - navigation.sections
    - navigation.top
    - search.suggest
    - search.highlight
    - content.code.copy
    - content.code.annotate
plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          options:
            docstring_style: google
            show_root_full_path: false
            show_bases: false
            show_root_heading: false
            show_root_toc_entry: true
            allow_inspection: false
            heading_level: 3
            show_source: true
            members_order: source
            summary: true
  - redirects:
      redirect_maps:
        getting_started/index.md: index.md
        getting_started/environments.md: environments/index.md
markdown_extensions:
  - admonition
  - pymdownx.details
  - attr_list
  - md_in_html
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          format: !!python/name:pymdownx.superfences.fence_code_format
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.tabbed:
      alternate_style: true
  - tables
  - toc:
      permalink: true
      toc_depth: 2-3
extra:
  version:
    provider: mike
    alias: true
extra_css:
  - stylesheets/extra.css
  - stylesheets/default.css
nav:
  - Getting Started:
      - Quick Start: index.md
      - Usage:
          - Evaluator: getting_started/usage.md
          - Data Reader: getting_started/data_reader.md
          - Data Format: getting_started/data_format.md
          - Configuration: getting_started/configuration.md
          - Subset Manager: getting_started/subset_manager.md
          - Utilities: getting_started/utilities.md
      - Changelog: changelog/index.md
      - Subsets:
          - WebArena-Verified Hard: getting_started/hard_subset.md
  - Environments:
      - Overview: environments/index.md
      - Sites:
          - Shopping Admin: environments/shopping_admin.md
          - Shopping: environments/shopping.md
          - Reddit: environments/reddit.md
          - GitLab: environments/gitlab.md
          - Wikipedia: environments/wikipedia.md
          - Map: environments/map.md
      - Environment Control: environments/environment_control.md
  - Evaluation:
      - Quick Start: evaluation/index.md
      - Handling of Unachievable Tasks: evaluation/handling_of_unachievable_tasks.md
      - Removing LLM-Based Evaluation: evaluation/removing_llm_based_evaluation.md
      - Network Event Based Evaluation: evaluation/network_event_based_evaluation.md
      - Evaluation Results: evaluation/evaluation_results.md
  - API Reference:
      - Overview: api_reference/index.md
      - WebArenaVerified: api_reference/webarena_verified.md
      - Data Types:
          - Overview: api_reference/data_types/index.md
          - Configuration: api_reference/data_types/config.md
          - Task: api_reference/data_types/task.md
          - TaskEvalContext: api_reference/data_types/task_eval_context.md
          - WebArenaSite: api_reference/data_types/web_arena_site.md
          - Agent Response: api_reference/data_types/agent_response.md
          - AgentResponseEvaluatorCfg: api_reference/data_types/agent_response_evaluator_cfg.md
      - Evaluators:
          - Agent Response: api_reference/evaluators/agent_response_evaluator.md
          - Network Event: api_reference/evaluators/network_event_evaluator.md