RAI Bench is a package that includes benchmarks and provides a framework for creating new ones.

## Framework Components
Framework components can be found in `src/rai_bench/rai_bench/benchmark_model.py`:

- `Task` - abstract class for creating a specific task. It introduces helper functions that make it easier to calculate metrics/scores. A custom task must implement a prompt for the agent to carry out, a way to calculate the result, and a validation that a given scene config suits the task.
- `Scenario` - class defined by a Scene and a Task. It can be created manually like:

```python

```

- `Benchmark` - class responsible for running and logging scenarios.
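
The relationship between the three components can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: `SceneConfig`, `ClearCarrotsTask`, the method names `get_prompt`, `validate_scene` and `calculate_result`, and the `agent` callable are assumptions, not the actual definitions in `benchmark_model.py`, which may differ.

```python
import logging
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, List

logger = logging.getLogger("bench_sketch")


@dataclass
class SceneConfig:
    """Hypothetical stand-in for a scene config: a list of object names."""
    objects: List[str]


class Task(ABC):
    """Hypothetical abstract task: prompt, scene validation, result."""

    @abstractmethod
    def get_prompt(self) -> str:
        """Instruction handed to the agent."""

    @abstractmethod
    def validate_scene(self, scene: SceneConfig) -> bool:
        """Return True if the scene config suits this task."""

    @abstractmethod
    def calculate_result(self, final_scene: SceneConfig) -> float:
        """Return a score between 0 and 1 for the final scene."""


class ClearCarrotsTask(Task):
    """Illustrative task: succeed when no carrot is left in the scene."""

    def get_prompt(self) -> str:
        return "Remove every carrot from the table."

    def validate_scene(self, scene: SceneConfig) -> bool:
        return "carrot" in scene.objects

    def calculate_result(self, final_scene: SceneConfig) -> float:
        return 0.0 if "carrot" in final_scene.objects else 1.0


@dataclass
class Scenario:
    """A scenario pairs one task with one scene and checks that they fit."""
    task: Task
    scene: SceneConfig

    def __post_init__(self) -> None:
        if not self.task.validate_scene(self.scene):
            raise ValueError("scene config does not suit the task")


# Assumed agent interface: takes the prompt and the scene, returns the final scene.
Agent = Callable[[str, SceneConfig], SceneConfig]


class Benchmark:
    """Runs scenarios, logs each score, and keeps the results."""

    def __init__(self) -> None:
        self.results: List[float] = []

    def run(self, scenario: Scenario, agent: Agent) -> float:
        prompt = scenario.task.get_prompt()
        final_scene = agent(prompt, scenario.scene)
        score = scenario.task.calculate_result(final_scene)
        logger.info("prompt=%r score=%.2f", prompt, score)
        self.results.append(score)
        return score


def remove_carrots_agent(prompt: str, scene: SceneConfig) -> SceneConfig:
    """Toy agent that simply drops every carrot from the scene."""
    return SceneConfig([name for name in scene.objects if name != "carrot"])


scenario = Scenario(task=ClearCarrotsTask(), scene=SceneConfig(["carrot", "cube"]))
benchmark = Benchmark()
score = benchmark.run(scenario, remove_carrots_agent)
```

In this sketch the scene is validated when the `Scenario` is constructed, so an unsuitable pairing fails before any run starts. That is a design choice of the sketch, not a statement about where the package performs the check.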
### O3DE Test Benchmark

The O3DE Test Benchmark (`src/rai_bench/rai_bench/o3de_test_bench/`) contains 2 tasks (`tasks/`), `GrabCarrotTask` and `PlaceCubesTask`, both of which implement score calculation, and 4 scene configs (`configs/`) for the O3DE robotic arm simulation.

Both tasks calculate the score from 4 values:
- `initially_misplaced_now_correct` - objects that were in an incorrect place at the start and are in a correct place at the end
- `initially_misplaced_still_incorrect` - objects that were in an incorrect place at the start and are still in an incorrect place at the end
- `initially_correct_still_correct` - objects that were in a correct place at the start and are still in a correct place at the end
- `initially_correct_now_incorrect` - objects that were in a correct place at the start and are in an incorrect place at the end

The result is a value between 0 and 1, calculated as `(initially_misplaced_now_correct + initially_correct_still_correct) / number_of_initial_objects`.
This score is calculated at the beginning and at the end of each scenario.
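
As a worked example of the formula above, the following sketch computes the score from the four counts. The function name `placement_score` is hypothetical, and the sketch assumes that every initial object falls into exactly one of the four categories, so that `number_of_initial_objects` equals their sum.

```python
def placement_score(
    *,
    initially_misplaced_now_correct: int,
    initially_misplaced_still_incorrect: int,
    initially_correct_still_correct: int,
    initially_correct_now_incorrect: int,
) -> float:
    """Share of objects that are in a correct place at the end (0 to 1)."""
    # Assumption: the four categories partition the initial objects.
    number_of_initial_objects = (
        initially_misplaced_now_correct
        + initially_misplaced_still_incorrect
        + initially_correct_still_correct
        + initially_correct_now_incorrect
    )
    if number_of_initial_objects == 0:
        raise ValueError("the scene contains no objects to score")
    correct_at_end = initially_misplaced_now_correct + initially_correct_still_correct
    return correct_at_end / number_of_initial_objects


# 6 objects in total, 2 + 3 = 5 of them end up in a correct place.
example_score = placement_score(
    initially_misplaced_now_correct=2,
    initially_misplaced_still_incorrect=1,
    initially_correct_still_correct=3,
    initially_correct_now_incorrect=0,
)
```

Keyword-only arguments are used here because the four names are easy to mix up when passed positionally.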
### Example usage

An example of how to load scenes, define scenarios, and run a benchmark can be found in `src/rai_bench/rai_bench/benchmark_main.py`.