
Commit 1610ca6

chore: update links in the docs with new ab_test.py location
Update links in the docs with new ab_test.py location

Signed-off-by: Egor Lazarchuk <[email protected]>
1 parent afd72c3 commit 1610ca6

File tree

1 file changed: +14 -14 lines changed

tests/README.md

+14-14
@@ -150,13 +150,13 @@ post-merge. Specific tests, such as our
 [snapshot restore latency tests](integration_tests/performance/test_snapshot_ab.py)
 contain no assertions themselves, but rather they emit data series using the
 `aws_embedded_metrics` library. When executed by the
-[`tools/ab_test.py`](../tools/ab_test.py) orchestration script, these data
+[`tools/ab/ab_test.py`](../tools/ab/ab_test.py) orchestration script, these data
 series are collected. The orchestration script executes each test twice with
 different Firecracker binaries, and then matches up corresponding data series
 from the _A_ and _B_ run. For each data series, it performs a non-parametric
 test. For each data series where the difference between the _A_ and _B_ run is
 considered statistically significant, it will print out the associated metric.
-Please see `tools/ab_test.py --help` for information on how to configure what
+Please see `tools/ab/ab_test.py --help` for information on how to configure what
 the script considers significant.

 Writing your own A/B-Test is easy: Simply write a test that outputs a data
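
The passage above describes tests that emit data series via `aws_embedded_metrics`. For orientation, here is a minimal, self-contained sketch of what emitting such a series can look like. The function name, metric name, and dimension values are made up for illustration, and how the logger reaches an actual pytest test in this suite is omitted here:

```python
import time

from aws_embedded_metrics import metric_scope

# With AWS_EMF_ENVIRONMENT=local, aws_embedded_metrics dumps the emitted
# series to stdout, which is what an A/B orchestrator can then collect.


@metric_scope
def emit_latency_series(metrics):
    """Illustrative only: emit one data series an A/B orchestrator could analyze."""
    # The dimension set identifies the series (see the notes on dimensions below).
    metrics.set_dimensions({"performance_test": "example_latency"})
    for _ in range(10):
        start = time.perf_counter()
        sum(range(100_000))  # stand-in for the operation being measured
        elapsed_ms = (time.perf_counter() - start) * 1000
        # Each put_metric call appends one data point to the "latency" series.
        metrics.put_metric("latency", elapsed_ms, "Milliseconds")


if __name__ == "__main__":
    emit_latency_series()
```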
@@ -193,12 +193,12 @@ metric for which they wish to support A/B-testing**. This is because
 non-parametric tests operate on data series instead of individual data points.

 When emitting metrics with `aws_embedded_metrics`, each metric (data series) is
-associated with a set of dimensions. The `tools/ab_test.py` script uses these
+associated with a set of dimensions. The `tools/ab/ab_test.py` script uses these
 dimensions to match up data series between two test runs. It only matches up two
 data series with the same name if their dimensions match.

 Special care needs to be taken when pytest expands the argument passed to
-`tools/ab_test.py`'s `--test` option into multiple individual test cases. If two
+`tools/ab/ab_test.py`'s `--test` option into multiple individual test cases. If two
 test cases use the same dimensions for different data series, the script will
 fail and print out the names of the violating data series. For this reason,
 **A/B-Compatible tests should include a `performance_test` key in their
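
As a rough sketch of the matching rule described above (not the actual `ab_test.py` implementation), a data series can be thought of as keyed by its metric name plus its full dimension set, and series from the _A_ and _B_ run are only compared when those keys agree exactly:

```python
def series_key(name: str, dimensions: dict) -> tuple:
    """Identify a data series by metric name plus its full dimension set."""
    return (name, tuple(sorted(dimensions.items())))


# Hypothetical data collected from the A and B runs of the same test case.
a_run = {series_key("latency", {"performance_test": "test_boot"}): [1.0, 1.2, 1.1]}
b_run = {series_key("latency", {"performance_test": "test_boot"}): [1.3, 1.2, 1.4]}

for key, a_series in a_run.items():
    b_series = b_run.get(key)  # None unless name and dimensions match exactly
    if b_series is not None:
        print(key, "-> run the statistical test on", a_series, "vs", b_series)
```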
@@ -208,22 +208,22 @@ In addition to the above, care should be taken that the dimensions of the data
 series emitted by some test case are unique to that test case. For example, if
 we have a boottime test parameterized by number of vcpus, but the emitted
 boottime data series' dimension set is just
-`{"performance_test": "test_boottime"}`, then `tools/ab_test.py` will not be
+`{"performance_test": "test_boottime"}`, then `tools/ab/ab_test.py` will not be
 able to tell apart data series belonging to different microVM sizes, and instead
 combine them (which is probably not desired). For this reason **A/B-Compatible
 tests should always include all pytest parameters in their dimension set.**

-Lastly, performance A/B-Testing through `tools/ab_test.py` can only detect
+Lastly, performance A/B-Testing through `tools/ab/ab_test.py` can only detect
 performance differences that are present in the Firecracker binary. The
-`tools/ab_test.py` script only checks out the revisions it is passed to execute
+`tools/ab/ab_test.py` script only checks out the revisions it is passed to execute
 `cargo build` to generate a Firecracker binary. It does not run integration
 tests in the context of the checked out revision. In particular, both the _A_
 and the _B_ run will be triggered from within the same docker container, and
 using the same revision of the integration test code. This means it is not
 possible to use orchestrated A/B-Testing to assess the impact of, say, changing
 only Python code (such as enabling logging). Only Rust code can be A/B-Tested.
 The exception to this is toolchain differences. If both specified revisions
-have `rust-toolchain.toml` files, then `tools/ab_test.py` will compile using the
+have `rust-toolchain.toml` files, then `tools/ab/ab_test.py` will compile using the
 toolchain specified by the revision, instead of the toolchain installed in the
 docker container from which the script is executed.
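
To make the bolded guideline concrete, a hypothetical boottime test parameterized by vcpu count might build its dimension set as below. The `metrics` logger is assumed to be provided to the test by a fixture, and `boot_microvm` is a made-up helper; only the dimension handling is the point here:

```python
import pytest


@pytest.mark.parametrize("vcpu_count", [1, 2, 4])
def test_boottime_example(metrics, vcpu_count):
    # Include every pytest parameter in the dimension set; with only the
    # "performance_test" key, tools/ab/ab_test.py would merge the series of
    # all microVM sizes into a single one.
    metrics.set_dimensions(
        {"performance_test": "test_boottime_example", "vcpu_count": str(vcpu_count)}
    )
    for _ in range(5):
        # boot_microvm(): made-up helper returning a boot time in milliseconds.
        metrics.put_metric("boottime", boot_microvm(vcpu_count), "Milliseconds")
```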

@@ -256,25 +256,25 @@ This instructs `aws_embedded_metrics` to dump all data series that our A/B-Test
 orchestration would analyze to `stdout`, and pytest will capture this output
 into a file stored at `./test_results/test-report.json`.

-The `tools/ab_test.py` script can consume these test reports, so next collect
+The `tools/ab/ab_test.py` script can consume these test reports, so next collect
 your two test report files to your local machine and run

 ```sh
-tools/ab_test.py analyze <first test-report.json> <second test-report.json>
+tools/ab/ab_test.py analyze <first test-report.json> <second test-report.json>
 ```

 This will then print the same analysis described in the previous sections.

 #### Troubleshooting

-If during `tools/ab_test.py analyze` you get an error like
+If during `tools/ab/ab_test.py analyze` you get an error like

 ```bash
-$ tools/ab_test.py analyze <first test-report.json> <second test-report.json>
+$ tools/ab/ab_test.py analyze <first test-report.json> <second test-report.json>
 Traceback (most recent call last):
-  File "/firecracker/tools/ab_test.py", line 412, in <module>
+  File "/firecracker/tools/ab/ab_test.py", line 412, in <module>
     data_a = load_data_series(args.report_a)
-  File "/firecracker/tools/ab_test.py", line 122, in load_data_series
+  File "/firecracker/tools/ab/ab_test.py", line 122, in load_data_series
     for line in test["teardown"]["stdout"].splitlines():
 KeyError: 'stdout'
 ```
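
When debugging this, it can help to first check which test cases in a report are missing captured teardown output. Below is a small sketch, assuming the pytest JSON report layout that `load_data_series` reads (`tests[*].teardown.stdout`); the script name is hypothetical:

```python
# find_missing_stdout.py (hypothetical helper)
# Usage: python find_missing_stdout.py test-report.json
import json
import sys

with open(sys.argv[1], encoding="utf-8") as fp:
    report = json.load(fp)

for test in report.get("tests", []):
    if "stdout" not in test.get("teardown", {}):
        print("no captured teardown stdout for:", test.get("nodeid", "<unknown>"))
```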
