Here's a sequence of benchmark runs on the same code (bd165b0), using tasty-bench's `--fail-if-faster` and `--fail-if-slower` flags to highlight differing results:
```
$ cabal bench --benchmark-options "--stdev=1 --timeout=10 --csv=bench-0.csv"
<snip>
```
```
$ cabal bench --benchmark-options "--stdev=1 --timeout=10 --csv=bench-1.csv --baseline=bench-0.csv --fail-if-slower=5 --fail-if-faster=5 --hide-successes"
All
  Map
    insert
      String: FAIL (0.95s)
        1.63 ms ± 25 μs, 5% faster than baseline
        Use -p '/All.Map.insert.String/' to rerun this test only.
      ByteStringString: FAIL (0.86s)
        1.45 ms ± 23 μs, 5% faster than baseline
        Use -p '/All.Map.insert.ByteStringString/' to rerun this test only.
    fromList
      ByteString: FAIL (0.86s)
        1.46 ms ± 28 μs, 5% faster than baseline
        Use -p '/All.Map.fromList.ByteString/' to rerun this test only.
  hashmap/Map
    delete-miss
      ByteString: FAIL (0.78s)
        650 μs ± 11 μs, 10% faster than baseline
        Use -p '/hashmap\/Map.delete-miss.ByteString/' to rerun this test only.
  IntMap
    lookup-miss: FAIL (1.54s)
      338 μs ± 2.7 μs, 5% slower than baseline
      Use -p '/IntMap.lookup-miss/' to rerun this test only.
    delete-miss: FAIL (1.34s)
      582 μs ± 6.1 μs, 9% faster than baseline
      Use -p '/IntMap.delete-miss/' to rerun this test only.
  HashMap
    lookup-miss
      ByteString: FAIL (1.09s)
        112 μs ± 1.6 μs, 5% slower than baseline
        Use -p '/HashMap.lookup-miss.ByteString/' to rerun this test only.
      Int: FAIL (1.01s)
        102 μs ± 1.6 μs, 9% slower than baseline
        Use -p '/lookup-miss.Int/' to rerun this test only.
    insert
      ByteString: FAIL (1.21s)
        523 μs ± 6.3 μs, 19% faster than baseline
        Use -p '/HashMap.insert.ByteString/' to rerun this test only.
      Int: FAIL (1.11s)
        469 μs ± 5.6 μs, 13% faster than baseline
        Use -p '/insert.Int/' to rerun this test only.
    insert-dup
      Int: FAIL (0.96s)
        398 μs ± 5.7 μs, 14% faster than baseline
        Use -p '/insert-dup.Int/' to rerun this test only.
    delete
      String: FAIL (0.90s)
        754 μs ± 11 μs, 12% faster than baseline
        Use -p '/HashMap.delete.String/' to rerun this test only.
    delete-miss
      String: FAIL (0.97s)
        205 μs ± 3.0 μs, 5% faster than baseline
        Use -p '/HashMap.delete-miss.String/' to rerun this test only.
      ByteString: FAIL (0.77s)
        149 μs ± 2.6 μs, 7% slower than baseline
        Use -p '/HashMap.delete-miss.ByteString/' to rerun this test only.
      Int: FAIL (1.33s)
        289 μs ± 2.8 μs, 5% slower than baseline
        Use -p '/delete-miss.Int/' to rerun this test only.
    alterInsert
      ByteString: FAIL (1.31s)
        580 μs ± 7.2 μs, 18% faster than baseline
        Use -p '/alterInsert.ByteString/' to rerun this test only.
      Int: FAIL (1.19s)
        505 μs ± 5.9 μs, 21% faster than baseline
        Use -p '/alterInsert.Int/' to rerun this test only.
    alterFInsert
      String: FAIL (4.86s)
        570 μs ± 1.5 μs, 15% faster than baseline
        Use -p '/alterFInsert.String/' to rerun this test only.
      ByteString: FAIL (1.21s)
        518 μs ± 5.8 μs, 20% faster than baseline
        Use -p '/alterFInsert.ByteString/' to rerun this test only.
      Int: FAIL (1.10s)
        465 μs ± 7.9 μs, 22% faster than baseline
        Use -p '/alterFInsert.Int/' to rerun this test only.
    alterFInsert-dup
      Int: FAIL (0.94s)
        387 μs ± 5.8 μs, 15% faster than baseline
        Use -p '/alterFInsert-dup.Int/' to rerun this test only.
    alterFDelete-miss
      String: FAIL (0.96s)
        203 μs ± 2.9 μs, 5% faster than baseline
        Use -p '/alterFDelete-miss.String/' to rerun this test only.
      ByteString: FAIL (0.76s)
        148 μs ± 2.8 μs, 6% slower than baseline
        Use -p '/alterFDelete-miss.ByteString/' to rerun this test only.
    fromListWith
      long
        String: FAIL (0.92s)
          387 μs ± 7.0 μs, 6% faster than baseline
          Use -p '/fromListWith.long.String/' to rerun this test only.

24 out of 118 tests failed (184.35s)
```
```
$ cabal bench --benchmark-options "--stdev=1 --timeout=10 --csv=bench-2.csv --baseline=bench-1.csv --fail-if-slower=5 --fail-if-faster=5 --hide-successes"
All
  hashmap/Map
    delete
      ByteString: FAIL (0.82s)
        677 μs ± 11 μs, 8% faster than baseline
        Use -p '/hashmap\/Map.delete.ByteString/' to rerun this test only.
  IntMap
    delete: FAIL (1.16s)
      495 μs ± 5.5 μs, 9% faster than baseline
      Use -p '$0=="All.IntMap.delete"' to rerun this test only.
  HashMap
    delete
      ByteString: FAIL (1.38s)
        610 μs ± 5.5 μs, 15% faster than baseline
        Use -p '/HashMap.delete.ByteString/' to rerun this test only.
      Int: FAIL (1.06s)
        444 μs ± 5.7 μs, 12% faster than baseline
        Use -p '/delete.Int/' to rerun this test only.
    delete-miss
      Int: FAIL (2.30s)
        260 μs ± 1.4 μs, 10% faster than baseline
        Use -p '/delete-miss.Int/' to rerun this test only.
    alterInsert-dup
      Int: FAIL (1.02s)
        429 μs ± 5.3 μs, 13% faster than baseline
        Use -p '/alterInsert-dup.Int/' to rerun this test only.
    alterDelete
      String: FAIL (0.91s)
        760 μs ± 11 μs, 11% faster than baseline
        Use -p '/alterDelete.String/' to rerun this test only.
      ByteString: FAIL (0.79s)
        637 μs ± 12 μs, 13% faster than baseline
        Use -p '/alterDelete.ByteString/' to rerun this test only.
      Int: FAIL (1.08s)
        453 μs ± 6.0 μs, 11% faster than baseline
        Use -p '/alterDelete.Int/' to rerun this test only.
    alterFDelete
      String: FAIL (0.88s)
        745 μs ± 11 μs, 12% faster than baseline
        Use -p '/alterFDelete.String/' to rerun this test only.
      ByteString: FAIL (1.40s)
        609 μs ± 5.4 μs, 15% faster than baseline
        Use -p '/alterFDelete.ByteString/' to rerun this test only.
      Int: FAIL (1.04s)
        446 μs ± 6.1 μs, 10% faster than baseline
        Use -p '/alterFDelete.Int/' to rerun this test only.
    alterDelete-miss
      Int: FAIL (1.25s)
        269 μs ± 2.8 μs, 8% faster than baseline
        Use -p '/alterDelete-miss.Int/' to rerun this test only.
    alterFDelete-miss
      Int: FAIL (1.23s)
        260 μs ± 3.0 μs, 7% faster than baseline
        Use -p '/alterFDelete-miss.Int/' to rerun this test only.

14 out of 118 tests failed (137.69s)
```
Benchmarks for `containers` and `hashmap` were included by uncommenting this line:
I did try to make my machine pretty quiet for these runs. I don't know why these benchmarks are still so very noisy, but I note that most of these are on the slower end of our benchmark suite.
It also seems noteworthy that hardly any of the `containers` and `hashmap` benchmarks show up among the failures, apparently fewer than their smaller share of the suite would explain.
Maybe implementing #293 would help?!
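
To look at the run-to-run differences without relying on the pass/fail thresholds, the CSV files can also be compared directly. Below is a minimal Python sketch, assuming the CSV header is `Name,Mean (ps),2*Stdev (ps)` and that benchmark names are dot-separated paths such as `All.HashMap.insert.Int`. The helper names and the sample rows are made up for illustration; they are not part of tasty-bench and not taken from the runs above.

```python
import csv
import io
from collections import Counter


def load_means(f):
    """Map benchmark name -> mean time, read from a tasty-bench-style CSV.

    Assumes the columns "Name" and "Mean (ps)" exist.
    """
    return {row["Name"]: float(row["Mean (ps)"]) for row in csv.DictReader(f)}


def relative_changes(baseline, current):
    """Relative change (current vs. baseline) for benchmarks present in both runs."""
    return {
        name: (current[name] - baseline[name]) / baseline[name]
        for name in baseline.keys() & current.keys()
        if baseline[name] > 0
    }


def outliers(changes, threshold=0.05):
    """Benchmarks whose absolute relative change exceeds the threshold, largest first."""
    hits = [(name, change) for name, change in changes.items() if abs(change) > threshold]
    return sorted(hits, key=lambda pair: abs(pair[1]), reverse=True)


def outliers_per_group(hits):
    """Count outliers per second-level group, e.g. "HashMap" in "All.HashMap.insert.Int"."""
    return Counter(name.split(".")[1] for name, _ in hits if "." in name)


# Made-up sample data standing in for two runs of the same code.
SAMPLE_BASE = """Name,Mean (ps),2*Stdev (ps)
All.HashMap.insert.Int,500000000,6000000
All.HashMap.lookup.Int,100000000,1500000
All.IntMap.delete,400000000,5000000
"""

SAMPLE_NEW = """Name,Mean (ps),2*Stdev (ps)
All.HashMap.insert.Int,400000000,5000000
All.HashMap.lookup.Int,102000000,1500000
All.IntMap.delete,440000000,5000000
"""

if __name__ == "__main__":
    base = load_means(io.StringIO(SAMPLE_BASE))
    new = load_means(io.StringIO(SAMPLE_NEW))
    hits = outliers(relative_changes(base, new), threshold=0.05)
    for name, change in hits:
        print(f"{name}: {change:+.1%}")
    print(dict(outliers_per_group(hits)))
```

For the actual runs, the same functions could be fed `open("bench-0.csv")` and `open("bench-1.csv")` instead of the in-memory samples. Dividing the per-group counts by the number of benchmarks in each group would make the comparison between `HashMap` and the `containers`/`hashmap` groups independent of their share of the suite.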