Commit 65495ee
Reviving our benchmarks (#249)
Summary:
Pull Request resolved: #249
Our benchmarking tool had been broken for some time. Specifically, it used the same input tensor for every variant of a model it ran, which caused problems due to pytorch/pytorch#57985. This PR creates a new input tensor for each variant, so the script works again. To run it:
```bash
cd multipy
python -m pip install -e .
# run benchmarks with resnet (can be replaced with any model, as long as there is an x_jit file in the same folder)
./multipy/runtime/build/deploy_benchmark <num threads> <cuda> <jit> <path to model>
```
The output of `./multipy/runtime/build/deploy_benchmark 8 none jit multipy/runtime/example/generated/resnet` on my machine is
```bash
benchmark, strategy, n_threads, work_items_completed, work_items_per_second, p25_latency, p50_latency, p95_latency, device
...
multipy/runtime/example/generated/resnet, one_python, 1, 20, 3.878, 0.0703562, 0.154832, 0.846681, cpu
multipy/runtime/example/generated/resnet, multi_python, 1, 16, 2.85246, 0.0576035, 0.263045, 0.919259, cpu
multipy/runtime/example/generated/resnet, jit, 1, 7, 1.24033, 0.766784, 0.821864, 0.860599, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 7, 1.29667, 0.731622, 0.767437, 0.829395, cpu
multipy/runtime/example/generated/resnet, multi_python, 2, 110, 20.8865, 0.0777501, 0.0869799, 0.135762, cpu
multipy/runtime/example/generated/resnet, jit, 2, 114, 20.0387, 0.0729377, 0.0815373, 0.12296, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 7, 1.30409, 0.749057, 0.774374, 0.798906, cpu
multipy/runtime/example/generated/resnet, multi_python, 4, 182, 34.8009, 0.100641, 0.108486, 0.134744, cpu
multipy/runtime/example/generated/resnet, jit, 4, 162, 30.932, 0.106312, 0.121582, 0.168822, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 6, 1.11104, 0.851673, 0.896697, 0.972198, cpu
multipy/runtime/example/generated/resnet, multi_python, 8, 192, 36.8651, 0.18489, 0.204053, 0.280764, cpu
multipy/runtime/example/generated/resnet, jit, 8, 200, 38.2004, 0.18272, 0.199734, 0.249852, cpu
```
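To compare strategies, the comma-separated output above can be loaded into a script. The following is a minimal sketch, not part of this PR: it assumes the output has been captured as text, and the helper names `parse_benchmark_output` and `best_throughput` are hypothetical. The sample rows are copied from the output above.

```python
import csv
import io

# Two rows copied from the benchmark output above, plus the header line.
SAMPLE = """\
benchmark, strategy, n_threads, work_items_completed, work_items_per_second, p25_latency, p50_latency, p95_latency, device
...
multipy/runtime/example/generated/resnet, multi_python, 8, 192, 36.8651, 0.18489, 0.204053, 0.280764, cpu
multipy/runtime/example/generated/resnet, jit, 8, 200, 38.2004, 0.18272, 0.199734, 0.249852, cpu
"""

INT_FIELDS = ("n_threads", "work_items_completed")
FLOAT_FIELDS = ("work_items_per_second", "p25_latency", "p50_latency", "p95_latency")


def parse_benchmark_output(text):
    """Parse the benchmark's comma-separated output into a list of dicts (hypothetical helper)."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        # Skip incomplete lines such as the "..." elision marker.
        if any(value is None for value in row.values()):
            continue
        for name in INT_FIELDS:
            row[name] = int(row[name])
        for name in FLOAT_FIELDS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


def best_throughput(rows):
    """Return the highest work_items_per_second seen for each strategy."""
    best = {}
    for row in rows:
        strategy = row["strategy"]
        best[strategy] = max(best.get(strategy, 0.0), row["work_items_per_second"])
    return best


if __name__ == "__main__":
    for strategy, rate in sorted(best_throughput(parse_benchmark_output(SAMPLE)).items()):
        print(f"{strategy}: {rate} items/s")
```

Keeping the parsing separate from the comparison makes it straightforward to feed the same rows into other analyses, such as latency percentiles per thread count.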
From here I plan to:
1. Expand the benchmark suite by adding more models.
2. Add torchdynamo/torchinductor benchmarks.
3. Add mixed benchmarks (i.e. torch::deploy + torchdynamo/torchinductor).
4. Make this pluggable so users can quickly analyze performance changes with `torch::deploy`.
5. Add some form of support for a perf analyzer.
Some auxiliary next steps include:
1. Add unit tests for multithreading to make sure it doesn't break. We can just run the benchmarks, and that should suffice.
2. Create an example for multithreading, as that's a primary use case for `torch::deploy`.
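One way the "just run the benchmarks" unit test could look is sketched below. This is an assumption-laden illustration rather than the planned implementation: it assumes the benchmark writes its rows to stdout in the nine-column format shown earlier, and the helpers `run_benchmark` and `check_rows` are hypothetical names.

```python
import subprocess

EXPECTED_COLUMNS = 9  # benchmark, strategy, n_threads, ..., device


def run_benchmark(binary, n_threads, cuda, jit, model_path):
    """Run the benchmark binary and return its stdout (assumes results go to stdout)."""
    argv = [binary, str(n_threads), cuda, jit, model_path]
    # check=True raises CalledProcessError if the benchmark exits non-zero.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


def check_rows(output):
    """Validate benchmark output and return the number of data rows.

    Raises ValueError if a row is malformed, if any run completed no work,
    or if there are no data rows at all.
    """
    count = 0
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and non-CSV lines such as "...".
        if not line or line.startswith("benchmark,") or "," not in line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected column count: {line!r}")
        if int(fields[3]) <= 0:  # work_items_completed
            raise ValueError(f"no work completed: {line!r}")
        count += 1
    if count == 0:
        raise ValueError("benchmark produced no data rows")
    return count
```

A test could then call `check_rows(run_benchmark("./multipy/runtime/build/deploy_benchmark", 8, "none", "jit", "multipy/runtime/example/generated/resnet"))` and fail on any exception.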
Test Plan: Imported from OSS
Reviewed By: d4l3k
Differential Revision: D40965927
Pulled By: PaliC
fbshipit-source-id: 0962e49faffc888452f8969202628cacf775ba63
1 parent a1aebd1 commit 65495ee
2 files changed: +30, -378 lines
multipy/runtime/benchmark.cpp: 0 additions, 342 deletions. This file was deleted.