Commit 65495ee

Committed by facebook-github-bot

Reviving our benchmarks (#249)

Summary:
Pull Request resolved: #249

Our benchmarking tool had been broken for some time. Specifically, it used the same input tensor for every variant of a model it ran, which caused problems due to pytorch/pytorch#57985. This PR creates a new input tensor for each variant, so the script works again.

To run, simply

```bash
cd multipy
python -m pip install -e .
# run benchmarks with resnet (can be replaced with any model as long as there is an x_jit file in the same folder)
./multipy/runtime/build/deploy_benchmark <num threads> <cuda> <jit> <path to model>
```

The output of `./multipy/runtime/build/deploy_benchmark 8 none jit multipy/runtime/example/generated/resnet` on my machine is

```bash
benchmark, strategy, n_threads, work_items_completed, work_items_per_second, p25_latency, p50_latency, p95_latency, device
...
multipy/runtime/example/generated/resnet, one_python, 1, 20, 3.878, 0.0703562, 0.154832, 0.846681, cpu
multipy/runtime/example/generated/resnet, multi_python, 1, 16, 2.85246, 0.0576035, 0.263045, 0.919259, cpu
multipy/runtime/example/generated/resnet, jit, 1, 7, 1.24033, 0.766784, 0.821864, 0.860599, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 7, 1.29667, 0.731622, 0.767437, 0.829395, cpu
multipy/runtime/example/generated/resnet, multi_python, 2, 110, 20.8865, 0.0777501, 0.0869799, 0.135762, cpu
multipy/runtime/example/generated/resnet, jit, 2, 114, 20.0387, 0.0729377, 0.0815373, 0.12296, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 7, 1.30409, 0.749057, 0.774374, 0.798906, cpu
multipy/runtime/example/generated/resnet, multi_python, 4, 182, 34.8009, 0.100641, 0.108486, 0.134744, cpu
multipy/runtime/example/generated/resnet, jit, 4, 162, 30.932, 0.106312, 0.121582, 0.168822, cpu
multipy/runtime/example/generated/resnet, one_python, 1, 6, 1.11104, 0.851673, 0.896697, 0.972198, cpu
multipy/runtime/example/generated/resnet, multi_python, 8, 192, 36.8651, 0.18489, 0.204053, 0.280764, cpu
multipy/runtime/example/generated/resnet, jit, 8, 200, 38.2004, 0.18272, 0.199734, 0.249852, cpu
```

From here I plan to

1. Expand the benchmark suite by adding more models.
2. Add torchdynamo/torchinductor benchmarks.
3. Add mixed benchmarks (i.e. torch::deploy + torchdynamo/torchinductor).
4. Make this pluggable so users can quickly analyze performance changes with `torch::deploy`.
5. Add some form of support for a perf analyzer.

Some auxiliary next steps include

1. Add unit tests for multithreading to make sure it doesn't break. We can just run the benchmarks, and that should suffice.
2. Create an example for multithreading, as that's a primary use case for `torch::deploy`.

Test Plan: Imported from OSS

Reviewed By: d4l3k

Differential Revision: D40965927

Pulled By: PaliC

fbshipit-source-id: 0962e49faffc888452f8969202628cacf775ba63
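The benchmark prints comma-separated rows, so its output can be post-processed with the Python standard library. The following is a minimal sketch, not part of this commit: the helper names `parse_benchmark_output` and `best_throughput` are hypothetical, and the sample rows are copied from the output quoted in the summary. It assumes the column names shown in the header row and skips non-data lines such as `...`.

```python
import csv
import io

# Sample rows copied from the benchmark output quoted in the commit summary.
SAMPLE = """\
benchmark, strategy, n_threads, work_items_completed, work_items_per_second, p25_latency, p50_latency, p95_latency, device
...
multipy/runtime/example/generated/resnet, multi_python, 2, 110, 20.8865, 0.0777501, 0.0869799, 0.135762, cpu
multipy/runtime/example/generated/resnet, jit, 2, 114, 20.0387, 0.0729377, 0.0815373, 0.12296, cpu
multipy/runtime/example/generated/resnet, multi_python, 8, 192, 36.8651, 0.18489, 0.204053, 0.280764, cpu
multipy/runtime/example/generated/resnet, jit, 8, 200, 38.2004, 0.18272, 0.199734, 0.249852, cpu
"""


def parse_benchmark_output(text):
    """Parse benchmark CSV text into a list of dicts with typed fields.

    Hypothetical helper: assumes the header row shown in the summary.
    """
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        # Short lines (for example "...") leave trailing fields as None.
        if row.get("device") is None:
            continue
        rows.append({
            "benchmark": row["benchmark"],
            "strategy": row["strategy"],
            "n_threads": int(row["n_threads"]),
            "work_items_per_second": float(row["work_items_per_second"]),
            "p50_latency": float(row["p50_latency"]),
        })
    return rows


def best_throughput(rows):
    """Return, per strategy, the row with the highest work_items_per_second."""
    best = {}
    for row in rows:
        current = best.get(row["strategy"])
        if current is None or row["work_items_per_second"] > current["work_items_per_second"]:
            best[row["strategy"]] = row
    return best


if __name__ == "__main__":
    for strategy, row in best_throughput(parse_benchmark_output(SAMPLE)).items():
        print(strategy, row["n_threads"], row["work_items_per_second"])
```

To use it on a real run, the output of `deploy_benchmark` could be redirected to a file and its contents passed to `parse_benchmark_output` in place of `SAMPLE`.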

1 parent a1aebd1 commit 65495ee

2 files changed, +30 -378 lines changed

multipy/runtime
- benchmark.cpp
- example
  - benchmark.cpp

`multipy/runtime/benchmark.cpp`

This file was deleted.
