3 | 3 | - The sample uses the QNN EP to:
4 | 4 | - a. run the float32 model on the QNN CPU backend.
5 | 5 | - b. run the QDQ model on the HTP backend with qnn_context_cache_enable=1, which also generates an ONNX model with the QNN context binary embedded.
 | 6 | +    - Model inputs & outputs will be float32
6 | 7 | - c. run the QNN context binary model generated by ONNX Runtime (previous step) on the HTP backend, to improve model initialization time and reduce memory overhead.
7 | 8 | - d. run the QNN context binary model generated by the QNN tool chain on the HTP backend, to support models generated by the native QNN tool chain.
 | 9 | +    - Model inputs & outputs will be quantized INT8, with the kitten_input_nhwc.raw input prepared per the qnn-onnx-converter.exe options used
 | 10 | +      E.g. qnn-onnx-converter.exe --input_dtype uint8 --input_layout NHWC
 | 11 | +      See QNN doc - docs/QNN/general/tools.html#qnn-onnx-converter
8 | 12 | - The sample downloads the mobilenetv2 model from the ONNX model zoo and uses mobilenetv2_helper.py to quantize the float32 model to a QDQ model, which is required for the HTP backend.
9 | 13 | - The sample is targeted to run on a Qualcomm ARM64 device.
10 | 14 | - There are 2 ways to improve session creation time by using a QNN context binary:
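To make steps b and c above concrete, here is a minimal sketch of creating a session with the QNN EP through the C++ API, using the qnn_context_cache_enable option named in this README. This is a sketch only, assuming the Microsoft.ML.OnnxRuntime.QNN 1.17 C++ API; the backend library name (QnnHtp.dll) and the model file name are illustrative, not taken from the sample source.

```cpp
#include <onnxruntime_cxx_api.h>

#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_ep_sample");
  Ort::SessionOptions session_options;

  // QNN EP options. "backend_path" selects the backend library:
  // QnnCpu.dll for the float32 path (step a), QnnHtp.dll for steps b-d.
  std::unordered_map<std::string, std::string> qnn_options;
  qnn_options["backend_path"] = "QnnHtp.dll";
  // Step b: also dump an ONNX model with the compiled QNN context binary
  // embedded, so later sessions (step c) can skip QNN graph compilation.
  qnn_options["qnn_context_cache_enable"] = "1";

  session_options.AppendExecutionProvider("QNN", qnn_options);

  // Illustrative file name; the sample's quantized QDQ model may differ.
  Ort::Session session(env, L"mobilenetv2-12.quant.onnx", session_options);
  return 0;
}
```

Running the session once is enough to produce the context-embedded model; a later session created on that generated file (step c) initializes faster because the QNN graph is already compiled.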
41 | 45 | - Windows 11
42 | 46 | - Visual Studio 2022
43 | 47 | - Python (needed to quantize the model)
44 |  | -- OnnxRuntime ARM Build with initial QNN support such as ONNX Runtime (ORT) Microsoft.ML.OnnxRuntime.QNN 1.15+
 | 48 | +- Qualcomm AI Engine Direct SDK (QNN SDK) from https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct
 | 49 | +  - Last known working QNN version: 2.14.1.230828
 | 50 | +- OnnxRuntime ARM Build with QNN support such as ONNX Runtime (ORT) Microsoft.ML.OnnxRuntime.QNN 1.17+
45 | 51 | - Download from https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.QNN and unzip
46 | 52 | - The ORT drop DOES NOT INCLUDE QNN, so the QNN binaries must be copied from the QC SDK. E.g.
47 |  | - - robocopy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
48 |  | - - copy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
49 |  | - - copy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
 | 53 | + - robocopy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
 | 54 | + - copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
 | 55 | + - copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
50 | 56 | - (OR) Compiled from onnxruntime source with QNN support - https://onnxruntime.ai/docs/build/eps.html#qnn
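After copying the QNN binaries (or building from source), it can help to verify that the onnxruntime binary in use actually includes the QNN EP before running the sample. A small sketch against the public C++ API, not part of the sample itself:

```cpp
#include <onnxruntime_cxx_api.h>

#include <iostream>
#include <string>

int main() {
  // Lists the execution providers compiled into the loaded onnxruntime
  // binary; a QNN-enabled build should print "QNNExecutionProvider".
  for (const std::string& provider : Ort::GetAvailableProviders()) {
    std::cout << provider << "\n";
  }
  return 0;
}
```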
51 | 57 |
52 | 58 | ## How to run the application
53 | 59 | (Windows 11) Run ```run_qnn_ep_sample.bat``` with the path to the onnxruntime root directory (for includes) and the path to the directory containing the ORT binaries
54 | 60 | ```
55 | 61 | run_qnn_ep_sample.bat PATH_TO_ORT_ROOT_WITH_INCLUDE_FOLDER PATH_TO_ORT_BINARIES_WITH_QNN
56 |  | -Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
 | 62 | +Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
57 | 63 | Example (Src): run_qnn_ep_sample.bat C:\src\onnxruntime C:\src\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo
58 | 64 | ```
59 | 65 |
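For reference, this is roughly the inference call the sample makes once a session exists (any of steps a-d). A hedged sketch: the {1, 3, 224, 224} float32 input shape is an assumption matching MobileNetV2 from the ONNX model zoo, and input/output names are queried from the model rather than hard-coded.

```cpp
#include <onnxruntime_cxx_api.h>

#include <array>
#include <vector>

// Runs one inference, assuming a single float32 input of shape
// {1, 3, 224, 224} and a single float32 output (MobileNetV2-style I/O).
std::vector<float> RunOnce(Ort::Session& session,
                           const std::vector<float>& image_chw) {
  Ort::AllocatorWithDefaultOptions allocator;
  // Query the real input/output names instead of hard-coding them.
  auto input_name = session.GetInputNameAllocated(0, allocator);
  auto output_name = session.GetOutputNameAllocated(0, allocator);

  const std::array<int64_t, 4> shape{1, 3, 224, 224};
  Ort::MemoryInfo memory_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      memory_info, const_cast<float*>(image_chw.data()), image_chw.size(),
      shape.data(), shape.size());

  const char* input_names[] = {input_name.get()};
  const char* output_names[] = {output_name.get()};
  auto outputs = session.Run(Ort::RunOptions{nullptr}, input_names, &input, 1,
                             output_names, 1);

  // Copy the raw scores out; the caller can then pick the top-1 class.
  const float* scores = outputs[0].GetTensorData<float>();
  size_t count = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
  return std::vector<float>(scores, scores + count);
}
```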