
Commit 07aaa43

ivberg and HectorSVC authored
Add some extra docs & checks after testing QNN Context Cache support with ORT team (#324)
Co-authored-by: Hector Li <[email protected]>
1 parent: e513036

2 files changed: +41 −15 lines


c_cxx/QNN_EP/mobilenetv2_classification/README.md (+11 −5)
```diff
@@ -3,8 +3,12 @@
 - The sample uses the QNN EP to:
   a. run the float32 model on the QNN CPU backend.
   b. run the QDQ model on the HTP backend with qnn_context_cache_enable=1, which generates an ONNX model with the QNN context binary embedded (sketched in C++ below).
+     - Model inputs & outputs will be float32.
   c. run the QNN context binary model generated by ONNX Runtime (previous step) on the HTP backend, to improve model initialization time and reduce memory overhead.
   d. run the QNN context binary model generated by the QNN tool chain on the HTP backend, to support models generated from the native QNN tool chain.
+     - Model inputs & outputs will be quantized INT8, fed from kitten_input_nhwc.raw, per the qnn-onnx-converter.exe options used.
+     - E.g. qnn-onnx-converter.exe --input_dtype uint8 --input_layout NHWC
+     - See the QNN doc: docs/QNN/general/tools.html#qnn-onnx-converter
 - The sample downloads the mobilenetv2 model from the ONNX model zoo and uses mobilenetv2_helper.py to quantize the float32 model to a QDQ model, which the HTP backend requires.
 - The sample is targeted to run on a Qualcomm ARM64 device.
 - There are 2 ways to improve session creation time by using a QNN context binary:
```
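To make step b concrete, here is a minimal, hypothetical C++ sketch of generating the context-embedded model; it is not code from this sample. It assumes the ORT 1.17+ session config key `ep.context_enable`, which is the newer spelling of the QNN provider option `qnn_context_cache_enable` mentioned above, and it assumes QnnHtp.dll is on the DLL search path.

```cpp
// Sketch only: generate the QNN context-embedded model from C++.
#include <onnxruntime_cxx_api.h>
#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_ctx_gen");
  Ort::SessionOptions so;

  // Ask ORT to dump an ONNX model with the compiled QNN context binary
  // embedded (ORT 1.17+ session config key; assumption: on older drops the
  // provider option {"qnn_context_cache_enable", "1"} is used instead).
  so.AddConfigEntry("ep.context_enable", "1");

  // Route the graph to the QNN HTP backend; QnnHtp.dll must be reachable
  // (copied from the QNN SDK as described in the prerequisites below).
  so.AppendExecutionProvider(
      "QNN", std::unordered_map<std::string, std::string>{
                 {"backend_path", "QnnHtp.dll"}});

  // Creating the session compiles the QDQ model for HTP and, as a side
  // effect, writes mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx.
  Ort::Session session(env, L"mobilenetv2-12_quant_shape.onnx", so);
  return 0;
}
```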
````diff
@@ -41,19 +45,21 @@
 - Windows 11
 - Visual Studio 2022
 - Python (needed to quantize the model)
-- OnnxRuntime ARM build with initial QNN support, such as ONNX Runtime (ORT) Microsoft.ML.OnnxRuntime.QNN 1.15+
+- Qualcomm AI Engine Direct SDK (QNN SDK) from https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct
+  - Last known working QNN version: 2.14.1.230828
+- OnnxRuntime ARM build with QNN support, such as ONNX Runtime (ORT) Microsoft.ML.OnnxRuntime.QNN 1.17+
   - Download from https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.QNN and unzip
   - The ORT drop DOES NOT INCLUDE QNN, so the QNN binaries must be copied from the Qualcomm SDK, e.g.:
-    robocopy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
-    copy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
-    copy C:\Qualcomm\AIStack\QNN\2.15.1.230926\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
+    robocopy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\aarch64-windows-msvc %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+    copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v68\unsigned\libQnnHtpV68Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
+    copy C:\Qualcomm\AIStack\QNN\2.14.1.230828\lib\hexagon-v73\unsigned\libQnnHtpV73Skel.so %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
 - (OR) compiled from onnxruntime source with QNN support - https://onnxruntime.ai/docs/build/eps.html#qnn

 ## How to run the application
 (Windows 11) Run ```run_qnn_ep_sample.bat``` with the path to the onnxruntime root directory (for includes) and the path to the bin directory:
 ```
 run_qnn_ep_sample.bat PATH_TO_ORT_ROOT_WITH_INCLUDE_FOLDER PATH_TO_ORT_BINARIES_WITH_QNN
-Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.16.0\runtimes\win-arm64\native
+Example (Drop): run_qnn_ep_sample.bat %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\build\native %USERPROFILE%\Downloads\microsoft.ml.onnxruntime.qnn.1.17.0\runtimes\win-arm64\native
 Example (Src): run_qnn_ep_sample.bat C:\src\onnxruntime C:\src\onnxruntime\build\Windows\RelWithDebInfo\RelWithDebInfo
 ```
````
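Step d's quantized model consumes the raw NHWC uint8 input. A hypothetical helper for wrapping kitten_input_nhwc.raw as an ORT tensor might look as follows; the {1, 224, 224, 3} shape is an assumption based on MobileNetV2's 224x224 RGB input, and the function name is illustrative, not from the sample.

```cpp
// Hypothetical input-prep helper (not part of the sample): wraps the raw
// bytes produced for --input_dtype uint8 --input_layout NHWC as a tensor.
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <fstream>
#include <vector>

Ort::Value LoadNhwcUint8Input(const char* path, std::vector<uint8_t>& buffer) {
  // Read the whole .raw file; 1*224*224*3 bytes are expected (assumption).
  std::ifstream file(path, std::ios::binary);
  buffer.assign(std::istreambuf_iterator<char>(file),
                std::istreambuf_iterator<char>());

  const int64_t shape[] = {1, 224, 224, 3};  // NHWC, per the converter flags
  auto mem_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  // The tensor only references `buffer`, so the caller must keep the
  // vector alive until Run() completes.
  return Ort::Value::CreateTensor<uint8_t>(mem_info, buffer.data(),
                                           buffer.size(), shape, 4);
}
```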

c_cxx/QNN_EP/mobilenetv2_classification/run_qnn_ep_sample.bat (+30 −10)
```diff
@@ -38,18 +38,31 @@ REM Download label file
 IF NOT EXIST %LABEL_FILE% (
 powershell -Command "Invoke-WebRequest %LABEL_FILE_URL% -Outfile %LABEL_FILE%" )

-REM Generate QDQ model, fixed shape float32 model, fixed shape QDQ model, kitten_input.raw
+REM Generate QDQ model, fixed shape float32 model, fixed shape QDQ model, kitten_input.raw, and kitten_input_nhwc.raw
 REM If there are issues installing python pkgs due to long paths see https://github.com/onnx/onnx/issues/5256
 IF NOT EXIST mobilenetv2-12_shape.onnx (
-@ECHO ON
-pip install opencv-python
-pip install pillow
-pip install onnx
-pip install onnxruntime
-python mobilenetv2_helper.py
-@ECHO OFF
+GOTO INSTALL_PYTHON_DEPS_AND_RUN_HELPER
+) ELSE IF NOT EXIST kitten_input.raw (
+GOTO INSTALL_PYTHON_DEPS_AND_RUN_HELPER
+) ELSE IF NOT EXIST kitten_input_nhwc.raw (
+GOTO INSTALL_PYTHON_DEPS_AND_RUN_HELPER
+) ELSE (
+GOTO END_PYTHON
 )

+:INSTALL_PYTHON_DEPS_AND_RUN_HELPER
+@ECHO ON
+pip install opencv-python
+pip install pillow
+pip install onnx
+pip install onnxruntime
+python mobilenetv2_helper.py
+@ECHO OFF
+GOTO END_PYTHON
+
+:END_PYTHON
+
+
 REM Download add_trans_cast.py file
 set QNN_CTX_ONNX_GEN_SCRIPT_URL="https://raw.githubusercontent.com/microsoft/onnxruntime/main/onnxruntime/python/tools/qnn/gen_qnn_ctx_onnx_model.py"
 set QNN_CTX_ONNX_GEN_SCRIPT="gen_qnn_ctx_onnx_model.py"
```
```diff
@@ -107,8 +120,15 @@ REM load mobilenetv2-12_quant_shape.onnx with QNN HTP backend, generate mobilene
 REM This does not have to be run on a real device with HTP; it can be done on an x64 platform also, since offline generation is supported
 qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx kitten_input.raw --gen_ctx

-REM run mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx with QNN HTP backend
-qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx kitten_input.raw
+REM Check for mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx before running it
+
+IF EXIST mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx (
+REM run mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx with QNN HTP backend (generated in the previous step)
+qnn_ep_sample.exe --htp mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx kitten_input.raw
+) ELSE (
+ECHO mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx does not exist. It was not generated in the previous step. Are you using ONNX Runtime 1.17+?
+)
+

 REM run mobilenetv2-12_net_qnn_ctx.onnx (generated from native QNN) with QNN HTP backend
 qnn_ep_sample.exe --qnn mobilenetv2-12_net_qnn_ctx.onnx kitten_input_nhwc.raw
```
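For completeness, here is a minimal, hypothetical C++ sketch of step c, running the context-embedded model; it is not the sample's source. No context-cache options are needed at load time, which is where the initialization-time savings come from.

```cpp
// Sketch only: run a QNN context-embedded model. Session creation skips
// QNN graph compilation because the compiled context binary is embedded.
#include <onnxruntime_cxx_api.h>
#include <string>
#include <unordered_map>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "qnn_ctx_run");
  Ort::SessionOptions so;
  so.AppendExecutionProvider(
      "QNN", std::unordered_map<std::string, std::string>{
                 {"backend_path", "QnnHtp.dll"}});

  // The HTP backend deserializes the embedded context binary directly.
  Ort::Session session(env, L"mobilenetv2-12_quant_shape.onnx_qnn_ctx.onnx",
                       so);
  return 0;
}
```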
