Support WebNN EP #15698
Conversation
This PR enables the WebNN EP in ONNX Runtime Web. It translates ONNX nodes to WebNN API calls; the translation is implemented in C++ and uses Emscripten's Embind API. The preferred layout for WebNN graph partitions is temporarily NHWC, due to a restriction in the WebNN XNNPack backend implementation and the ongoing discussion in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. There is no WebNN native EP; this is for Web only.
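For readers landing here, a minimal sketch of what enabling the EP from an ONNX Runtime Web app might look like. The 'webnn' EP name and the deviceType/powerPreference options come from this PR's diff; the model file, input name, and the chosen option values are hypothetical:

```ts
import * as ort from 'onnxruntime-web';

async function main() {
  // WebNN currently runs through the wasm proxy worker (see the discussion below).
  ort.env.wasm.proxy = true;

  const session = await ort.InferenceSession.create('./mobilenet-v2.onnx', {
    executionProviders: [{
      name: 'webnn',
      deviceType: 'cpu',          // XNNPack-backed CPU path on Chrome Canary
      powerPreference: 'default', // hypothetical value
    }],
  });

  // Hypothetical input name and shape for a mobilenet-v2 model.
  const input = new ort.Tensor('float32', new Float32Array(1 * 3 * 224 * 224), [1, 3, 224, 224]);
  const outputs = await session.run({ input });
  console.log(outputs);
}

main();
```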
@fdwr, @guschmue, PTAL, thanks! Are there any other reviewers I should invite? cc/ @huningxin @zesongw
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 2 pipeline(s).
Azure Pipelines successfully started running 5 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
@Honry, thanks for the PR! Going to look at it today.
CI complains about a minor lint issue.
@Honry, got it to build, and I can run mobilenet-v2 on Canary - awesome!
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline
Azure Pipelines successfully started running 2 pipeline(s).
Azure Pipelines successfully started running 5 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 7 pipeline(s).
/azp run Post Merge
No pipelines are associated with this pull request.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 7 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
```cmake
# Avoid unboundTypeError for WebNN EP since unbound type names are illegal with RTTI disabled
# in Embind API, relevant issue: https://github.com/emscripten-core/emscripten/issues/16911
if(NOT onnxruntime_USE_WEBNN)
  add_compile_options("$<$<COMPILE_LANGUAGE:CXX>:-fno-rtti>")
```
I wonder what the size implications for our release builds are.
```diff
@@ -64,6 +64,29 @@ const setExecutionProviders =
       case 'xnnpack':
         epName = 'XNNPACK';
         break;
+      case 'webnn':
+        epName = 'WEBNN';
```
I wonder if we should throw if proxy is not set; might save devs some time to debug.
size increase:
ort-wasm-simd.wasm / no webnn: 8870924
ort-wasm-simd.wasm / with webnn: 9535743
~650KB, should be manageable.
> I wonder if we should throw if proxy is not set; might save devs some time to debug.
Indeed, I will set proxy to true once the WebNN backend is used.
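A sketch of that plan (the helper name and option shapes are illustrative, not the final code):

```ts
// Force proxy mode whenever the WebNN EP is requested, so developers don't
// have to remember to enable it themselves. The alternative from the review
// suggestion is to throw here instead, surfacing the misconfiguration early.
interface WasmEnv { proxy?: boolean; }

function ensureProxyForWebNN(wasmEnv: WasmEnv, epNames: readonly string[]): void {
  if (epNames.includes('webnn') && wasmEnv.proxy !== true) {
    wasmEnv.proxy = true;
  }
}
```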
```diff
@@ -86,8 +86,8 @@ ORT_API_STATUS_IMPL(OrtApis::SessionOptionsAppendExecutionProvider,
 #endif
   } else if (strcmp(provider_name, "WEBNN") == 0) {
 #if defined(USE_WEBNN)
-    std::string deviceType = options->value.config_options.GetConfigOrDefault("deviceType", "2");
-    std::string powerPreference = options->value.config_options.GetConfigOrDefault("powerPreference", "0");
+    std::string deviceType = options->value.config_options.GetConfigOrDefault("deviceType", "cpu");
```
We have two different places in the code that set the default values of these configs, which may cause inconsistency.
Is there a way to put the default values in only one place?
The other place is Line 29 in webnn_provider_factory.cc, in case you can't find it.
Good point! These shouldn't be duplicated. I will remove one of them.
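For illustration, a TypeScript-flavored sketch of the single-source-of-truth idea (the table contents are partly assumed: this hunk only confirms deviceType's new default of 'cpu'; 'default' for powerPreference is a guess):

```ts
// Keep the WebNN EP option defaults in one shared table so that every call
// site reads the same values instead of repeating its own literals.
const WEBNN_DEFAULTS: Readonly<Record<string, string>> = {
  deviceType: 'cpu',
  powerPreference: 'default', // assumption: not visible in this hunk
};

function getConfigOrDefault(config: ReadonlyMap<string, string>, key: string): string {
  return config.get(key) ?? WEBNN_DEFAULTS[key] ?? '';
}
```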
```cpp
  return false;

const auto input_size = input_shape.size();
if (input_size > 4 || input_size == 0) {
```
🤔 I don't see anything in the spec about concat (https://www.w3.org/TR/webnn/#api-mlgraphbuilder-concat) being limited to 4D. Concat should support an arbitrary number of dimensions because the backend implementation can always flatten/reshape adjacent dimensions. DML supports up to 8D directly, and XNNPack can always just flatten dimensions. I know this because that's what we did for many DML ops, which were once limited to only 4D in the DML API, reshaping adjacent dimensions so that a shape like [2,3,4,5,6] with concat axis = 2 became [6,4,30] with concat axis = 1.
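For concreteness, a small sketch of that flattening trick (names are illustrative; it collapses everything before and everything after the concat axis into single dimensions):

```ts
// Collapse an arbitrary-rank concat into a 3D one: multiply out the
// dimensions before the concat axis and the dimensions after it.
function collapseForConcat(shape: readonly number[], axis: number):
    { shape: number[]; axis: number } {
  const before = shape.slice(0, axis).reduce((a, b) => a * b, 1);
  const after = shape.slice(axis + 1).reduce((a, b) => a * b, 1);
  return { shape: [before, shape[axis], after], axis: 1 };
}

// [2,3,4,5,6] with concat axis = 2 -> { shape: [6, 4, 30], axis: 1 }
console.log(collapseForConcat([2, 3, 4, 5, 6], 2));
```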
Good catch! Thanks @fdwr! I will remove this restriction.
```cpp
}

{  // Gemm
  CreateGemmOpBuilder("Gemm", op_registrations);
```
CreateGemmOpBuilder("MatMul", op_registrations);
too? I see MatMul checked here: https://github.com/microsoft/onnxruntime/pull/15698/files#diff-80ffe78c84984a47483ee44069d532504c82ccc174aaa5a36dfa8900c3530662R40
Oops, I forgot to remove the MatMul check; MatMul hasn't been implemented in Chromium yet.
Doh. Too bad because I have MatMul implemented in my fork :b. https://github.com/fdwr/chromium-src-webnn-dml/pull/1/files#diff-48bafbfc616dd59a5134cc609ccd6936f00b36e8ad814e950f8f235a9980bf21R2279
/azp run Linux CPU CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
/azp run Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux OpenVINO CI Pipeline, Linux QNN CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows ARM64 QNN CI Pipeline
Azure Pipelines successfully started running 8 pipeline(s).
/azp run
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.
/azp run Windows CPU CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
/azp run Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,
Azure Pipelines successfully started running 6 pipeline(s).
Although my comments remain, I'm conditionally approving this to unblock the demo and Guenther, so long as they are addressed in a subsequent PR (MatMul support and the 4D Concat restriction). Thanks much for this substantial work and contribution.
Awesome PR, @Honry!
@guschmue, @fdwr, @fs-eire, thank you so much for your reviews!
@fdwr, Concat was fixed in this commit, and MatMul won't be far behind. :)
No worries, you can always send a new PR with fixes and new ops.
**Description**: This PR intends to enable the WebNN EP in ONNX Runtime Web. It translates ONNX nodes to [WebNN API](https://webmachinelearning.github.io/webnn/) calls; the translation is implemented in C++ and uses Emscripten's [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). The preferred layout for WebNN graph partitions is temporarily **NHWC**, due to a restriction in the WebNN XNNPack backend implementation and the ongoing [discussion](webmachinelearning/webnn#324) in the WebNN spec about whether WebNN should support both 'NHWC' and 'NCHW' layouts. There is no WebNN native EP; this is for Web only.

**Motivation and Context**: Allow ONNX Runtime Web developers to access the WebNN API and benefit from hardware acceleration.

**WebNN API Implementation Status in Chromium**:
- Tracked in Chromium issue [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291).
- **CPU device**: based on the XNNPack backend; available since Chrome Canary M112 behind the "#enable-experimental-web-platform-features" flag on Windows and Linux. Implementation of more ops is ongoing.
- **GPU device**: based on DML; implementation is ongoing.

**Open**:
- GitHub CI: WebNN is currently only available on Chrome Canary/Dev with the XNNPack backend on Linux and Windows. This is an open question for reviewers: please help identify which GitHub CIs should involve the WebNN EP and guide me in enabling them. Thanks!