
[OpenVINO GPU] OpenVINO EP shouldn't override the "ACCURACY" precision to "FP32" #23895

Open
mingmingtasd opened this issue Mar 5, 2025 · 2 comments
Labels: ep:OpenVINO (issues related to OpenVINO execution provider)

Comments


mingmingtasd commented Mar 5, 2025

Describe the issue

There is a regression caused by https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/openvino/openvino_provider_factory.cc#L109, introduced by PR #23553. Users can no longer set the "ACCURACY" precision for the OpenVINO GPU device; it is silently overridden to "FP32", which is not the expected behavior. The "ACCURACY" precision is genuinely useful and is not equivalent to "FP32"; see https://docs.openvino.ai/2025/openvino-workflow/running-inference/optimize-inference/precision-control.html#execution-mode for details.

To reproduce

Set "precision" to "ACCURACY" as documented at https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#cc-api-20; the session will nevertheless behave as if "FP32" had been set:

#include <onnxruntime_cxx_api.h>

Ort::SessionOptions session_options;
std::unordered_map<std::string, std::string> options;
options["device_type"] = "GPU";
options["precision"] = "ACCURACY";  // expected: ACCURACY execution mode; actual: overridden to FP32
session_options.AppendExecutionProvider("OpenVINO", options);

Urgency

Very urgent; it impacts my performance testing.

Platform

Windows

OS Version

win11

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

30c6825

ONNX Runtime API

C++

Architecture

X64

Execution Provider

OpenVINO

Execution Provider Library Version

ov 2024.6

github-actions bot added the ep:OpenVINO label on Mar 5, 2025
@mingmingtasd (Contributor, Author) commented:

By the way, would it be possible to expose ov::hint::execution_mode directly and let it override the precision?
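For reference, this is roughly what selecting the ACCURACY execution mode looks like through the native OpenVINO C++ API (a minimal sketch, assuming OpenVINO 2024.x; the model path is a placeholder):

#include <openvino/openvino.hpp>

// Minimal sketch (not ORT EP code): request the ACCURACY execution mode
// directly when compiling the model for the GPU device.
ov::Core core;
auto model = core.read_model("model.onnx");  // placeholder model path
auto compiled_model = core.compile_model(
    model, "GPU",
    ov::hint::execution_mode(ov::hint::ExecutionMode::ACCURACY));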

@huningxin commented:

For the WebNN use case, we need the OpenVINO EP to select the execution precision on the GPU device by respecting the model's data types: if an operator in the model operates on FP32 tensors, execute it in FP32 precision; if an operator operates on FP16 tensors, execute it in FP16 precision.

Previously, we set the OpenVINO EP precision option to ACCURACY: https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html#summary-of-options

Internally, for the ACCURACY option, BasicBackend::PopulateConfigValue() sets OpenVINO's ov::hint::inference_precision to ov::element::undefined and sets the execution mode to ov::hint::ExecutionMode::ACCURACY, which works for us.

https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/openvino/backends/basic_backend.cc#L170
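For readers following along, the behavior described above corresponds roughly to a device config like the following (a simplified sketch, not the actual EP source):

ov::AnyMap device_config;
// ACCURACY: leave the inference precision undefined and let the
// execution-mode hint preserve the model's original data types.
device_config[ov::hint::inference_precision.name()] = ov::element::undefined;
device_config[ov::hint::execution_mode.name()] = ov::hint::ExecutionMode::ACCURACY;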

However, a recent commit, a6ea57b, changes this behavior and sets the precision to the highest precision available on the device, which for GPU is FP32:

https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/providers/openvino/openvino_provider_factory.cc#L109

This causes WebNN FP16 model execution to be slower than before.
