[Relax][ONNX] Fix get_converter selecting wrong impl when opset < minimum supported version #18911
Conversation
When no `_impl_vN` version exists for the given opset, the bisect-based index becomes 0; subtracting 1 gives -1, and Python's negative indexing silently selects the latest (incompatible) implementation. For example, `ReduceMean` with opset=9 was mapped to `_impl_v18` instead of raising an error, producing wrong output shapes. Fix by filtering to compatible versions first and raising `NotImplementedError` when none exist. Fixes apache#18698
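The failure mode described above can be reproduced in isolation with a few lines of standard-library Python. This is a minimal sketch of the old lookup, not the exact TVM code; `versions` here stands in for the sorted list of opsets that have an `_impl_vN` method:

```python
# Minimal sketch of the old bisect-based lookup (illustrative, not TVM's code).
# `versions` stands in for the sorted opsets with an _impl_vN method,
# e.g. ReduceMean has _impl_v13 and _impl_v18.
from bisect import bisect_right

versions = [13, 18]
opset = 9  # model opset lower than every available implementation

idx = bisect_right(versions, opset) - 1  # bisect_right returns 0, so idx == -1
print(versions[idx])  # negative indexing wraps around and prints 18, not an error
```

Because Python treats index -1 as "last element", the incompatible newest implementation is returned silently instead of failing fast.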
Code Review
This pull request correctly fixes an issue in get_converter where an incorrect operator implementation was selected for unsupported opsets, potentially leading to silent failures. The new logic is clearer and more robust. I've added two comments: one to handle a potential edge case in get_converter that could lead to an unhandled exception, and another to refactor duplicated code in the new tests for better maintainability.
```python
if not compatible:
    raise NotImplementedError(
        f"{cls.__name__} is not supported for opset {opset}. "
        f"Minimum supported opset: {min(versions)}"
    )
```
The logic to raise an error for unsupported opsets is a great improvement. However, there's an edge case that could lead to an unhandled exception. If an operator has no `_impl_vN` methods, the `versions` list will be empty. In this scenario, `min(versions)` will raise a `ValueError`. It's better to handle this case explicitly to provide a more informative error message and prevent the crash.
Suggested change:

```diff
 if not compatible:
+    if not versions:
+        raise NotImplementedError(
+            f"{cls.__name__} is not supported for opset {opset}, as no implementations are available."
+        )
     raise NotImplementedError(
         f"{cls.__name__} is not supported for opset {opset}. "
         f"Minimum supported opset: {min(versions)}"
     )
```
```python
def test_reduce_mean():
    # opset 13: axes passed as attribute
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    model = helper.make_model(graph, producer_name="reduce_mean_test")
    check_correctness(model, opset=13)
```
```python
def test_reduce_mean_unsupported_opset():
    # Regression test for https://github.com/apache/tvm/issues/18698.
    # When opset < minimum available impl version, get_converter previously
    # wrapped to -1 and silently picked the newest impl instead of raising.
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    model = helper.make_model(graph, producer_name="reduce_mean_test")
    model.opset_import[0].version = 9
    with pytest.raises(NotImplementedError, match="not supported for opset 9"):
        from_onnx(model, opset=9, keep_params_in_input=True)
```
The new tests `test_reduce_mean` and `test_reduce_mean_unsupported_opset` are great for ensuring correctness and preventing regressions. However, they share a significant amount of code for model creation. To improve maintainability and reduce duplication, you could extract the model creation logic into a helper function.
```python
def _get_reduce_mean_model():
    """Helper to create a ReduceMean ONNX model for testing."""
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    return helper.make_model(graph, producer_name="reduce_mean_test")


def test_reduce_mean():
    # opset 13: axes passed as attribute
    model = _get_reduce_mean_model()
    check_correctness(model, opset=13)


def test_reduce_mean_unsupported_opset():
    # Regression test for https://github.com/apache/tvm/issues/18698.
    # When opset < minimum available impl version, get_converter previously
    # wrapped to -1 and silently picked the newest impl instead of raising.
    model = _get_reduce_mean_model()
    model.opset_import[0].version = 9
    with pytest.raises(NotImplementedError, match="not supported for opset 9"):
        from_onnx(model, opset=9, keep_params_in_input=True)
```
What does this PR do?
Fixes #18698
When a model's opset is lower than all available `_impl_vN` versions for an operator, the bisect-based index in `get_converter` becomes 0; subtracting 1 gives -1, and Python's negative indexing silently selects the latest (incompatible) implementation.

For example, `ReduceMean` with `opset=9` was mapped to `_impl_v18` (which expects `axes` as an input tensor) instead of raising an error, producing wrong output shapes with no indication of failure.

Fix

Replace the bisect-based logic with an explicit filter of compatible versions (`<= opset`), pick the maximum, and raise `NotImplementedError` with a clear message when no compatible version exists.

Tests

- `test_reduce_mean`: verifies correct behavior for opset=13 (`axes` as attribute)
- `test_reduce_mean_unsupported_opset`: regression test; opset=9 now raises `NotImplementedError` instead of silently producing wrong results
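The selection logic described under "Fix" can be sketched as a small standalone function. The names below are illustrative, not the exact `get_converter` code; it also folds in the reviewer's suggestion for the empty-`versions` edge case:

```python
# Hedged sketch of the fixed version selection (illustrative names, not
# TVM's actual get_converter): filter to compatible versions, pick the
# newest, and raise instead of wrapping to a negative index.
def select_impl_version(versions, opset, op_name="ReduceMean"):
    """Return the newest implementation version that is <= opset."""
    compatible = [v for v in versions if v <= opset]
    if not compatible:
        if not versions:
            raise NotImplementedError(
                f"{op_name} is not supported for opset {opset}, "
                "as no implementations are available."
            )
        raise NotImplementedError(
            f"{op_name} is not supported for opset {opset}. "
            f"Minimum supported opset: {min(versions)}"
        )
    return max(compatible)


print(select_impl_version([13, 18], 18))  # -> 18
print(select_impl_version([13, 18], 13))  # -> 13
try:
    select_impl_version([13, 18], 9)
except NotImplementedError as exc:
    print(exc)  # -> ReduceMean is not supported for opset 9. Minimum supported opset: 13
```

Picking `max(compatible)` reproduces the intended semantics of the old lookup (newest implementation not exceeding the model's opset) while making the unsupported case an explicit error rather than an out-of-range index.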