
[Relax][ONNX] Fix get_converter selecting wrong impl when opset < minimum supported version #18911

Open
Aharrypotter wants to merge 1 commit into apache:main from Aharrypotter:fix/onnx-get-converter-opset-18698

Conversation

@Aharrypotter

What does this PR do?

Fixes #18698

When a model's opset is lower than all available _impl_vN versions for an operator, the bisect-based index in get_converter becomes 0; subtracting 1 gives -1, and Python's negative indexing silently selects the latest (incompatible) implementation.

For example, ReduceMean with opset=9 was mapped to _impl_v18 (which expects axes as an input tensor) instead of raising an error, producing wrong output shapes with no indication of failure.
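The wrap-around can be reproduced in isolation. This is a minimal sketch of the index arithmetic described above; the version numbers are illustrative, not TVM's actual converter tables:

```python
# Minimal reproduction of the negative-indexing pitfall: when the model
# opset is older than every available implementation, the computed index
# is 0, and index - 1 == -1 wraps to the NEWEST (incompatible) version.
versions = [13, 18]   # hypothetical _impl_vN versions for an operator
opset = 9             # model opset, older than every implementation

merged = sorted(versions + [opset])                        # [9, 13, 18]
idx = max(i for i, v in enumerate(merged) if v == opset)   # 0
selected = merged[idx - 1]                                 # merged[-1] -> 18
assert selected == 18  # silently picks the incompatible newest impl
```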

Fix

Replace the bisect-based logic with an explicit filter of compatible versions (<= opset), pick the maximum, and raise NotImplementedError with a clear message when no compatible version exists.

# Before
versions = sorted(versions + [opset])
version = versions[max([i for i, v in enumerate(versions) if v == opset]) - 1]

# After
compatible = [v for v in versions if v <= opset]
if not compatible:
    raise NotImplementedError(
        f"{cls.__name__} is not supported for opset {opset}. "
        f"Minimum supported opset: {min(versions)}"
    )
version = max(compatible)
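Extracted into a standalone helper (the function name and version lists are illustrative; the real logic lives inside get_converter), the new selection behaves like this:

```python
# Hypothetical standalone version of the fixed selection logic;
# pick_impl_version is not a TVM API, just a sketch for illustration.
def pick_impl_version(versions, opset, op_name="ReduceMean"):
    """Return the newest implementation version that is <= the model opset."""
    compatible = [v for v in versions if v <= opset]
    if not compatible:
        raise NotImplementedError(
            f"{op_name} is not supported for opset {opset}. "
            f"Minimum supported opset: {min(versions)}"
        )
    return max(compatible)

print(pick_impl_version([1, 11, 13, 18], opset=13))  # 13: exact match
print(pick_impl_version([1, 11, 13, 18], opset=14))  # 13: newest compatible
try:
    pick_impl_version([13, 18], opset=9)             # no compatible version
except NotImplementedError as exc:
    print(exc)                                       # clear error, no wrap-around
```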

Tests

  • test_reduce_mean: verifies correct behavior for opset=13 (axes as attribute)
  • test_reduce_mean_unsupported_opset: regression test — opset=9 now raises NotImplementedError instead of silently producing wrong results

…imum supported version

When no _impl_vN version exists for the given opset, the bisect-based index
becomes 0; subtracting 1 gives -1, and Python negative indexing silently
selects the latest (incompatible) implementation. For example, ReduceMean
with opset=9 was mapped to _impl_v18 instead of raising an error, producing
wrong output shapes.

Fix by filtering to compatible versions first and raising NotImplementedError
when none exist.

Fixes apache#18698
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the ONNX frontend's get_converter function, which was failing to correctly identify the appropriate operator implementation for models with older opset versions. The fix replaces a flawed indexing mechanism with a robust filtering approach, ensuring that only compatible operator versions are considered and that an explicit error is raised when no such version is found. This prevents silent misbehavior and improves the reliability of ONNX model conversion.

Highlights

  • ONNX Converter Logic Fix: Resolved an issue in get_converter where an incorrect ONNX operator implementation was selected when the model's opset was lower than all available _impl_vN versions. The previous bisect-based logic could lead to silent selection of an incompatible, newer implementation.
  • Improved Error Handling: Implemented explicit error raising with NotImplementedError when no compatible ONNX operator version exists for a given opset, preventing silent failures and providing clearer diagnostics.
  • New Test Cases: Added new tests for ReduceMean to verify correct behavior for supported opsets and to confirm that unsupported opsets now correctly raise a NotImplementedError.


Changelog
  • python/tvm/relax/frontend/onnx/onnx_frontend.py
    • Updated the get_converter function to use explicit filtering for compatible ONNX opset versions.
    • Introduced NotImplementedError to be raised when no compatible opset version is found for an operator.
  • tests/python/relax/test_frontend_onnx.py
    • Added test_reduce_mean to verify correct ONNX ReduceMean conversion for opset 13.
    • Added test_reduce_mean_unsupported_opset to confirm NotImplementedError is raised for unsupported ONNX ReduceMean opset 9.


@gemini-code-assist (bot) left a comment


Code Review

This pull request correctly fixes an issue in get_converter where an incorrect operator implementation was selected for unsupported opsets, potentially leading to silent failures. The new logic is clearer and more robust. I've added two comments: one to handle a potential edge case in get_converter that could lead to an unhandled exception, and another to refactor duplicated code in the new tests for better maintainability.

Comment on lines +306 to +310
if not compatible:
    raise NotImplementedError(
        f"{cls.__name__} is not supported for opset {opset}. "
        f"Minimum supported opset: {min(versions)}"
    )

Severity: high

The logic to raise an error for unsupported opsets is a great improvement. However, there's an edge case that could lead to an unhandled exception. If an operator has no _impl_vN methods, the versions list will be empty. In this scenario, min(versions) will raise a ValueError. It's better to handle this case explicitly to provide a more informative error message and prevent the crash.

Suggested change
# Current
if not compatible:
    raise NotImplementedError(
        f"{cls.__name__} is not supported for opset {opset}. "
        f"Minimum supported opset: {min(versions)}"
    )

# Suggested
if not compatible:
    if not versions:
        raise NotImplementedError(
            f"{cls.__name__} is not supported for opset {opset}, as no implementations are available."
        )
    raise NotImplementedError(
        f"{cls.__name__} is not supported for opset {opset}. "
        f"Minimum supported opset: {min(versions)}"
    )

Comment on lines +3983 to +4012
def test_reduce_mean():
    # opset 13: axes passed as attribute
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    model = helper.make_model(graph, producer_name="reduce_mean_test")
    check_correctness(model, opset=13)


def test_reduce_mean_unsupported_opset():
    # Regression test for https://github.com/apache/tvm/issues/18698.
    # When opset < minimum available impl version, get_converter previously
    # wrapped to -1 and silently picked the newest impl instead of raising.
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    model = helper.make_model(graph, producer_name="reduce_mean_test")
    model.opset_import[0].version = 9
    with pytest.raises(NotImplementedError, match="not supported for opset 9"):
        from_onnx(model, opset=9, keep_params_in_input=True)



Severity: medium

The new tests test_reduce_mean and test_reduce_mean_unsupported_opset are great for ensuring correctness and preventing regressions. However, they share a significant amount of code for model creation. To improve maintainability and reduce duplication, you could extract the model creation logic into a helper function.

def _get_reduce_mean_model():
    """Helper to create a ReduceMean ONNX model for testing."""
    node = helper.make_node("ReduceMean", inputs=["x"], outputs=["y"], axes=[2], keepdims=1)
    graph = helper.make_graph(
        [node],
        "reduce_mean_test",
        inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
        outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 1, 18])],
    )
    return helper.make_model(graph, producer_name="reduce_mean_test")


def test_reduce_mean():
    # opset 13: axes passed as attribute
    model = _get_reduce_mean_model()
    check_correctness(model, opset=13)


def test_reduce_mean_unsupported_opset():
    # Regression test for https://github.com/apache/tvm/issues/18698.
    # When opset < minimum available impl version, get_converter previously
    # wrapped to -1 and silently picked the newest impl instead of raising.
    model = _get_reduce_mean_model()
    model.opset_import[0].version = 9
    with pytest.raises(NotImplementedError, match="not supported for opset 9"):
        from_onnx(model, opset=9, keep_params_in_input=True)



Development

Successfully merging this pull request may close these issues.

[Bug] The Onnx Frontend incorrectly mapping the Operator implementation version

1 participant