Arm backend: Relax model test tolerances by zingo · Pull Request #20624 · pytorch/executorch

zingo · 2026-06-30T14:31:38Z

Increase the DL3 VGF quant atol to cover the observed max absolute error while keeping the existing rtol.

Relax the Conv3D A8W4 Frobenius threshold to account for borderline quantization noise in small-output cases where cosine similarity remains high.
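As an illustration of the Conv3D case, the following is a minimal sketch. It assumes the threshold is applied to the relative Frobenius error, i.e. the norm of the difference divided by the norm of the reference, and that it is checked alongside cosine similarity. The helper names and sample values are hypothetical and are not taken from the ExecuTorch test utilities.

```python
import math


def rel_frobenius_error(reference, test):
    """Return ||reference - test||_F / ||reference||_F for flat sequences."""
    diff_norm = math.sqrt(sum((r - t) ** 2 for r, t in zip(reference, test)))
    ref_norm = math.sqrt(sum(r * r for r in reference))
    return diff_norm / ref_norm


def cosine_similarity(reference, test):
    """Return the cosine of the angle between the two flat sequences."""
    dot = sum(r * t for r, t in zip(reference, test))
    ref_norm = math.sqrt(sum(r * r for r in reference))
    test_norm = math.sqrt(sum(t * t for t in test))
    return dot / (ref_norm * test_norm)


# Made-up small output: few elements with small magnitudes, so a little
# quantization noise is large relative to the reference norm.
reference = [0.2, -0.1, 0.3, 0.1]
test = [0.3, -0.15, 0.42, 0.13]

rel = rel_frobenius_error(reference, test)
cos = cosine_similarity(reference, test)

print(f"relative Frobenius error: {rel:.3f}")
print(f"cosine similarity:        {cos:.4f}")
print("passes threshold 0.4:", rel <= 0.4)
print("passes threshold 0.5:", rel <= 0.5)
```

With these sample values the relative error falls between the old and new thresholds while the output direction stays almost identical to the reference, which is the borderline situation described above.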

cc @digantdesai @freddan80 @per @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani

Signed-off-by: Zingo Andersen <Zingo.Andersen@arm.com>
Change-Id: I1e5039d3755f0461b13a60e2cc8a43032f9b9715

pytorch-bot · 2026-06-30T14:31:42Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20624

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 7f006e8 with merge base 0567b0a:

NEW FAILURE - The following job has failed:

pull / test-multimodal-linux (gemma3-4b) / linux-job (gh)
RuntimeError: Command docker exec -t a09e349e07e64a6281666f9a29a80eba239163529da4f750610a482c6647b701 /exec failed with exit code 139

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copilot

Pull request overview

This PR adjusts Arm backend test tolerances to reduce spurious failures from quantization noise in specific borderline cases (DL3 VGF quant model and Conv3D A8W4).

Changes:

Relax Conv3D INT A8W4 frobenius_threshold from 0.4 to 0.5.
Increase DL3 VGF quant output comparison atol from 0.1 to 0.15 (keeping rtol=0.1).
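The effect of raising atol while keeping rtol can be sketched as follows. This assumes the comparison follows the `torch.allclose` convention, where an element passes when `|actual - expected| <= atol + rtol * |expected|`. The helper `within_tolerance` and the sample values are hypothetical and only mirror that formula in plain Python.

```python
def within_tolerance(actual, expected, rtol, atol):
    """Elementwise check: |a - e| <= atol + rtol * |e| for every pair."""
    return all(
        abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected)
    )


# Made-up outputs. The first expected value is zero, so rtol contributes
# nothing there and only atol can absorb the 0.12 absolute error.
expected = [0.0, 0.02, -1.0]
actual = [0.12, 0.05, -1.05]

old_ok = within_tolerance(actual, expected, rtol=0.1, atol=0.1)
new_ok = within_tolerance(actual, expected, rtol=0.1, atol=0.15)

print("passes with atol=0.1: ", old_ok)
print("passes with atol=0.15:", new_ok)
```

Under this convention rtol scales with the magnitude of the expected value, so errors on near-zero outputs are bounded by atol alone. That is why covering an observed maximum absolute error requires a change to atol rather than rtol.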

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
backends/arm/test/ops/test_conv3d.py	Loosens Frobenius threshold for Conv3D INT A8W4 tolerance checks.
backends/arm/test/models/test_dl3_arm.py	Loosens absolute tolerance for DL3 VGF quant output comparisons.

        tosa_extensions=["int4"],
        qtol=1,
-        frobenius_threshold=0.4,
+        frobenius_threshold=0.5,
    )


    pipeline.change_args(
-        "run_method_and_compare_outputs", rtol=0.1, atol=0.1
+        "run_method_and_compare_outputs", rtol=0.1, atol=0.15
    )  # TODO: MLETORCH-1036 decrease tolerance


Copilot AI review requested due to automatic review settings June 30, 2026 14:31

zingo requested a review from digantdesai as a code owner June 30, 2026 14:31

zingo added the partner: arm and release notes: none labels Jun 30, 2026

meta-cla Bot added the CLA Signed label Jun 30, 2026

github-actions Bot added the ciflow/trunk and module: arm labels Jun 30, 2026

Copilot started reviewing on behalf of zingo June 30, 2026 14:32

Copilot AI reviewed Jun 30, 2026

rascani requested a review from SS-JIA June 30, 2026 15:18

oscarandersson8218 approved these changes Jul 1, 2026

zingo merged commit b9804fb into pytorch:main Jul 1, 2026
503 of 509 checks passed

zingo merged 1 commit into pytorch:main from zingo:Arm-backend-Relax-model-test-tolerances
