
[ExecuTorch][WebGPU] 2D-fold mul + permute dispatch (lift 65535 1D cap)#20651

Open: JulianCloudNTH wants to merge 1 commit into gh/JulianCloudNTH/83/base from gh/JulianCloudNTH/83/head

@JulianCloudNTH JulianCloudNTH commented Jun 30, 2026

Contributor

Stack from ghstack (oldest at bottom):

Lift the 65535 workgroup-per-dim cap for mul and permute so they run at any numel.

mul.Tensor and permute still used compute_1d_workgroup_count, which throws once numel / wg_size > 65535 — hit by a realistic Llama-3.2-1B LoRA layer (mul over [2048, 8192] = 262k workgroups; permute of [2048, 2048] = 65536). add/sub/div/fill/sdpa already use the 2D fold; this brings mul + permute in line.
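The workgroup counts quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a workgroup size of 64 (the value implied by the quoted numbers, not stated in the PR); `workgroups_1d` and `exceeds_1d_cap` are hypothetical helper names, not ExecuTorch functions.

```cpp
#include <cassert>
#include <cstdint>

// Per-dimension dispatch cap quoted in the PR description.
constexpr std::uint64_t kMaxWorkgroupsPerDim = 65535;

// Assumed workgroup size: 64 is inferred from the PR's numbers, not confirmed.
constexpr std::uint64_t kAssumedWgSize = 64;

// Number of workgroups a 1D dispatch needs to cover `numel` elements.
constexpr std::uint64_t workgroups_1d(std::uint64_t numel, std::uint64_t wg_size) {
  return (numel + wg_size - 1) / wg_size;  // ceiling division
}

// True when a 1D dispatch would exceed the per-dimension cap.
constexpr bool exceeds_1d_cap(std::uint64_t numel, std::uint64_t wg_size) {
  return workgroups_1d(numel, wg_size) > kMaxWorkgroupsPerDim;
}

// mul over [2048, 8192]: 16,777,216 elements need 262,144 workgroups.
static_assert(workgroups_1d(2048ull * 8192ull, kAssumedWgSize) == 262144);
// permute of [2048, 2048]: 4,194,304 elements need 65,536 workgroups,
// exactly one more than the cap allows.
static_assert(workgroups_1d(2048ull * 2048ull, kAssumedWgSize) == 65536);
static_assert(exceeds_1d_cap(2048ull * 2048ull, kAssumedWgSize));
```

Under that assumption the permute case misses the limit by a single workgroup, which is why a square 2048 x 2048 tensor is already enough to trigger the failure.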

Key changes:

  • mul/BinaryOp.cpp, permute/Permute.cpp: compute_1d_workgroup_count → compute_2d_workgroup_count (returns utils::WgCount); dispatch + resize hook now set both workgroup_count_x and workgroup_count_y.
  • binary_mul.wgsl, permute.wgsl: main takes @builtin(num_workgroups); flat index is gid.x + gid.y * (num_workgroups.x * wg_size) (regenerated *_wgsl.h).
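To illustrate how a 2D fold and the flat index fit together, here is a host-side sketch. It is one possible folding strategy under stated assumptions, not the actual compute_2d_workgroup_count: `WgCountSketch`, `fold_2d_sketch`, `flat_index`, and `covered_elements` are hypothetical names, and the workgroup size of 64 is inferred from the numbers in the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for utils::WgCount; the real type may differ.
struct WgCountSketch {
  std::uint32_t x;
  std::uint32_t y;
};

constexpr std::uint64_t kDimCap = 65535;  // per-dimension cap from the PR text

// One possible 2D fold (an assumption, not the ExecuTorch implementation):
// keep a 1D dispatch when it fits, otherwise spread workgroups over rows.
// Does not handle totals beyond kDimCap * kDimCap workgroups.
constexpr WgCountSketch fold_2d_sketch(std::uint64_t numel, std::uint64_t wg_size) {
  const std::uint64_t total = (numel + wg_size - 1) / wg_size;
  if (total <= kDimCap) {
    return {static_cast<std::uint32_t>(total), 1u};
  }
  const std::uint64_t y = (total + kDimCap - 1) / kDimCap;  // rows needed
  const std::uint64_t x = (total + y - 1) / y;              // workgroups per row
  return {static_cast<std::uint32_t>(x), static_cast<std::uint32_t>(y)};
}

// Flat element index as given in the PR:
// gid.x + gid.y * (num_workgroups.x * wg_size).
constexpr std::uint64_t flat_index(std::uint64_t gid_x, std::uint64_t gid_y,
                                   std::uint64_t num_workgroups_x,
                                   std::uint64_t wg_size) {
  return gid_x + gid_y * (num_workgroups_x * wg_size);
}

// Number of distinct flat indices the dispatch can produce; it must be at
// least numel so that every element is visited.
constexpr std::uint64_t covered_elements(WgCountSketch wg, std::uint64_t wg_size) {
  return static_cast<std::uint64_t>(wg.x) * wg_size * wg.y;
}

// permute of [2048, 2048]: 65,536 workgroups fold into 32,768 x 2.
static_assert(fold_2d_sketch(2048ull * 2048ull, 64).x == 32768);
static_assert(fold_2d_sketch(2048ull * 2048ull, 64).y == 2);
// The first invocation of row 1 continues right after the last one of row 0.
static_assert(flat_index(0, 1, 32768, 64) == flat_index(32768 * 64 - 1, 0, 32768, 64) + 1);
```

Because the fold rounds up, the dispatch can cover slightly more indices than numel (for example 52,429 x 5 workgroups for the [2048, 8192] mul), so a shader using this scheme would presumably need a bounds check on the flat index before reading or writing.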

Mirrors the landed add op fold (runtime/ops/add/{BinaryOp.cpp,binary_add.wgsl}).

Co-authored-with: Claude Code.

Differential Revision: D110149677

[ghstack-poisoned]
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 30, 2026
pytorch-bot Bot commented Jun 30, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20651

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit d29e08a with merge base 73c259e:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.
