[ExecuTorch][WebGPU] Dynamic resize hook for view_copy #20579
JulianCloudNTH wants to merge 2 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20579
Note: Links to docs will display an error until the docs builds have been completed.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@claude review and check for any areas or opportunities for modularization
Claude finished @JulianCloudNTH's task in 1m 47s. Review: Dynamic resize hook for
Stack from ghstack (oldest at bottom):
Make view_copy track the live sequence length under dynamic shapes.

Problem: view_copy lowers to a flat DMA buffer copy (add_buffer_copy) sized at the build-time max shape. With one dynamic graph serving any seq-len S (prefill S=K, decode S=1), the copy moved the full max-S byte count and the output kept its max dims, so a downstream consumer read a live shape that was too large.

Solution: register a tensor resize hook on the input so the copy follows the live input numel (a view preserves numel).
- copy_nbytes and the output dims are fixed at the serialized max.
- cur_dims(in), scales the single dynamic output dim to preserve numel, sets the output cur_dims, and rewrites the Copy dispatch's copy_nbytes.

Implementation:
- Kind::Copy); the hook only rewrites copy_nbytes via dispatch_at, no new kernel.
- cur_dims so the resize cascade reaches consumers.
- view_buffer contiguous fast path; numel-preserving like the other dynamic-shape op hooks.

Constraints: inert on a static graph (cur_dims == dims), so byte-identical to the prior behavior; fp32-only and numel-preserving invariants unchanged.

Co-authored-with: Claude Code.
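For reviewers, the arithmetic described above can be sketched as follows. This is a minimal illustration and not the code in this PR: SketchTensor, SketchCopyDispatch, on_input_resize, and the dynamic_dim parameter are hypothetical names, and it assumes the index of the single dynamic output dim is already known and that elements are fp32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a graph tensor (not the backend's real type).
struct SketchTensor {
  std::vector<int64_t> dims;      // serialized build-time max shape
  std::vector<int64_t> cur_dims;  // live shape for the current execution
};

// Hypothetical stand-in for the Copy dispatch record.
struct SketchCopyDispatch {
  uint64_t copy_nbytes;  // byte count moved by the flat buffer copy
};

int64_t numel(const std::vector<int64_t>& shape) {
  int64_t n = 1;
  for (int64_t d : shape) n *= d;
  return n;
}

// Sketch of a resize hook: scale one output dim so the output numel equals
// the live input numel, then shrink the copy to match. Returns false (and
// changes nothing) when the live numel cannot be expressed that way.
bool on_input_resize(const SketchTensor& in, SketchTensor& out,
                     SketchCopyDispatch& copy, std::size_t dynamic_dim) {
  if (dynamic_dim >= out.dims.size()) return false;
  const int64_t live_numel = numel(in.cur_dims);

  // Product of every output dim except the dynamic one.
  int64_t rest = 1;
  for (std::size_t i = 0; i < out.dims.size(); ++i) {
    if (i != dynamic_dim) rest *= out.dims[i];
  }
  if (rest <= 0 || live_numel < 0 || live_numel % rest != 0) return false;

  const int64_t scaled = live_numel / rest;
  if (scaled > out.dims[dynamic_dim]) return false;  // would exceed the max

  out.cur_dims = out.dims;
  out.cur_dims[dynamic_dim] = scaled;
  // fp32-only assumption: 4 bytes per element.
  copy.copy_nbytes = static_cast<uint64_t>(live_numel) * sizeof(float);
  return true;
}
```

With a max input shape of {1, 128, 64} viewed as {1, 128, 8, 8}, a live input of {1, 5, 64} yields output cur_dims of {1, 5, 8, 8} and a 1280-byte copy. When cur_dims == dims the sketch reproduces the max shape and max byte count, matching the static-graph constraint above.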
Differential Revision: D109906098