[ExecuTorch][WebGPU] Dynamic tensor-shape resize engine core #20574
JulianCloudNTH wants to merge 1 commit into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20574
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit aa08145 with merge base 55a71e6: NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@claude review and check for any areas or opportunities for modularization
Claude finished @JulianCloudNTH's task in 1m 58s

Code review: WebGPU dynamic resize engine core

Overall this is a clean, well-commented port of Vulkan's resize model. The "static graph stays byte-identical" invariant holds:

Modularization opportunities

1.
    inline uint64_t numel_of(const std::vector<int64_t>& dims) { ... }
But there are zero callers of
Either wire the new util into those sites or drop it; as-is it's dead code plus two surviving duplicates. Replacing the per-op locals with

2. Extract

3. A
    int dim = s.dim < 0 ? s.dim + static_cast<int>(d.size()) : s.dim;
    if (dim < 0 || dim >= static_cast<int>(d.size())) { throw ...; }
at

Correctness / robustness

4.
5. Per-call heap allocations on the hot path.
6.
7. No op registers a tensor resize hook yet.

CI note

The reported failure is
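To make review items 1 and 3 concrete, here is a minimal sketch of what two shared helpers could look like. The body of `numel_of` is elided in the review, so the product-of-dims body below is an assumption, and `normalize_dim` is a hypothetical name for the negative-dim wrap-and-check snippet quoted above; neither is taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Assumed body: product of all dimensions. An empty shape is treated as a
// scalar with one element.
inline uint64_t numel_of(const std::vector<int64_t>& dims) {
  uint64_t n = 1;
  for (int64_t d : dims) {
    if (d < 0) {
      throw std::invalid_argument("negative dimension");
    }
    n *= static_cast<uint64_t>(d);
  }
  return n;
}

// Hypothetical helper: maps a possibly negative dim index into [0, rank),
// throwing if it is still out of range after wrapping.
inline int normalize_dim(int dim, size_t rank) {
  const int r = static_cast<int>(rank);
  const int wrapped = dim < 0 ? dim + r : dim;
  if (wrapped < 0 || wrapped >= r) {
    throw std::out_of_range("dim out of range");
  }
  return wrapped;
}
```

With helpers like these in one header, each op would call `normalize_dim(s.dim, d.size())` instead of repeating the wrap-and-check pair, and a single `numel_of` would replace the per-op local copies.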
Stack from ghstack (oldest at bottom):
The WebGPU backend baked static tensor shapes at build time, so a dynamic .pte needed a separate graph for each shape (prefill vs. decode). This adds a tensor-shape resize engine mirroring Vulkan: tensors carry live cur_dims ≤ max, inputs resize per call, and a bounded-fixpoint propagates tensor-level resize hooks.

Key changes:
- WebGPUTensor: add cur_dims/cur_nbytes (live sizes ≤ max allocation), initialized to max at build
- WebGPUGraph: resize_input/set_cur_dims validate live dims fit max, propagate_resize runs tensor hooks for dirty shapes
- update_symints_from_inputs reads live cur_dims; adds sym_size.int dim source path
- copy_inputs uploads only live bytes; WebGPUBackend::execute shrinks inputs and resizes outputs to live shapes

Static graphs stay byte-identical: cur == max forever, no hooks fire, no reallocations.

Differential Revision: D109906091
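
The resize model described above can be sketched in a few dozen lines. This is an illustrative toy, not the PR's code: the `Tensor` and `Graph` structs, the `dirty` flag, the hook signature, and the `max_iters` cap are all assumptions chosen to show the idea of live dims bounded by the max allocation plus a bounded fixpoint over resize hooks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Illustrative stand-in for a tensor whose buffer is sized for max_dims.
struct Tensor {
  std::vector<int64_t> max_dims;  // shape the buffer was allocated for
  std::vector<int64_t> cur_dims;  // live shape, elementwise <= max_dims
  size_t elem_size;
  bool dirty = false;

  explicit Tensor(std::vector<int64_t> max, size_t elem = 4)
      : max_dims(max), cur_dims(std::move(max)), elem_size(elem) {}

  // Bytes covered by the live shape (never more than the allocation).
  size_t cur_nbytes() const {
    assert(cur_dims.size() == max_dims.size());
    size_t n = elem_size;
    for (int64_t d : cur_dims) n *= static_cast<size_t>(d);
    return n;
  }
};

struct Graph {
  std::vector<Tensor> tensors;
  // Each hook recomputes some output shape from its inputs (assumed design).
  std::vector<std::function<void(Graph&)>> hooks;

  // Validates that the live dims fit the max allocation, then marks dirty
  // only if the shape actually changed.
  void set_cur_dims(size_t id, const std::vector<int64_t>& dims) {
    Tensor& t = tensors.at(id);
    if (dims.size() != t.max_dims.size()) {
      throw std::invalid_argument("rank mismatch");
    }
    for (size_t i = 0; i < dims.size(); ++i) {
      if (dims[i] < 0 || dims[i] > t.max_dims[i]) {
        throw std::out_of_range("live dim exceeds max allocation");
      }
    }
    if (dims != t.cur_dims) {
      t.cur_dims = dims;
      t.dirty = true;
    }
  }

  // Runs hooks until no shape changes, giving up after max_iters passes.
  // Returns the number of hook passes that ran.
  int propagate_resize(int max_iters = 8) {
    int pass = 0;
    while (true) {
      bool any_dirty = false;
      for (Tensor& t : tensors) {
        any_dirty = any_dirty || t.dirty;
        t.dirty = false;
      }
      if (!any_dirty) return pass;
      if (pass == max_iters) {
        throw std::runtime_error("resize did not converge");
      }
      for (auto& hook : hooks) hook(*this);
      ++pass;
    }
  }
};
```

In this sketch a static graph never calls `set_cur_dims` with a new shape, so nothing is marked dirty and `propagate_resize` returns without running any hook, which mirrors the "no hooks fire, no reallocations" property claimed above.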