perf(ecmascript): shrink Call/New args from Vec to Box<[T]>#93111
Closed
mmastrac wants to merge 1 commit intommastrac/jsvalue-perf-experimentfrom
Closed
perf(ecmascript): shrink Call/New args from Vec to Box<[T]>#93111mmastrac wants to merge 1 commit intommastrac/jsvalue-perf-experimentfrom
mmastrac wants to merge 1 commit intommastrac/jsvalue-perf-experimentfrom
Conversation
Contributor
Author
|
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking. |
2 tasks
Merging this PR will degrade performance by 4.73%
Performance Changes
Comparing Footnotes
|
Swap `Vec<JsValue>` (24 B) for `Box<[JsValue]>` (16 B) on `JsValue::Call` and `JsValue::New` args. Analysis never pushes to args after construction — the `Vec`'s capacity field is dead weight. A few consumer sites that want an owned `Vec` convert via `into_vec()` (O(1), reuses the backing allocation). Shrinks `Call`/`New` variant payloads from 40 B to 32 B. `JsValue`'s overall enum size stays at 40 B (driven by `Unknown`, unchanged). Measured net perf win of 2–6 % on the `references` criterion bench: `Vec::capacity` is checked / decremented across the analyzer's hot paths, and dropping those accesses pays for the O(1) conversion cost. Snapshots stable. All 351 lib tests pass.
3e73dca to
e45dc0f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Stacked on #93106. Swaps
Vec<JsValue>forBox<[JsValue]>on the args field ofJsValue::CallandJsValue::New. Shrinks each variant's payload from 40 B to 32 B (16 B Box vs 24 B Vec) while the overall enum stays at 40 B (driven byUnknown).Why
Box<[T]>Analysis never pushes to args after construction. The
Vec's capacity field is dead weight — 8 B wasted on everyCall/New.Box<[JsValue]>stores only(ptr, len).into_boxed_slice()at construction is a no-op when capacity already equals length (true forsize_hint-exact iterators likecall_expr.args.iter().map(...).collect()), and consumer sites that need an ownedVecconvert viainto_vec()— an O(1) pointer-level conversion that reuses the backing allocation.Measured perf (criterion, references bench)
Net perf win.
Vec::capacityisn't checked in the hot paths any more, and element-wise access / iteration is identical.Investigation note — why this PR doesn't also shrink JsValue to 32 B
An earlier revision of this PR also changed
Unknown.reasonfromCow<'static, str>(24 B) to&'static str(16 B), which would have broughtUnknownto 32 B and thereforeJsValueto 32 B overall. But that combined change regressed perf by 2–5% even though each change was individually perf-neutral or perf-positive. Ran isolation experiments:The regression appears to be triggered specifically by JsValue shrinking from 40 B to 32 B — likely a subtle allocator or codegen effect that I couldn't pinpoint. Ruled out TurboMalloc (same with system allocator). Boxing the whole
Unknownpayload as an alternative path to 32 B was even worse (+10–16 %). Filed for follow-up; shipping the unambiguous win for now.Test plan
cargo test -p turbopack-ecmascript --lib— 351 passing