[Perf] Windows/x64: Half and FrozenDictionary Regressions on 1/28/2025 11:31:49 PM +00:00 #112135

Open
performanceautofiler bot opened this issue Feb 4, 2025 · 4 comments
Labels: arch-x64, area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), os-windows, runtime-coreclr (specific to the CoreCLR runtime), tenet-performance (Performance related issue), tenet-performance-benchmarks (Issue from performance benchmark)

@performanceautofiler

Run Information

| Name | Value |
| --- | --- |
| Architecture | x64 |
| OS | Windows 10.0.22621 |
| Queue | TigerWindows |
| Baseline | d20c35e05c16dd2ec9cddd4259738cc02228d425 |
| Compare | d9b75154ad360154b863ee01b72fbef94de261b0 |
| Diff | Diff |
| Configs | CompilationMode:tiered, RunKind:micro |

Regressions in System.Tests.Perf_Half

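| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Tests.Perf_Half.HalfToSingle(value: 12344) | 1.01 ns | 2.25 ns | 2.24 | 0.04 | False | | | |
| System.Tests.Perf_Half.HalfToSingle(value: NaN) | 1.01 ns | 2.25 ns | 2.24 | 0.04 | False | | | |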

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Tests.Perf_Half*'

System.Tests.Perf_Half.HalfToSingle(value: 12344)

System.Tests.Perf_Half.HalfToSingle(value: NaN)

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository


Regressions in System.Collections.Perf_Frozen<Int16>

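| Benchmark | Baseline | Test | Test/Base | Test Quality | Edge Detector | Baseline IR | Compare IR | IR Ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Collections.Perf_Frozen<Int16>.ToFrozenDictionary(Count: 4) | 50.64 ns | 59.55 ns | 1.18 | 0.01 | False | | | |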

Test Report

Repro

General Docs link: https://github.com/dotnet/performance/blob/main/docs/benchmarking-workflow-dotnet-runtime.md

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f net8.0 --filter 'System.Collections.Perf_Frozen<Int16>*'

System.Collections.Perf_Frozen<Int16>.ToFrozenDictionary(Count: 4)

LoopedBard3 changed the title from "[Perf] Windows/x64: 3 Regressions on 1/28/2025 11:31:49 PM +00:00" to "[Perf] Windows/x64: Half and FrozenDictionary Regressions on 1/28/2025 11:31:49 PM +00:00" on Feb 4, 2025
LoopedBard3 transferred this issue from dotnet/perf-autofiling-issues on Feb 4, 2025
dotnet-issue-labeler (bot) added the needs-area-label label (An area label is needed to ensure this gets routed to the appropriate area owners) on Feb 4, 2025
LoopedBard3 added the tenet-performance and tenet-performance-benchmarks labels on Feb 4, 2025
jeffschwMSFT added the area-CodeGen-coreclr label on Feb 6, 2025

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

JulieLeeMSFT removed the untriaged label (New issue has not been triaged by the area owner) on Feb 6, 2025
JulieLeeMSFT added this to the 10.0.0 milestone on Feb 6, 2025
xtqqczze (Contributor) commented Feb 7, 2025

@tannergooding Is this beneficial? 2 bytes higher in code size, but 0.75 lower PerfScore.

MihuBot/runtime-utils#854

 ; Assembly listing for method System.Half:op_Explicit(float):System.Half (FullOpts)
+       xor      ecx, ecx
+       mov      edx, -1
        vucomiss xmm0, xmm0
-       setp     cl
-       movzx    rcx, cl
-       dec      ecx
+       cmovnp   ecx, edx

Related: #61761 (comment)

Originally posted by @xtqqczze in #111024 (comment)
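Both instruction sequences in the diff above appear to compute the same 32-bit mask in `ecx`: zero when the input is NaN and all ones otherwise. A minimal sketch that models each sequence in Python and checks that they agree, assuming `vucomiss xmm0, xmm0` sets the parity flag exactly when the operand is NaN; the function names are hypothetical, and Python's double-precision `float` stands in for the single-precision value since only NaN-ness matters here:

```python
import math

MASK32 = 0xFFFFFFFF  # 32-bit representation of -1


def mask_setp_dec(x: float) -> int:
    """Model of the removed lines: setp cl / movzx rcx, cl / dec ecx."""
    parity = 1 if math.isnan(x) else 0  # parity flag after vucomiss x, x
    return (parity - 1) & MASK32        # dec: 1 -> 0, 0 -> 0xFFFFFFFF


def mask_cmovnp(x: float) -> int:
    """Model of the added lines: xor ecx, ecx / mov edx, -1 / cmovnp ecx, edx."""
    ecx = 0
    edx = MASK32
    if not math.isnan(x):               # cmovnp moves only when parity is clear
        ecx = edx
    return ecx


# The two models agree for ordinary values, infinities, and NaN.
for value in (0.0, 1.5, -2.0, math.inf, -math.inf, math.nan):
    assert mask_setp_dec(value) == mask_cmovnp(value)
```

The sketch only demonstrates functional equivalence; it says nothing about the code size or PerfScore difference raised in the comment.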

stephentoub (Member) commented

The ToFrozenDictionary regression is acceptable. #112298 will help a little, but this is a 9 ns regression in constructing the FrozenDictionary, accepted in order to improve the throughput of accessing it. That kind of tradeoff is the main reason this type exists: we're OK spending a bit more at construction in order to optimize the throughput of reads.
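The 9 ns figure matches the difference between the Test and Baseline timings reported for ToFrozenDictionary(Count: 4). A minimal sketch of how such a construction cost amortizes over reads; the per-read saving is not stated anywhere in this issue, so `saving_per_read_ns` and the 0.5 ns value used below are purely hypothetical inputs, as are the helper names:

```python
import math


def construction_delta_ns(baseline_ns: float, test_ns: float) -> float:
    """Extra construction time of the compare build over the baseline."""
    return test_ns - baseline_ns


def break_even_reads(extra_construction_ns: float, saving_per_read_ns: float) -> int:
    """Number of reads after which the extra construction cost is recovered."""
    if saving_per_read_ns <= 0:
        raise ValueError("per-read saving must be positive")
    return math.ceil(extra_construction_ns / saving_per_read_ns)


# Timings reported for ToFrozenDictionary(Count: 4).
delta = construction_delta_ns(50.64, 59.55)
print(f"extra construction cost: {delta:.2f} ns")

# Hypothetical per-read saving, NOT a measured value from this issue.
print(break_even_reads(delta, saving_per_read_ns=0.5))
```

Under that assumed saving, the extra construction time would be recovered after a small number of lookups, which is the shape of the tradeoff described in the comment.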

vcsjones removed the needs-area-label label on Feb 18, 2025