Motivation
A frequent request on scala/scala PRs (particularly for collections changes) is that the changes be benchmarked; however, contributors face many obstacles when running benchmarks on their personal computers, to the point where many, perhaps most, results would generously be classified as "questionable".
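For concreteness, the kind of measurement being requested is usually a small JMH micro-benchmark of the changed operation. Below is a minimal sketch (the class, operation, and parameter values are hypothetical, not taken from any particular PR):

```scala
package bench

import org.openjdk.jmh.annotations._

// Hypothetical benchmark of a single collections operation across input sizes.
@State(Scope.Benchmark)
class ListAppendBenchmark {
  @Param(Array("10", "1000", "100000"))
  var size: Int = _

  var xs: List[Int] = _

  @Setup(Level.Trial)
  def setup(): Unit = {
    xs = List.tabulate(size)(identity)
  }

  // Measures appending one element to an immutable List of `size` elements.
  @Benchmark
  def append(): List[Int] = xs :+ size
}
```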
Background
The following table lists some common causes of performance/timing variance, and whether each type of machine avoids them.
| Cause of variance | Laptop | Overclocked Desktop | Normally-clocked Desktop |
|---|---|---|---|
| Boost clock speed change | ❌¹ | ❌ | ✔ |
| Thermal throttling | ❌ | ❌ | ❓² |
| Background tasks | ❌ | ❌ | ❌ |
¹ While it is theoretically possible to disable overclocking/boost-clocking on a laptop, the CPU may still clock down in response to even brief changes in battery or power state.
² A normally-clocked desktop with good ventilation and cooling shouldn't thermally throttle, but neither of those is guaranteed in a person's home (sometimes cats sit on computers, for example).
The only type of machine that avoids even some of these issues is a normally-clocked desktop, and not everyone has one (many of us only have laptops).
Additionally, personal computers almost certainly have background tasks (if not foreground tasks) running at all times. Benchmarks can take a long time to run, and even if someone can manage not to use their computer for an hour or two while benchmarks run, they probably don't want to close their web browser, 3+ chat applications (which are all Electron, so basically also web browsers), and half a dozen other running programs and services. If they can't spare potentially multiple hours of their computer being tied up, it's even worse, with foreground tasks taking arbitrary and inconsistent CPU time.
Ideal Setup
For benchmarking to be reliable, it should be done on a dedicated machine that runs nothing else, where no cron/scheduled jobs ever run while a benchmark is running.
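Even on such a machine, some run-to-run variance comes from the JVM itself (JIT warmup, GC), so the harness configuration still matters. Here is a sketch of JMH settings that help with that; the specific values are illustrative assumptions, not a recommendation:

```scala
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._

// Illustrative settings: multiple forks average out per-JVM JIT/GC luck, and
// generous warmup lets the code reach a steady state before measurement.
// None of this compensates for an unreliable host machine.
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(3)
@Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
class StringConcatBenchmark {
  @Benchmark
  def build(): String = {
    val sb = new StringBuilder
    var i = 0
    while (i < 1000) { sb.append(i); i += 1 }
    sb.toString
  }
}
```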
How do we reliably benchmark library changes?