You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Auto merge of #38368 - arthurprs:hm-adapt, r=alexcrichton
Adaptive hashmap implementation
All credits to @pczarn who wrote rust-lang/rfcs#1796 and contain-rs/hashmap2#5
**Background**
Rust std lib hashmap puts a strong emphasis on security, we did some improvements in #37470 but in some very specific cases and for non-default hashers it's still vulnerable (see #36481).
This is a simplified version of rust-lang/rfcs#1796 proposal sans switching hashers on the fly and other things that require an RFC process and further decisions. I think this part has great potential by itself.
**Proposal**
This PR adds code checking for extra long probe and shifts lengths (see code comments and rust-lang/rfcs#1796 for details), when those are encountered the hashmap will grow (even if the capacity limit is not reached yet) _greatly_ attenuating the degenerate performance case.
We need a lower bound on the minimum occupancy that may trigger the early resize, otherwise in extreme cases it's possible to turn the CPU attack into a memory attack. The PR code puts that lower bound at half of the max occupancy (defined by ResizePolicy). This reduces the protection (it could potentially be exploited between 0-50% occupancy) but makes it completely safe.
**Drawbacks**
* May interact badly with poor hashers. Maps using those may not use the desired capacity.
* It adds 2-3 branches to the common insert path, luckily those are highly predictable and there's room to shave some in future patches.
* May complicate exposure of ResizePolicy in the future as the constants are a function of the fill factor.
**Example**
Example code that exploit the exposure of iteration order and weak hasher.
```
const MERGE: usize = 10_000usize;
#[bench]
fn merge_dos(b: &mut Bencher) {
let first_map: $hashmap<usize, usize, FnvBuilder> = (0..MERGE).map(|i| (i, i)).collect();
let second_map: $hashmap<usize, usize, FnvBuilder> = (MERGE..MERGE * 2).map(|i| (i, i)).collect();
b.iter(|| {
let mut merged = first_map.clone();
for (&k, &v) in &second_map {
merged.insert(k, v);
}
::test::black_box(merged);
});
}
```
_91 is stdlib and _ad is patched (the end capacity in both cases is the same)
```
running 2 tests
test _91::merge_dos ... bench: 47,311,843 ns/iter (+/- 2,040,302)
test _ad::merge_dos ... bench: 599,099 ns/iter (+/- 83,270)
```
0 commit comments