Commit c0104d3

Precalculated attribute set hashes (#1407)
Computing the hash of an `AttributeSet` is expensive, because every key and value in the set has to be hashed. The `ValueMap` uses this hash to look up whether a time series is already being aggregated for a given set of attributes. Because that hashmap lookup happens inside a mutex lock, no other counter can execute its `add()` call while the hash is being calculated, which causes contention in high-throughput scenarios.

This PR calculates the hash once, when the `AttributeSet` is created, and caches it. This improves throughput because the hash is computed by the thread creating the `AttributeSet`, outside of any mutex lock, so hashes can be computed in parallel and less time is spent holding the lock. The benefit of shorter lock times should grow as larger attribute sets are used for time series.

The stress test throughput for different thread counts:

| Thread Count | Main      | PR        |
| ------------ | --------- | --------- |
| 2            | 3,376,040 | 3,310,920 |
| 3            | 5,908,640 | 5,807,240 |
| 4            | 3,382,040 | 8,094,960 |
| 5            | 1,212,640 | 9,086,520 |
| 6            | 1,225,280 | 6,595,600 |

Without precomputed hashes, contention starts to show at 4 threads and throughput drops substantially after that. With precomputed hashes, contention does not appear until 6 threads, and even then throughput is still 5-6x higher than on main because the lock is held for less time.

While these benchmarks may not be "realistic" (most applications do more work between counter updates), they do show better parallelism and an opportunity to reduce lock contention, at a cost of only 8 bytes per time series (a total of 16KB of additional memory at maximum cardinality).
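The locking argument above can be illustrated with a minimal sketch. It is not the SDK's code: `CachedKey` and `run` are hypothetical names, and a plain `Mutex<HashMap<..>>` stands in for the `ValueMap`. Each worker builds its key, and therefore computes the full hash, before taking the lock, so the map only has to hash a single `u64` while the lock is held.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for `AttributeSet`: sorted pairs plus a cached hash.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CachedKey {
    pairs: Vec<(String, String)>,
    hash: u64,
}

impl CachedKey {
    fn new(mut pairs: Vec<(String, String)>) -> Self {
        // Sort first so the cached hash does not depend on input order.
        pairs.sort_unstable();
        let mut hasher = DefaultHasher::new();
        pairs.hash(&mut hasher);
        let hash = hasher.finish();
        CachedKey { pairs, hash }
    }
}

impl Hash for CachedKey {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the cached value; the pairs are not hashed again.
        state.write_u64(self.hash);
    }
}

/// Spawns `threads` workers that each add 1 to the same time series
/// `per_thread` times. Returns (number of time series, sum of all values).
fn run(threads: usize, per_thread: u64) -> (usize, u64) {
    let values: Arc<Mutex<HashMap<CachedKey, u64>>> = Arc::new(Mutex::new(HashMap::new()));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let values = Arc::clone(&values);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Expensive part: runs on the calling thread, outside the lock.
                    let key = CachedKey::new(vec![
                        ("status".to_string(), "200".to_string()),
                        ("method".to_string(), "GET".to_string()),
                    ]);
                    // Cheap part: while the lock is held, the map hashes one u64.
                    let mut map = values.lock().unwrap();
                    *map.entry(key).or_insert(0) += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let map = values.lock().unwrap();
    let total: u64 = map.values().sum();
    (map.len(), total)
}

fn main() {
    let (series, total) = run(4, 1_000);
    println!("{series} time series, total = {total}");
}
```

The sketch only shows where the hashing work happens relative to the lock; it makes no attempt to reproduce the stress test numbers.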
1 parent 897e70a commit c0104d3

1 file changed, +24 -8 lines changed

opentelemetry-sdk/src/attributes/set.rs
@@ -1,3 +1,4 @@
+use std::collections::hash_map::DefaultHasher;
 use std::collections::HashSet;
 use std::{
     cmp::Ordering,
@@ -104,13 +105,13 @@ impl Eq for HashKeyValue {}
 ///
 /// This must implement [Hash], [PartialEq], and [Eq] so it may be used as
 /// HashMap keys and other de-duplication methods.
-#[derive(Clone, Default, Debug, Hash, PartialEq, Eq)]
-pub struct AttributeSet(Vec<HashKeyValue>);
+#[derive(Clone, Default, Debug, PartialEq, Eq)]
+pub struct AttributeSet(Vec<HashKeyValue>, u64);

 impl From<&[KeyValue]> for AttributeSet {
     fn from(values: &[KeyValue]) -> Self {
         let mut seen_keys = HashSet::with_capacity(values.len());
-        let mut vec = values
+        let vec = values
             .iter()
             .rev()
             .filter_map(|kv| {
@@ -121,25 +122,34 @@ impl From<&[KeyValue]> for AttributeSet {
                 }
             })
             .collect::<Vec<_>>();
-        vec.sort_unstable();

-        AttributeSet(vec)
+        AttributeSet::new(vec)
     }
 }

 impl From<&Resource> for AttributeSet {
     fn from(values: &Resource) -> Self {
-        let mut vec = values
+        let vec = values
             .iter()
             .map(|(key, value)| HashKeyValue(KeyValue::new(key.clone(), value.clone())))
             .collect::<Vec<_>>();
-        vec.sort_unstable();

-        AttributeSet(vec)
+        AttributeSet::new(vec)
     }
 }

 impl AttributeSet {
+    fn new(mut values: Vec<HashKeyValue>) -> Self {
+        values.sort_unstable();
+        let mut hasher = DefaultHasher::new();
+        values.iter().fold(&mut hasher, |mut hasher, item| {
+            item.hash(&mut hasher);
+            hasher
+        });
+
+        AttributeSet(values, hasher.finish())
+    }
+
     /// Returns the number of elements in the set.
     pub fn len(&self) -> usize {
         self.0.len()
@@ -163,3 +173,9 @@ impl AttributeSet {
         self.0.iter().map(|kv| (&kv.0.key, &kv.0.value))
     }
 }
+
+impl Hash for AttributeSet {
+    fn hash<H: Hasher>(&self, state: &mut H) {
+        state.write_u64(self.1)
+    }
+}
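As a rough, self-contained illustration of the pattern in this diff, the sketch below uses simplified stand-in types. `Pair`, `PairSet`, `ELEMENT_HASHES` and `element_hashes` are hypothetical names, and a plain `for` loop replaces the `fold`. It checks two properties: sorting before hashing makes the cached hash independent of input order, and a `HashMap` insert or lookup no longer hashes the individual elements, because the manual `Hash` impl only writes the cached `u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times an individual element has been hashed.
static ELEMENT_HASHES: AtomicUsize = AtomicUsize::new(0);

fn element_hashes() -> usize {
    ELEMENT_HASHES.load(Ordering::Relaxed)
}

/// Hypothetical stand-in for `HashKeyValue`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Pair(&'static str, &'static str);

impl Hash for Pair {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Record the call so the example can observe when elements get hashed.
        ELEMENT_HASHES.fetch_add(1, Ordering::Relaxed);
        self.0.hash(state);
        self.1.hash(state);
    }
}

/// Hypothetical stand-in for `AttributeSet`: sorted elements plus cached hash.
#[derive(Clone, Debug, PartialEq, Eq)]
struct PairSet(Vec<Pair>, u64);

impl PairSet {
    fn new(mut values: Vec<Pair>) -> Self {
        // Sort so that the same elements in any order give the same hash.
        values.sort_unstable();
        let mut hasher = DefaultHasher::new();
        for item in &values {
            item.hash(&mut hasher);
        }
        PairSet(values, hasher.finish())
    }
}

impl Hash for PairSet {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Only the cached value is written; elements are not hashed again.
        state.write_u64(self.1);
    }
}

fn main() {
    let a = PairSet::new(vec![Pair("method", "GET"), Pair("status", "200")]);
    let b = PairSet::new(vec![Pair("status", "200"), Pair("method", "GET")]);

    // Same elements in a different order: equal sets, equal cached hashes.
    assert_eq!(a, b);
    assert_eq!(a.1, b.1);

    let after_construction = element_hashes();
    let mut series: HashMap<PairSet, u64> = HashMap::new();
    series.insert(a, 1);
    *series.get_mut(&b).unwrap() += 1;

    // Insert and lookup used only the cached hash.
    assert_eq!(element_hashes(), after_construction);
    println!("series value = {}", series[&b]);
}
```

Because the derived `PartialEq` compares the cached hash as well as the elements, equal sets always carry equal hashes, which keeps the manual `Hash` impl consistent with `Eq`.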
