Implement SelectMergeTree via the model.Tree #4790
@@ -0,0 +1,614 @@
package symdb
This is the most complex part of this PR, worth starting to look at it @aleks-p
@simonswine looks good to me overall, I think we can move it forward once we address the few TODOs
pkg/phlaredb/symdb/symbol_merger.go
Outdated
	h.sl = slices.Grow(h.sl, size)
	h.hashes = slices.Grow(h.hashes, size)
	// grow map if still empty
	if len(h.m) == 0 {
nit, this is perhaps never hit, we always call add when creating new mergers
pkg/phlaredb/symdb/symbol_merger.go
Outdated
type equalityFn[A any] func(A, A) bool

type hashedSlice[A any] struct {
	merger *SymbolMerger
nit, it doesn't seem like we use this
- [x] TODO fix "other"
- [ ] Make sure truncation works on max symbols rather than max nodes
- [x] Consistently number IDs starting from 1
- [ ] Share logic of getting pprof from tree for profilecli
hashedSlice.add() previously panicked on 64-bit xxhash collisions. Replace the panic with linear probing on the map key so collisions are handled gracefully: the probe hash is incremented until an empty slot is found, while the original content hash is always stored in hashes[] for use by parent-level hash computation. Add TestSymbolMerger_HashCollision to verify that two distinct values sharing the same hash receive distinct indices, that original hashes are preserved, and that deduplication still works for both values.
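The probing scheme described in this commit message can be sketched roughly as follows. This is not the PR's code: `dedupSlice`, its fields, and `newDedupSlice` are hypothetical stand-ins for `hashedSlice`, and the hash is passed in by the caller so that a collision can be forced without needing two inputs that really collide under xxhash.

```go
package main

import "fmt"

// dedupSlice is a hypothetical, simplified stand-in for the PR's hashedSlice.
type dedupSlice[A any] struct {
	values []A
	hashes []uint64       // original content hash of each value, never the probe hash
	index  map[uint64]int // probe hash -> position in values
	equal  func(a, b A) bool
}

func newDedupSlice[A any](equal func(a, b A) bool) *dedupSlice[A] {
	return &dedupSlice[A]{index: make(map[uint64]int), equal: equal}
}

// add returns the index of v, inserting it if it is not present yet.
// When the slot for a hash is taken by a different value, the map key is
// incremented until either an equal value or a free slot is found.
func (d *dedupSlice[A]) add(v A, hash uint64) int {
	probe := hash
	for {
		i, taken := d.index[probe]
		if !taken {
			d.values = append(d.values, v)
			d.hashes = append(d.hashes, hash) // keep the unprobed hash
			d.index[probe] = len(d.values) - 1
			return len(d.values) - 1
		}
		if d.equal(d.values[i], v) {
			return i // already known: deduplicate
		}
		probe++ // collision with a different value: try the next key
	}
}

func main() {
	d := newDedupSlice(func(a, b string) bool { return a == b })
	a := d.add("foo", 42)
	b := d.add("bar", 42) // same hash on purpose
	fmt.Println(a, b, d.add("foo", 42), d.add("bar", 42), d.hashes)
	// prints: 0 1 0 1 [42 42]
}
```

Because values are never removed in this sketch, a lookup that follows the same probe sequence always reaches either the stored value or a free slot, so no tombstones are needed.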
I really should have enabled the integration tests much earlier: basically, the symbolizer isn't able to work correctly with this feature. I will have to spend some more time on this.
We currently use a completely separate query path for SelectMergeStacktraces vs. SelectMergePprof calls. This PR is a first step in unifying the two paths, by implementing a model.Tree based merge.
This refactors model.Tree to be generic over its node name type, supporting both string-based function names (FuntionName) and integer location references (LocationRefName). This enables the query-backend path to build flamegraph trees using location refs directly, deferring string resolution until the final merge step.
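To illustrate the idea of a tree that is generic over its node name type, here is a minimal sketch. It is not the actual model.Tree: `tree`, `node`, `funcName`, `locRef`, `insert` and `resolve` are all hypothetical names, and details such as child ordering and truncation are left out. It builds a tree keyed by integer location refs and only maps refs to strings in a final step, merging siblings that resolve to the same name.

```go
package main

import "fmt"

// Hypothetical name types: one string-based, one integer reference.
type (
	funcName string
	locRef   uint32
)

type node[N comparable] struct {
	name        N
	self, total int64
	children    []*node[N]
}

type tree[N comparable] struct {
	roots []*node[N]
}

// findOrAdd returns the child called name on this level, creating it if needed.
func findOrAdd[N comparable](level *[]*node[N], name N) *node[N] {
	for _, c := range *level {
		if c.name == name {
			return c
		}
	}
	n := &node[N]{name: name}
	*level = append(*level, n)
	return n
}

// insert adds a stack (root first) with the given value.
func (t *tree[N]) insert(stack []N, value int64) {
	if len(stack) == 0 {
		return
	}
	level := &t.roots
	var n *node[N]
	for _, name := range stack {
		n = findOrAdd(level, name)
		n.total += value
		level = &n.children
	}
	n.self += value // only the leaf frame gets self time
}

// resolve rebuilds the tree under a different name type, merging siblings
// whose names map to the same value.
func resolve[N, M comparable](src *tree[N], lookup func(N) M) *tree[M] {
	dst := &tree[M]{}
	copyLevel(src.roots, &dst.roots, lookup)
	return dst
}

func copyLevel[N, M comparable](src []*node[N], dst *[]*node[M], lookup func(N) M) {
	for _, s := range src {
		d := findOrAdd(dst, lookup(s.name))
		d.self += s.self
		d.total += s.total
		copyLevel(s.children, &d.children, lookup)
	}
}

func main() {
	refs := &tree[locRef]{}
	refs.insert([]locRef{1, 2}, 10)
	refs.insert([]locRef{1, 3}, 5)

	names := map[locRef]funcName{1: "main", 2: "work", 3: "work"}
	named := resolve(refs, func(r locRef) funcName { return names[r] })

	root := named.roots[0]
	fmt.Println(root.name, root.total, root.self) // prints: main 15 0
	child := root.children[0]
	fmt.Println(child.name, child.total, child.self) // prints: work 15 15
}
```

Deferring resolution this way keeps intermediate merges on cheap integer comparisons; the cost is that two refs pointing at the same function stay separate nodes until the final step.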
Most complexity is in the SymbolMerger in symdb that deduplicates and merges symbols (strings, mappings, functions, locations) across profile blocks using content hashing (xxhash), with linear probing to gracefully handle hash collisions.
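As a rough picture of what merging symbols across blocks involves, the sketch below merges per-block string and function tables into one deduplicated set and returns an index remap for each block. It makes several simplifying assumptions: `merger`, `function`, and `addBlock` are hypothetical names, only two symbol kinds are modelled, indices start at 0, and Go's built-in map over comparable keys is used for deduplication instead of xxhash content hashing.

```go
package main

import "fmt"

// function is a hypothetical, trimmed-down symbol that refers to the
// string table by index.
type function struct {
	Name     uint32
	Filename uint32
}

type merger struct {
	strings   []string
	stringIdx map[string]uint32
	functions []function
	funcIdx   map[function]uint32
}

func newMerger() *merger {
	return &merger{
		stringIdx: make(map[string]uint32),
		funcIdx:   make(map[function]uint32),
	}
}

func (m *merger) addString(s string) uint32 {
	if i, ok := m.stringIdx[s]; ok {
		return i
	}
	i := uint32(len(m.strings))
	m.strings = append(m.strings, s)
	m.stringIdx[s] = i
	return i
}

func (m *merger) addFunction(f function) uint32 {
	if i, ok := m.funcIdx[f]; ok {
		return i
	}
	i := uint32(len(m.functions))
	m.functions = append(m.functions, f)
	m.funcIdx[f] = i
	return i
}

// addBlock merges one block's tables. The result maps each of the block's
// function indices to its index in the merged table.
func (m *merger) addBlock(strs []string, funcs []function) []uint32 {
	strRemap := make([]uint32, len(strs))
	for i, s := range strs {
		strRemap[i] = m.addString(s)
	}
	funcRemap := make([]uint32, len(funcs))
	for i, f := range funcs {
		// Rewrite string references into merged indices before deduplicating,
		// so identical functions from different blocks compare equal.
		funcRemap[i] = m.addFunction(function{
			Name:     strRemap[f.Name],
			Filename: strRemap[f.Filename],
		})
	}
	return funcRemap
}

func main() {
	m := newMerger()
	a := m.addBlock([]string{"", "main", "main.go"}, []function{{Name: 1, Filename: 2}})
	b := m.addBlock(
		[]string{"", "main.go", "work", "main"},
		[]function{{Name: 2, Filename: 1}, {Name: 3, Filename: 1}},
	)
	fmt.Println(a, b, len(m.strings), len(m.functions))
	// prints: [0] [1 0] 4 2
}
```

The order matters: children (strings) are merged first so that parents (functions) can be compared on merged indices. The same layering would extend to locations referring to functions and mappings.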
After this is merged, the old path remains active by default; a per-tenant override gates the tree-based SelectMergeProfile code path. This PR also adds test coverage for the query backend, the frontend query path, and the symbol merger.
Note: After this PR, the integration tests will only target the tree-based variant.