Skip to content

Conversation

@sjakobi
Copy link
Member

@sjakobi sjakobi commented Oct 20, 2025

This also incorporates the switch from A.index to A.index# proposed in #538.

And we remove some dead alternatives: When performing a deletion from a BitmapIndexed node with a single subtree, the result cannot be Empty, due to invariant 5, which implies that a BitmapIndexed node must lead to at least two Leaf or Collision nodes.

-- * If a 'BitmapIndexed' node has only one sub-node, this sub-node must
-- be a 'BitmapIndexed' or a 'Full' node. (INV5)

@sjakobi
Copy link
Member Author

sjakobi commented Oct 21, 2025

While this does improve the code size, I somehow see a negative effect on the delete.presentKey.Int benchmark on my laptop.

Maybe this made the memory read less predictable or something like that?!

I plan on running the benchmarks on a different machine as an additional check.

@sjakobi sjakobi mentioned this pull request Oct 28, 2025
3 tasks
@treeowl
Copy link
Collaborator

treeowl commented Oct 28, 2025

The regression is certainly unfortunate. I wonder if the problem is that the CPU has to worry more about aliasing with the mysterious index. I wonder if we could help it out by using something like 1 .&&. complement i instead of 1 - i. Another, nicer, possibility is I# (i# ==# 0#), but I suspect GHC may branchify that.

@sjakobi
Copy link
Member Author

sjakobi commented Oct 28, 2025

Another, nicer, possibility is I# (i# ==# 0#), but I suspect GHC may branchify that.

Yes, exactly.

xor 1 turned out to work well, and even seems to be slightly faster than the old pattern.

@treeowl
Copy link
Collaborator

treeowl commented Oct 28, 2025

Why would xor 1 i be easier than 1 - i? Without a priori knowledge about i, neither of those constrains the result in any way. Puzzling. I hope this isn't some silly alignment nonsense tripping up benchmarks.

@sjakobi
Copy link
Member Author

sjakobi commented Oct 28, 2025

It also seems to speed up the new difference function slightly:

Before:

All
  HashMap.Strict
    difference
      overlap
        Int
          100:    OK
            896  ns ±  11 ns
          1000:   OK
            9.64 μs ±  49 ns
          10000:  OK
            130  μs ± 346 ns
          100000: OK
            1.71 ms ±  12 μs

With xor 1:

All
  HashMap.Strict
    difference
      overlap
        Int
          100:    OK
            888  ns ± 1.5 ns
          1000:   OK
            9.17 μs ±  60 ns
          10000:  OK
            129  μs ± 1.1 μs
          100000: OK
            1.69 ms ± 2.8 μs

@sjakobi
Copy link
Member Author

sjakobi commented Oct 28, 2025

Hmm, I suspect I allowed myself to be misled by the noise in the benchmark results.

After more benchmark runs I can't find a significant difference between 1 - i, xor 1 i and 1 .&. complement i. I think I'll just opt for 1 - i which seems like the clearest option.

The code size reductions are pretty nice in any case.

@sjakobi sjakobi force-pushed the sjakobi/delete-indices branch from 79b0f7e to 4dece38 Compare October 28, 2025 22:02
@sjakobi sjakobi marked this pull request as ready for review October 28, 2025 22:06
@sjakobi sjakobi merged commit 5662a47 into master Oct 29, 2025
9 checks passed
@sjakobi sjakobi deleted the sjakobi/delete-indices branch October 29, 2025 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants