Skip to content

Better benchmarks for intersection (and better intersectionArraysBy) #416

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
sjakobi opened this issue Apr 15, 2022 · 0 comments
Open

Comments

@sjakobi
Copy link
Member

sjakobi commented Apr 15, 2022

Here are the intersection benchmarks as of #406:

, bgroup "intersection"
[ bench "Int" $ whnf (HM.intersection hmi) hmi2
, bench "ByteString" $ whnf (HM.intersection hmbs) hmbsSubset
]

elemsBS = zip keysBS [1..n]
keysBS = UBS.rnd 8 n
elemsI = zip keysI [1..n]
keysI = UI.rnd (n+n) n
elemsI2 = zip [n `div` 2..n + (n `div` 2)] [1..n] -- for union

hmbs = HM.fromList elemsBS
hmbsSubset = HM.fromList (takeSubset n elemsBS)
hmi = HM.fromList elemsI
hmiSubset = HM.fromList (takeSubset n elemsI)
hmi2 = HM.fromList elemsI2

takeSubset n elements =
-- use 50% of the elements for a subset check.
let subsetSize = round (fromIntegral n * 0.5 :: Double) :: Int
in take subsetSize elements

I think the maps in these benchmarks have an unusually large overlap – there are relatively few elements that do not lie in the intersection. It would be interesting to have benchmarks where the intersection is relatively smaller.

With such benchmarks it may also turn out that it is faster to iterate only over the intersections of the array nodes in intersectionArrayBy, as implemented in oberblastmeister@1ee67ea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant