Try using a byte array in ArrayHitCounter instead of a short array #613

@alexklibisz

Background

ArrayHitCounter uses an array of shorts to count hits. It's not a very memory-efficient implementation, since it needs one array entry for every document in the segment. It uses shorts because a short takes half the memory of an int (2 bytes instead of 4), and counts should rarely exceed the max value of a short (32,767).

I think an array of bytes would also work, and would halve the memory again compared to shorts. This could be a new implementation of the HitCounter interface: rename the current one to ShortArrayHitCounter and add a new one, ByteArrayHitCounter. A Java byte is signed, so the max value it holds directly is 127; if the counter reads it back as unsigned (masking with 0xFF), it can represent counts up to 255. So if the number of hashes passed to MatchHashesAndScoreQuery is <= 255, it uses the ByteArrayHitCounter, else it uses the ShortArrayHitCounter.
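A rough sketch of the idea, not the project's actual code. The HitCounter interface here is a simplified, hypothetical stand-in with only increment and get, and the HitCounters.forNumHashes factory is a made-up name for the selection logic. It assumes each hash increments a given document at most once, so the count for a document never exceeds the number of hashes. Since a Java byte is signed, the counter stores the raw byte and reads it back with & 0xFF, which gives a usable range of 0 to 255.

```java
// Hypothetical, simplified stand-in for the real HitCounter interface.
interface HitCounter {
    void increment(int docId);
    int get(int docId);
}

// The existing approach: one short (2 bytes) per document in the segment.
final class ShortArrayHitCounter implements HitCounter {
    private final short[] counts;

    ShortArrayHitCounter(int capacity) {
        this.counts = new short[capacity];
    }

    @Override
    public void increment(int docId) {
        counts[docId]++;
    }

    @Override
    public int get(int docId) {
        return counts[docId];
    }
}

// The proposed approach: one byte per document, interpreted as unsigned.
final class ByteArrayHitCounter implements HitCounter {
    // Largest count representable when a byte is read as unsigned.
    static final int MAX_COUNT = 255;

    private final byte[] counts;

    ByteArrayHitCounter(int capacity) {
        this.counts = new byte[capacity];
    }

    @Override
    public void increment(int docId) {
        // Wraps from 127 to -128 in signed terms, which is 128 when read unsigned.
        // Incrementing past 255 would wrap back to 0, so callers must stay within MAX_COUNT.
        counts[docId]++;
    }

    @Override
    public int get(int docId) {
        // Mask to undo sign extension and recover the unsigned value.
        return counts[docId] & 0xFF;
    }
}

// Hypothetical factory: pick the byte-backed counter only when counts cannot overflow it.
final class HitCounters {
    static HitCounter forNumHashes(int numDocs, int numHashes) {
        if (numHashes <= ByteArrayHitCounter.MAX_COUNT) {
            return new ByteArrayHitCounter(numDocs);
        }
        return new ShortArrayHitCounter(numDocs);
    }
}
```

For example, HitCounters.forNumHashes(segmentSize, 200) would return the byte-backed counter, while 256 or more hashes would fall back to the short-backed one. Whether the smaller array actually helps speed is what the benchmark deliverable would need to show.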

Bard already wrote most of it for me:

[Two images of the Bard-generated code, not reproduced here]

Deliverables

  • Implement a ByteArrayHitCounter
  • Benchmark it

Related Issues

#611
