-
Notifications
You must be signed in to change notification settings - Fork 4
Home
Welcome to the weighted-lottery wiki!
Weighted lotteries are used when we need to randomly select k of n items with given probabilities.
- The sample size k is typically much smaller than n
- The sample can ask for unique items, or allow repetitions
This project offers several weighted lottery implementations written in Kotlin or Java
<dependency>
<groupId>io.github.guyko</groupId>
<artifactId>WeightedLottery</artifactId>
<version>1.0</version>
</dependency>
IntLottery represents the facade for obtaining the next item
interface IntLottery {
fun draw(): Int
fun empty(): Boolean
fun remaining(): Int
}
All implementations of IntLottery expect to get a DoubleArray of probabilities, where an index in the array will be returned with each call to draw according to its given probability
val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery : IntLottery = ... // some implementation, initialized with the weights
(0 until k).forEach {
val index = lottery.draw()
// do something
}
With SimpleWeightedLottery, one can draw as many times as needed - with the complexity of log(n)
val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = SimpleWeightedLottery(weights)
val k = ... // no limitation on the value of k
(0 until k).forEach {
val index = lottery.draw()
// do something
}
AliasLottery is based on alias method sampling algorithm, with preprocessing complexity of O(n), and draw complexity of O(1) (cost of 2 random calls)
TwistedAliasLottery delegates draw calls to the SimpleWeightedLottery while building an AliasLottery in the background using a different thread. When init is done - it switches the delegated implementation
SimpleWeightedLotteryNoRepetitions wraps a SimpleWeightedLottery and tries to minimize number of re-draws by keeping track of the chance to hit the same element twice. In case a threshold is crossed - the internal SimpleWeightedLottery is re-allocated. Complexity is hard to analyze as it depends on the type of distribution
val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = SimpleWeightedLotteryNoRepetitions(weights)
val k = ... // k < weights.size
(0 until k).forEach {
val index = lottery.draw()
// do something
}
SumTreeLottery holds a sum tree data structure, with O(n) complexity for preprocessing and log(n) for draw (in oppose to the 'Simple' implementation - no need for any reallocation/preprocessing)
ReservoirLottery is based on A-Chao's weighted version of reservoir sampling algorithm, with preprocessing complexity of O(n+klogk), draw complexity of O(1). However, it depends on prior knowledge of k
val weights = doubleArrayOf(0.15, 0.0, 0.2, 0.0, 0.65)
val lottery = ReservoirLottery(weights, k = 2)
val first = lottery.draw()
val second = lottery.draw()
lottery.draw() // NoSuchElementException
All implementations are benchmarked using jmh, with gc profiler enabled
- 'with repetitions' vs 'without repetitions'
- When measuring, we may care about preprocessing phase, draw phase in other cases and sometimes both together
- one sample of k items vs many samples of k items (some implementations may be better when reusing the same weights)
- uniform vs exponential distribution (0.9^1, 0.9^2, 0.9^3, .....) of probabilities
Follow benchmark wiki page to further deep dive
The results of the latest benchmark can be found here
New implementations are welcome. we prefer you to use Kotlin, but Java is also possible
Your pull request must contain:
- Short description of your logic
- Implement IntLottery in WeightedLottery/src/main/java/com/wl
- Implement either WeightedLotteryNoRepetitionsTestBase or WeightedLotteryWithRepetitionsTestBase in src/test/java/com/wl
- Add benchmark methods as described here