Selecting without any weights (equal probabilities) is beautifully described here.
I was wondering if there is a way to convert this approach to a weighted one.
I am also interested in other approaches as well.
Update: Sampling without replacement
I've done this in Ruby
https://github.com/fl00r/pickup
If you want to generate large arrays of random integers with replacement, you can use piecewise linear interpolation. For example, using NumPy/SciPy:
This is not effective if you want to sample without replacement.
If you want to pick x elements from a weighted set without replacement such that elements are chosen with a probability proportional to their weights:
Here's a Go implementation from geodns:
I have implemented an algorithm similar to Jason Orendorff's idea in Rust here. My version additionally supports bulk operations: insert and remove (when you want to remove a bunch of items given by their ids, not through the weighted selection path) from the data structure in
O(m + log n)
time where m is the number of items to remove and n the number of items in stored.If the sampling is with replacement, use the roulette-wheel selection technique (often used in genetic algorithms):
[0,1]*totalWeight
k
timesIf the sampling is without replacement, you can adapt the above technique by removing the selected element from the list after each iteration, then re-normalizing the weights so that their sum is 1 (valid probability distribution function)