I needed to write a weighted version of random.choice (each element in the list has a different probability for being selected). This is what I came up with:
def weightedChoice(choices):
"""Like random.choice, but each element can have a different chance of
being selected.
choices can be any iterable containing iterables with two items each.
Technically, they can have more than two items, the rest will just be
ignored. The first item is the thing being chosen, the second item is
its weight. The weights can be any numeric values, what matters is the
relative differences between them.
"""
space = {}
current = 0
for choice, weight in choices:
if weight > 0:
space[current] = choice
current += weight
rand = random.uniform(0, current)
for key in sorted(space.keys() + [current]):
if rand < key:
return choice
choice = space[key]
return None
This function seems overly complex to me, and ugly. I'm hoping everyone here can offer some suggestions on improving it or alternate ways of doing this. Efficiency isn't as important to me as code cleanliness and readability.
Using numpy
I looked the pointed other thread and came up with this variation in my coding style, this returns the index of choice for purpose of tallying, but it is simple to return the string ( commented return alternative):
Since version 1.7.0, NumPy has a
choice
function that supports probability distributions.Note that
probability_distribution
is a sequence in the same order oflist_of_candidates
. You can also use the keywordreplace=False
to change the behavior so that drawn items are not replaced.0.0 <= x < total
.If you need to make more than one choice, split this into two functions, one to build the cumulative weights and another to bisect to a random point.
One way is to randomize on the total of all the weights and then use the values as the limit points for each var. Here is a crude implementation as a generator.
If your list of weighted choices is relatively static, and you want frequent sampling, you can do one O(N) preprocessing step, and then do the selection in O(1), using the functions in this related answer.