I needed to write a weighted version of random.choice (each element in the list has a different probability for being selected). This is what I came up with:
def weightedChoice(choices):
"""Like random.choice, but each element can have a different chance of
being selected.
choices can be any iterable containing iterables with two items each.
Technically, they can have more than two items, the rest will just be
ignored. The first item is the thing being chosen, the second item is
its weight. The weights can be any numeric values, what matters is the
relative differences between them.
"""
space = {}
current = 0
for choice, weight in choices:
if weight > 0:
space[current] = choice
current += weight
rand = random.uniform(0, current)
for key in sorted(space.keys() + [current]):
if rand < key:
return choice
choice = space[key]
return None
This function seems overly complex to me, and ugly. I'm hoping everyone here can offer some suggestions on improving it or alternate ways of doing this. Efficiency isn't as important to me as code cleanliness and readability.
I needed to do something like this really fast really simple, from searching for ideas i finally built this template. The idea is receive the weighted values in a form of a json from the api, which here is simulated by the dict.
Then translate it into a list in which each value repeats proportionally to it's weight, and just use random.choice to select a value from the list.
I tried it running with 10, 100 and 1000 iterations. The distribution seems pretty solid.
Here is another version of weighted_choice that uses numpy. Pass in the weights vector and it will return an array of 0's containing a 1 indicating which bin was chosen. The code defaults to just making a single draw but you can pass in the number of draws to be made and the counts per bin drawn will be returned.
If the weights vector does not sum to 1, it will be normalized so that it does.
Since Python3.6 there is a method
choices
fromrandom
module.And people also mentioned that there is
numpy.random.choice
which support weights, BUT it don't support 2d arrays, and so on.So,
basically you can get whatever you like(see update) with builtinrandom.choices
if you have 3.6.x Python.UPDATE: As @roganjosh kindly mentioned,
random.choices
cannot return values without replacement, as it mentioned in the docs:And @ronan-paixão's brilliant answer states that
numpy.choice
hasreplace
argument, that controls such behaviour.If you have a weighted dictionary instead of a list you can write this
Note that
[k for k in items for dummy in range(items[k])]
produces this list['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'c', 'b', 'b', 'b', 'b', 'b']
If you don't mind using numpy, you can use numpy.random.choice.
For example:
If you know how many selections you need to make in advance, you can do it without a loop like this: