I have a program where I'm keeping track of the success of various things using collections.Counter
— each success of a thing increments the corresponding counter:
import collections
scoreboard = collections.Counter()
if test(thing):
scoreboard[thing]+ = 1
Then, for future tests, I want to skew towards things which have generated the most success. Counter.elements()
seemed ideal for this, since it returns the elements (in arbitrary order) repeated a number of times equal to the count. So I figured I could just do:
import random
nextthing=random.choice(scoreboard.elements())
But no, that raises TypeError: object of type 'itertools.chain' has no len(). Okay, so random.choice
can't work with iterators. But, in this case, the length is known (or knowable) — it's sum(scoreboard.values())
.
I know the basic algorithm for iterating through a list of unknown length and fairly picking an element at random, but I suspect that there's something more elegant. What should I be doing here?
You could wrap the iterator in
list()
to convert it into a list forrandom.choice()
:The downside here is that this expands the list in memory, rather than accessing it item-by-item as would normally get with an iterator.
If you wanted to solve this iteratively, this algorithm is probably a good choice.
The following will get a random item where the score is the weighting for how often to return that item.
for instance, if there are 2 items:
item1 has a score 5
item2 has a score 10
item2 will be returned twice as often as item1
Another variant, Setup is a bit cumbersome, but lookup is in logarithmic complexity (suitable when several lookups are needed):
You can do this rather easily by using
itertools.islice
to get the Nth item of an iterable:Given a dictionary of choices with corresponding relative probabilities (can be the count in your case), you can use the new
random.choices
added in Python 3.6 like so:Another variant with iteration: