Select random item with weight

Posted 2019-01-22 23:19

Question:

I have a list of approximately 10,000 items. Every item has an associated weight (its priority or importance). The smallest weight is currently -100 (negative and zero values can be removed) and the highest is 1500. The weights are assigned by people based on intuition (how important somebody thinks the item is to the community). Because it is not easy to determine the most important item, I'd like to add a random factor, so that items with a lower weight have a smaller chance of being chosen, and their weights can be adjusted in the future (a mix of common sense and randomness).

Do you know how to code a function getItem?

def getItem(dict):
  # This function should return a random item from
  # the dictionary of item-weight pairs (or list of tuples).
  # Normally I would just return a random item from the dictionary,
  # but now I'd like this: the item with weight 1500 should
  # have a much higher chance of being returned than the item with weight 10.
  # My idea is to sum up the weights of all items and then compute
  # some ratios. But maybe you have a better idea.
  return randomItem

Thank you

Answer 1:

Have a look at this; I think it's what you need, with a nice comparison of different methods: Weighted random generation in Python

The simplest approach suggested is:

import random

def weighted_choice(weights):
    totals = []
    running_total = 0

    for w in weights:
        running_total += w
        totals.append(running_total)

    rnd = random.random() * running_total
    for i, total in enumerate(totals):
        if rnd < total:
            return i

You can find more details and possible improvements as well as some different approaches in the link above.
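As a rough sketch of how this could be wired to the dictionary from the question, the wrapper below drops non-positive weights (the question says they can be removed) and maps the returned index back to an item. The name get_item and the sample data are assumptions made for illustration; weighted_choice is repeated so the snippet runs on its own.

```python
import random

def weighted_choice(weights):
    """Return an index, chosen with probability proportional to its weight."""
    totals = []
    running_total = 0
    for w in weights:
        running_total += w
        totals.append(running_total)
    rnd = random.random() * running_total
    for i, total in enumerate(totals):
        if rnd < total:
            return i

def get_item(item_weights):
    """Pick a key from a dict of item -> weight, skipping weights <= 0."""
    candidates = [(item, w) for item, w in item_weights.items() if w > 0]
    if not candidates:
        raise ValueError("no item with a positive weight")
    items, weights = zip(*candidates)
    return items[weighted_choice(weights)]

# Hypothetical sample data, not taken from the question.
sample = {"rare": 10, "common": 1500, "removed": -100}
print(get_item(sample))
```

With these weights, "common" should come back in the vast majority of calls, and "removed" never.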



Answer 2:

Python 3.6 introduced random.choices():

import random

def get_item(items, items_weights):
    # choices() returns a list of k items (k=1 by default), so take the first.
    return random.choices(items, weights=items_weights)[0]
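For the dictionary described in the question, one possible adaptation is sketched below. It assumes that items with a weight of zero or less should simply be skipped; the sample data and the Counter-based tally are only there for illustration.

```python
import random
from collections import Counter

def get_item(item_weights):
    """Pick one key from a dict of item -> weight, ignoring weights <= 0."""
    items = [item for item, w in item_weights.items() if w > 0]
    weights = [item_weights[item] for item in items]
    return random.choices(items, weights=weights)[0]

# Hypothetical sample data, not taken from the question.
sample = {"low": 10, "high": 1500, "negative": -100}

# Draw many times to see the bias toward the heavier item.
counts = Counter(get_item(sample) for _ in range(10_000))
print(counts)
```

random.choices() also accepts a k argument to draw several items in one call (with replacement), which avoids rebuilding the lists for every draw.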


Answer 3:

You should draw a random number between 0 and the sum of the weights (which is positive by definition). Then you get the item from a list by using bisect: http://docs.python.org/library/bisect.html (the bisect standard module).

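import random
import bisect

weight = {'a': 0.3, 'b': 3.2, 'c': 2.4}
items = list(weight)  # a list, so it can be indexed (dict views cannot)
mysum = 0
breakpoints = []  # running totals of the weights, in the same order as items
for i in items:
    mysum += weight[i]
    breakpoints.append(mysum)

def getitem(breakpoints, items):
    # Scale a uniform number in [0, 1) to [0, total weight).
    score = random.random() * breakpoints[-1]
    # Index of the first breakpoint strictly greater than score.
    i = bisect.bisect(breakpoints, score)
    return items[i]

print(getitem(breakpoints, items))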
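With about 10,000 items and many draws, it may be worth building the cumulative weights once and reusing them. The following is a minimal sketch of that idea, assuming non-positive weights are dropped; the class name WeightedSampler and its draw method are hypothetical, not part of any library.

```python
import bisect
import itertools
import random

class WeightedSampler:
    """Precompute cumulative weights once, then binary-search on each draw."""

    def __init__(self, item_weights):
        # Keep only items with a positive weight.
        pairs = [(item, w) for item, w in item_weights.items() if w > 0]
        if not pairs:
            raise ValueError("no item with a positive weight")
        self.items = [item for item, _ in pairs]
        # Running totals, e.g. weights 1, 2, 3 become 1, 3, 6.
        self.breakpoints = list(itertools.accumulate(w for _, w in pairs))

    def draw(self):
        score = random.random() * self.breakpoints[-1]
        # bisect_right gives the first index whose breakpoint exceeds score.
        i = bisect.bisect_right(self.breakpoints, score)
        return self.items[i]

sampler = WeightedSampler({"a": 0.3, "b": 3.2, "c": 2.4})
print(sampler.draw())
```

Building the sampler is linear in the number of items, and each draw afterwards is a binary search, so repeated draws stay cheap as long as the weights do not change.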


Answer 4:

This is easier if the weights are non-negative. If you must keep negative weights, offset every weight by the lowest possible weight. In your case, offsetted_weight = itemweight + 100 (an item at -100 then ends up with weight 0 and is never chosen).

In pseudocode, it goes like this:

Calculate the sum of all the weights.
Pick a random number from 0 (inclusive) to the sum of the weights (exclusive)
Set i to 0
Repeat:
    Subtract the weight of the item at index i from the random number
    If the random number is < 0, return item[i]
    Add 1 to i
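The pseudocode could be translated to Python roughly as follows. This is only a sketch: the function name pick_by_subtraction and the sample data are made up for illustration, and it assumes non-negative weights with a positive sum and a last item whose weight is positive.

```python
import random

def pick_by_subtraction(items, weights):
    """Linear scan: subtract weights from a random number until it goes negative."""
    remaining = random.random() * sum(weights)
    for item, weight in zip(items, weights):
        remaining -= weight
        if remaining < 0:
            return item
    # Defensive fallback in case rounding in the repeated subtraction
    # leaves a tiny non-negative remainder.
    return items[-1]

# Hypothetical data; the +100 offset follows the suggestion above.
items = ["a", "b", "c"]
raw_weights = [-100, 10, 1500]
shifted = [w + 100 for w in raw_weights]  # [0, 110, 1600]
print(pick_by_subtraction(items, shifted))
```

This variant needs no extra list of running totals, at the cost of a linear scan on every draw.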


Answer 5:

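If you're storing your data in a database, you can use SQL. Note that this favors rows with higher weights, but the chance of a row being selected is not exactly proportional to its weight: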

SELECT * FROM table ORDER BY weight*random() DESC LIMIT 1
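To get a feel for how this ordering behaves, the same idea can be simulated in Python. The sketch below assumes random() yields a uniform value in [0, 1); the helper name pick_by_scaled_key and the two weights are hypothetical. For weights 1 and 2, the heavier item wins when 2*u2 > u1, which happens with probability 3/4, whereas a strictly proportional choice would give 2/3.

```python
import random

def pick_by_scaled_key(item_weights, rng):
    """Mimic ORDER BY weight*random() DESC LIMIT 1 for an in-memory dict."""
    return max(item_weights, key=lambda item: item_weights[item] * rng.random())

rng = random.Random(0)  # fixed seed so runs are repeatable
sample = {"light": 1, "heavy": 2}  # hypothetical weights
trials = 100_000
heavy_wins = sum(pick_by_scaled_key(sample, rng) == "heavy" for _ in range(trials))
heavy_share = heavy_wins / trials
print(heavy_share)
```

The printed share should land close to 0.75 rather than 0.67, so this query is best treated as a bias toward higher weights, not an exact weighted draw.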


Tags: python, random