可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Sorry about badly phrased title....

I have an object called NGram

class NGram
{
     //other properties
     double Probability {get; set;} //Value between 1 and 0 
}

Now suppose I have a list of these objects such that...

List<NGrams> grams = GetNGrams();
Debug.Assert(grams.Sum(x => x.Probability) == 1);

How can I select a random item from this list while factoring in the probability distribution.

For instance, suppose grams[0].Probability == 0.5 then there should be a 50% chance of selecting grams[0]

I figured I may need something like rand.NextDouble() but I am at loss.

回答1:

Here's a more generic way to do it (meaning you don't need to assert that the probabilities add to 1):

static Random rand = new Random();

public NGram GetRandom(IEnumerable<NGram> pool)
{
     // get universal probability 
     double u = pool.Sum (p => p.Probability);

     // pick a random number between 0 and u
     double r = rand.NextDouble() * u;

     double sum = 0;
     foreach(NGram n in pool)
     {
         // loop until the random number is less than our cumulative probability
         if(r <= (sum = sum + n.Probability))
         {
            return n;
         }
     }
     // should never get here
     return null;
}

回答2:

Sort the list, order by Probability, ascending.

Sum the Probability field for all elements in the list. Let's call that sum P.

Get a random number between [0, P], let's call it r

Iterate the list while keeping an accumulate value of the sum of Probability up to the current element you are iterating (pe). End the search when find the first element for which pe >= r

The case of all elements in the array to sum 1 is now just an special case :)

回答3:

Try this:

List<NGram> grams = new List<NGram>()
{
    new NGram() { Probability = 0.5 },
    new NGram() { Probability = 0.35 },
    new NGram() { Probability = 0.15 }
};

var rnd = new Random();

var result =
    grams
        .Aggregate(
            new { sum = 0.0, target = rnd.NextDouble(), gram = (NGram)null },
            (a, g) =>
                a.gram == null && a.sum + g.Probability >= a.target
                    ? new { sum = a.sum + g.Probability, a.target, gram = g }
                    : new { sum = a.sum + g.Probability, a.target, a.gram });

It gives me results like this:

回答4:

In pseudo code

r = Get a random number between 0 and 1
sum = 0
i = 0
Loop  
    sum = sum + grams[i].Probability  
    If sum >= r Then  
        Exit Loop
    End
    i = i + 1  
End
i is the index of the random item in the list

The idea is to sum up the item's probability until the sum is greater or equal than a random number. Since the probabilities sum up to 1 and the random number is in the range 0 .. 1, you will find an item in any case. Items with a greater probability are more likely to be chosen.

∑P= 0 0.08     0.3 0.43 0.53          0.88  1
    +--+--------+----+---+-------------+----+
    |  |        |    |   |             |    |
    +--+--------+----+---+-------------+----+ 
i =  0      1      2   3       4         5

You can imagine each item to be of a length equal to its assigned probability. The algorithm is like randomly throwing a dart at a ruler of length one, with all the probabilities stacked up along the ruler. The probability for an item to be hit is proportional to its size (i.e. its assigned probability).