Emulate Python's random.choice in .NET

2019-06-17 03:45发布

问题:

Python's module 'random' has a function random.choice

random.choice(seq)
Return a random element from the non-empty sequence seq. If seq is empty, raises IndexError.

How can I emulate this in .NET ?

public T RandomChoice<T> (IEnumerable<T> source)

Edit: I heard this as an interview question some years ago, but today the problem occurred naturally in my work. The interview question was stated with constraints

  • 'the sequence is too long to save to memory'
  • 'you can only loop over the sequence once'
  • 'the sequence doesn't have a length/count method' (à la .NET IEnumerable)

回答1:

To make a method that iterates the source only once, and doesn't have to allocate memory to store it temporarily, you count how many items you have iterated, and determine the probability that the current item should be the result:

public T RandomChoice<T> (IEnumerable<T> source) {
  Random rnd = new Random();
  T result = default(T);
  int cnt = 0;
  foreach (T item in source) {
    cnt++;
    if (rnd.Next(cnt) == 0) {
      result = item;
    }
  }
  return result;
}

When you are at the first item, the probability is 1/1 that it should be used (as that is the only item that you have seen this far). When you are at the second item, the probability is 1/2 that it should replace the first item, and so on.


This will naturally use a bit more CPU, as it creates one random number per item, not just a single random number to select an item, as dasblinkenlight pointed out. You can check if the source implements IList<T>, as Dan Tao suggested, and use an implementation that uses the capabilities to get the length of the collection and access items by index:

public T RandomChoice<T> (IEnumerable<T> source) {
  IList<T> list = source as IList<T>;
  if (list != null) {
    // use list.Count and list[] to pick an item by random
  } else {
    // use implementation above
  }
}

Note: You should consider sending the Random instance into the method. Otherwise you will get the same random seed if you call the method two times too close in time, as the seed is created from the current time.


The result of a test run, picking one number from an array containing 0 - 9, 1000000 times, to show that the distribution of the chosen numbers is not skewed:

0: 100278
1: 99519
2: 99994
3: 100327
4: 99571
5: 99731
6: 100031
7: 100429
8: 99482
9: 100638


回答2:

To avoid iterating through the sequence two times (once for the count and once for the element) it is probably a good idea to save your sequence in an array before getting its random element:

public static class RandomExt {
    private static Random rnd = new Random();
    public static T RandomChoice<T> (this IEnumerable<T> source) {
        var arr = source.ToArray();   
        return arr[rnd.Next(arr.Length)];
    }
    public static T RandomChoice<T> (this ICollection<T> source) {
        return source[rnd.Next(rnd.Count)];
    }
}

EDIT Implemented a very good idea by Chris Sinclair.



回答3:

        public T RandomChoice<T> (IEnumerable<T> source)
        {
            if (source == null)
            {
                throw new ArgumentNullException("source");
            }

            var list = source.ToList();

            if (list.Count < 1)
            {
                throw new MissingMemberException();
            }

            var rnd = new Random();
            return list[rnd.Next(0, list.Count)];
        }

or extension

    public static T RandomChoice<T> (this IEnumerable<T> source)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        var list = source.ToList();

        if (list.Count < 1)
        {
            throw new MissingMemberException();
        }

        var rnd = new Random();
        return list[rnd.Next(0, list.Count)];
    }


回答4:

private static Random rng = new Random();

...
return source.Skip(rng.next(source.Count())).Take(1);


回答5:

I'd go with dasblinkenlight's answer, with one small change: leverage the fact that source might already be an indexed collection, in which case you really don't need to populate a new array (or list):

public static class RandomExt
{
    public static T Choice<T>(this Random random, IEnumerable<T> sequence)
    {
        var list = sequence as IList<T> ?? sequence.ToList();
        return list[random.Next(list.Count)];
    }
}

Note that I also modified the interface from the abovementioned answer to make it more consistent with the Python version you referenced in your question:

var random = new Random();
var numbers = new int[] { 1, 2, 3 };
int randomNumber = random.Choice(numbers);

Edit: I like Guffa's answer even better, actually.



回答6:

Well, get a list of all elements in the sequence. ask a random number generator for the index, return elemnt by index. Define what Sequence is - IEnumerable would be most obvious, but you need to materialize that into a list then to know the number of elements for the random number generator. This is btw., not emulate, it is implement.

Is this some homework beginner study course question?



回答7:

Assuming one has an extension method IEnumerable.MinBy:

var r = new Random();
return source.MinBy(x=>r.Next())

The method MinBy doesn't save the sequence to memory, it works like IEnumerable.Min making one iteration (see MoreLinq or elsewhere )



标签: c# .net random