Python's module 'random' has a function random.choice
random.choice(seq)
Return a random element from the non-empty sequence seq. If seq
is empty, raises IndexError
.
How can I emulate this in .NET ?
public T RandomChoice<T> (IEnumerable<T> source)
Edit: I heard this as an interview question some years ago, but today the problem occurred naturally in my work. The interview question was stated with constraints
- 'the sequence is too long to save to memory'
- 'you can only loop over the sequence once'
- 'the sequence doesn't have a length/count method' (à la .NET IEnumerable)
To make a method that iterates the source only once, and doesn't have to allocate memory to store it temporarily, you count how many items you have iterated, and determine the probability that the current item should be the result:
public T RandomChoice<T> (IEnumerable<T> source) {
Random rnd = new Random();
T result = default(T);
int cnt = 0;
foreach (T item in source) {
cnt++;
if (rnd.Next(cnt) == 0) {
result = item;
}
}
return result;
}
When you are at the first item, the probability is 1/1 that it should be used (as that is the only item that you have seen this far). When you are at the second item, the probability is 1/2 that it should replace the first item, and so on.
This will naturally use a bit more CPU, as it creates one random number per item, not just a single random number to select an item, as dasblinkenlight pointed out. You can check if the source implements IList<T>
, as Dan Tao suggested, and use an implementation that uses the capabilities to get the length of the collection and access items by index:
public T RandomChoice<T> (IEnumerable<T> source) {
IList<T> list = source as IList<T>;
if (list != null) {
// use list.Count and list[] to pick an item by random
} else {
// use implementation above
}
}
Note: You should consider sending the Random
instance into the method. Otherwise you will get the same random seed if you call the method two times too close in time, as the seed is created from the current time.
The result of a test run, picking one number from an array containing 0 - 9, 1000000 times, to show that the distribution of the chosen numbers is not skewed:
0: 100278
1: 99519
2: 99994
3: 100327
4: 99571
5: 99731
6: 100031
7: 100429
8: 99482
9: 100638
To avoid iterating through the sequence two times (once for the count and once for the element) it is probably a good idea to save your sequence in an array before getting its random element:
public static class RandomExt {
private static Random rnd = new Random();
public static T RandomChoice<T> (this IEnumerable<T> source) {
var arr = source.ToArray();
return arr[rnd.Next(arr.Length)];
}
public static T RandomChoice<T> (this ICollection<T> source) {
return source[rnd.Next(rnd.Count)];
}
}
EDIT Implemented a very good idea by Chris Sinclair.
public T RandomChoice<T> (IEnumerable<T> source)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
var list = source.ToList();
if (list.Count < 1)
{
throw new MissingMemberException();
}
var rnd = new Random();
return list[rnd.Next(0, list.Count)];
}
or extension
public static T RandomChoice<T> (this IEnumerable<T> source)
{
if (source == null)
{
throw new ArgumentNullException("source");
}
var list = source.ToList();
if (list.Count < 1)
{
throw new MissingMemberException();
}
var rnd = new Random();
return list[rnd.Next(0, list.Count)];
}
private static Random rng = new Random();
...
return source.Skip(rng.next(source.Count())).Take(1);
I'd go with dasblinkenlight's answer, with one small change: leverage the fact that source
might already be an indexed collection, in which case you really don't need to populate a new array (or list):
public static class RandomExt
{
public static T Choice<T>(this Random random, IEnumerable<T> sequence)
{
var list = sequence as IList<T> ?? sequence.ToList();
return list[random.Next(list.Count)];
}
}
Note that I also modified the interface from the abovementioned answer to make it more consistent with the Python version you referenced in your question:
var random = new Random();
var numbers = new int[] { 1, 2, 3 };
int randomNumber = random.Choice(numbers);
Edit: I like Guffa's answer even better, actually.
Well, get a list of all elements in the sequence. ask a random number generator for the index, return elemnt by index. Define what Sequence is - IEnumerable would be most obvious, but you need to materialize that into a list then to know the number of elements for the random number generator.
This is btw., not emulate, it is implement.
Is this some homework beginner study course question?
Assuming one has an extension method IEnumerable.MinBy
:
var r = new Random();
return source.MinBy(x=>r.Next())
The method MinBy
doesn't save the sequence to memory, it works like IEnumerable.Min
making one iteration (see MoreLinq or elsewhere )