I need a quick algorithm to select 5 random elements from a generic list. For example, I'd like to get 5 random elements from a List<string>
.
相关问题
- Sorting 3 numbers without branching [closed]
- Graphics.DrawImage() - Throws out of memory except
- Why am I getting UnauthorizedAccessException on th
- 求获取指定qq 资料的方法
- How to know full paths to DLL's from .csproj f
Was thinking about comment by @JohnShedletsky on the accepted answer regarding (paraphrase):
Basically, you should be able to generate
subset
random indices and then pluck them from the original list.The Method
If you wanted to be even more efficient, you would probably use a
HashSet
of the indices, not the actual list elements (in case you've got complex types or expensive comparisons);The Unit Test
And to make sure we don't have any collisions, etc.
I combined several of the above answers to create a Lazily-evaluated extension method. My testing showed that Kyle's approach (Order(N)) is many times slower than drzaus' use of a set to propose the random indices to choose (Order(K)). The former performs many more calls to the random number generator, plus iterates more times over the items.
The goals of my implementation were:
1) Do not realize the full list if given an IEnumerable that is not an IList. If I am given a sequence of a zillion items, I do not want to run out of memory. Use Kyle's approach for an on-line solution.
2) If I can tell that it is an IList, use drzaus' approach, with a twist. If K is more than half of N, I risk thrashing as I choose many random indices again and again and have to skip them. Thus I compose a list of the indices to NOT keep.
3) I guarantee that the items will be returned in the same order that they were encountered. Kyle's algorithm required no alteration. drzaus' algorithm required that I not emit items in the order that the random indices are chosen. I gather all the indices into a SortedSet, then emit items in sorted index order.
4) If K is large compared to N and I invert the sense of the set, then I enumerate all items and test if the index is not in the set. This means that I lose the Order(K) run time, but since K is close to N in these cases, I do not lose much.
Here is the code:
I use a specialized random number generator, but you can just use C#'s Random if you want. (FastRandom was written by Colin Green and is part of SharpNEAT. It has a period of 2^128-1 which is better than many RNGs.)
Here are the unit tests:
Extending from @ers's answer, if one is worried about possible different implementations of OrderBy, this should be safe: