I need a quick algorithm to select 5 random elements from a generic list. For example, I'd like to get 5 random elements from a List<string>
.
相关问题
- Sorting 3 numbers without branching [closed]
- Graphics.DrawImage() - Throws out of memory except
- Why am I getting UnauthorizedAccessException on th
- 求获取指定qq 资料的方法
- How to know full paths to DLL's from .csproj f
The simple solution I use (probably not good for large lists): Copy the list into temporary list, then in loop randomly select Item from temp list and put it in selected items list while removing it form temp list (so it can't be reselected).
Example:
why not something like this:
Iterate through and for each element make the probability of selection = (number needed)/(number left)
So if you had 40 items, the first would have a 5/40 chance of being selected. If it is, the next has a 4/39 chance, otherwise it has a 5/39 chance. By the time you get to the end you will have your 5 items, and often you'll have all of them before that.
I just ran into this problem, and some more google searching brought me to the problem of randomly shuffling a list: http://en.wikipedia.org/wiki/Fisher-Yates_shuffle
To completely randomly shuffle your list (in place) you do this:
To shuffle an array a of n elements (indices 0..n-1):
If you only need the first 5 elements, then instead of running i all the way from n-1 to 1, you only need to run it to n-5 (ie: n-5)
Lets say you need k items,
This becomes:
Each item that is selected is swapped toward the end of the array, so the k elements selected are the last k elements of the array.
This takes time O(k), where k is the number of randomly selected elements you need.
Further, if you don't want to modify your initial list, you can write down all your swaps in a temporary list, reverse that list, and apply them again, thus performing the inverse set of swaps and returning you your initial list without changing the O(k) running time.
Finally, for the real stickler, if (n == k), you should stop at 1, not n-k, as the randomly chosen integer will always be 0.
Based on Kyle's answer, here's my c# implementation.
Here you have one implementation based on Fisher-Yates Shuffle whose algorithm complexity is O(n) where n is the subset or sample size, instead of the list size, as John Shedletsky pointed out.