Random generates number 1 more than 90% of times i

2019-06-14 21:59发布

This question already has an answer here:

Consider the following program:

public class Program
{
     private static Random _rnd = new Random();
     private static readonly int ITERATIONS = 5000000;
     private static readonly int RANDOM_MAX = 101;

     public static void Main(string[] args)
     {
          ConcurrentDictionary<int,int> dic = new ConcurrentDictionary<int,int>();

          Parallel.For(0, ITERATIONS, _ => dic.AddOrUpdate(_rnd.Next(1, RANDOM_MAX), 1, (k, v) => v + 1));

          foreach(var kv in dic)
             Console.WriteLine("{0} -> {1:0.00}%", kv.Key, ((double)kv.Value / ITERATIONS) * 100);
     }
}

This will print the following output:

(Note that the output will vary on each execution)

> 1 -> 97,38%
> 2 -> 0,03%
> 3 -> 0,03%
> 4 -> 0,03%
...
> 99 -> 0,03%
> 100 -> 0,03%

Why is the number 1 generated with such frequency?

4条回答
聊天终结者
2楼-- · 2019-06-14 22:33

Well, Random class is not thread safe one, the easiest way out is to make it thread local (each thread has its own Random instance):

private static ThreadLocal<Random> m_rnd = new ThreadLocal<Random>(() => new Random());

private static Random _rnd {
  get {
    return m_rnd.Value;
  }
}

https://msdn.microsoft.com/en-us/library/system.random(v=vs.110).aspx#ThreadSafety

查看更多
We Are One
3楼-- · 2019-06-14 22:42

Random is not thread-safe - you must not use the same Random instance from multiple threads without synchronization.

Why are you getting 1 in particular? Well, the way Random works (in 4.5.2) is by keeping a seed array as well as two indexers. When you use it from multiple threads concurrently, your seed array is going to get all messed up, and you'll almost always get identical values in multiple slots. The basic operation does something like seed[a] - seed[b], and when those values are the same, you get zero back. Since you asked for 1 as a minimum, this zero is shifted to one - and there's your anomaly. This happens very quickly in a multi-core environment, since there's quite a lot of inter-dependent state that's updated on each Next call.

There's many ways to solve this problem. One is to synchronize access to a common Random instance - it only makes sense if you're doing relatively few randoms, though, in which case you wouldn't use Parallel anyway. If performance is a concern, you'll either need to add some form of pre-fetching (e.g. preparing the random numbers in batches, per-thread or using some concurrent queue), or use some other method.

Another way is to keep a separate Random instance for each thread. This requires you to carefuly select a seed for each of the instances, though, otherwise your random numbers may end up being quite non-random. The approach used in .NET itself (again, using 4.5.2 code for reference) is to use Thread.CurrentThread.ManagedThreadId as the seed, which works quite well. Another common approach is to use a single global (synchronized) Random instance to initialize the seeds of the other Randoms, but depending on your requirements, you may need to ensure no duplicate seeds are produced.

Of course, you could also use some other random number generator. However, pseudo-random generators will usually require the same approaches as Random - they heavily depend on their state; that's what makes them pseudo-random in the first place. A cryptographic generator may work better, but those tend to be very slow, and may fallback to the synchronized approach anyway, especially if there's no hardware support.

In some cases, it makes sense to distribute the generation work according to some reasonable rules. For example, if you use pseudo-random procedural generation for in-game assets, it may make sense to make explicit rules to how different generators are seeded, in a repeatable manner - of course, this also means you can't really use Parallel either, and you'll have to be a bit more explicit.

查看更多
何必那么认真
4楼-- · 2019-06-14 22:44

Random is not thread safe.

Next does nothing special at all to ensure thread safety.

Don't use Random like this. And don't consider using thread local storage duration either, else you will mess up the generator's statistical properties: You must only use one Random instance. One approach will be to use lock(_global) and draw a number in that locked region.

I think what's happening here is that the first thread to reach the generator gets its random numbers generated correctly, and all subsequent threads receive 0 for each drawing. With a "parallelisation" thread pool of 32 threads, the ratios you cite above are approximately attainted; assuming that the results for the 31 threads are placed in the first bucket.

查看更多
Emotional °昔
5楼-- · 2019-06-14 22:53

Going one step further from the thread local storage solution, and trying to avoid the statistical issue, I suggest to use a random seed generated from RNGCryptoServiceProvider:

using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication1
{
    class Program
    {

        private static readonly int ITERATIONS = 5000000;
        private static readonly int RANDOM_MAX = 101;

        private static int GetCriptoRandom()
        {
            using (var rng = new System.Security.Cryptography.RNGCryptoServiceProvider())
            {
                byte[] bytes = new byte[4];
                rng.GetBytes(bytes);
                return BitConverter.ToInt32(bytes, 0);
            }
        }

        private static ThreadLocal<Random> m_rnd = new ThreadLocal<Random>(() => new Random(GetCryptoRandom()));

        private static Random _rnd
        {
            get
            {
                return m_rnd.Value;
            }
        }

        static void Main(string[] args)
        {
            ConcurrentDictionary<int, int> dic = new ConcurrentDictionary<int, int>();
            Parallel.For(1, ITERATIONS, _ => dic.AddOrUpdate(_rnd.Next(1, RANDOM_MAX), 1, (k, v) => v + 1));
            foreach (var kv in dic)
                Console.WriteLine("{0} -> {1:0.00}%", kv.Key, ((double)kv.Value / ITERATIONS) * 100);

        }
    }
}

It seems statistically correct, results range from 0.99% to 1.01%.

查看更多
登录 后发表回答