RNGCryptoServiceProvider fail chi-square test on l

Posted 2019-03-31 19:21

Question:

Does anyone know why RNGCryptoServiceProvider fails a chi-square test when generating numbers bigger than 300,000,000?

I tried to generate random numbers in the range 0-1,000,000,000, and the results failed the chi-square test: numbers in the range 0-300,000,000 appeared more often than the other numbers.

Eventually I built the big number from two smaller numbers ((0-99) * 100M + (0-99,999,999)), and the chi-square test passed.

Can anyone explain this anomaly with big numbers?

I used the following code to get the numbers:

    [Timeout(TestTimeout.Infinite), TestMethod]
    public void TestMethodStatistic()
    {
        Dictionary<long, long> appearances = new Dictionary<long, long>();
        UInt64 tenBillion = 10000000000;

        for (UInt64 i = 0; i < 10000000; i++)
        {
            UInt64 random = GetSIngleRandomNumberInternal() % tenBillion;
            UInt64 bucket = random / 10000000;

            if (!appearances.ContainsKey(Convert.ToInt64(bucket)))
            {
                appearances.Add(Convert.ToInt64(bucket), 0);
            }
            appearances[Convert.ToInt64(bucket)]++;
        }
        string results = "\nBucket Id\tcount\n";
        foreach (var appearance in appearances)
        {
            results += appearance.Key+"\t"+ appearance.Value +"\n";
        }
        File.AppendAllText(@"C:\Result.txt",results);
    }

    private RNGCryptoServiceProvider rngCsp = new RNGCryptoServiceProvider();

    private UInt64 GetSIngleRandomNumberInternal()
    {
        byte[] randomNumBytes = new byte[sizeof(UInt64)];
        rngCsp.GetBytes(randomNumBytes);

        return BitConverter.ToUInt64(randomNumBytes, 0);
    }

Take the Result.txt file and copy its contents into Excel. Make it a table and add two columns: the first is the expected result, with the value 100000, and the second is the chi-square test, with the formula "=CHISQ.TEST([count],[[expected ]])".

When the value of the chi-square test is less than 0.1, we have a problem.
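For readers who want to check the bucket counts without Excel, the quantity behind CHISQ.TEST is Pearson's chi-square statistic. The following is a minimal sketch of that statistic only; the `ChiSquare` class and `Statistic` method are hypothetical names, not part of the original test, and converting the statistic into the p-value that CHISQ.TEST reports needs a chi-square distribution function that is not shown here.

```csharp
using System;
using System.Collections.Generic;

public static class ChiSquare
{
    // Pearson's statistic: the sum over all buckets of
    // (observed - expected)^2 / expected.
    // Assumes every bucket has the same expected count, as in the test above.
    public static double Statistic(IEnumerable<long> observedCounts, double expected)
    {
        if (expected <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(expected));
        }

        double sum = 0.0;
        foreach (long observed in observedCounts)
        {
            double diff = observed - expected;
            sum += diff * diff / expected;
        }
        return sum;
    }
}
```

With the `appearances` dictionary from the test method, this could be called as `ChiSquare.Statistic(appearances.Values, 100000)`. A bucket that never occurred is absent from the dictionary, so it would have to be added with a count of 0 first.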

Answer 1:

Most likely the problem is that you're introducing a bias when you use the remainder technique, that is, reducing the raw random value with the % operator. See "How much bias is introduced by the remainder technique?" for an explanation.
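To make the bias concrete: a 32-bit source has 4,294,967,296 possible values, and 4,294,967,296 % 1,000,000,000 = 294,967,296. Results from 0 to 294,967,295 are therefore each produced by 5 raw values, while every other result is produced by only 4. That matches the reported excess below roughly 300,000,000. This assumes a 32-bit raw value, though. The code shown in the question reads 64 bits, where the same effect exists but is far too small to show up in 10 million samples. The sketch below counts the 32-bit preimages and shows one common way to remove the bias, rejection sampling. `UnbiasedRandom`, `PreimageCount32` and `NextBelow` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Security.Cryptography;

public static class UnbiasedRandom
{
    // How many of the 2^32 possible 32-bit values v satisfy v % range == residue.
    public static ulong PreimageCount32(ulong range, ulong residue)
    {
        const ulong span = 1UL << 32;   // 4,294,967,296 possible raw values
        ulong full = span / range;      // every residue gets at least this many
        ulong extra = span % range;     // the first 'extra' residues get one more
        return full + (residue < extra ? 1UL : 0UL);
    }

    // Rejection sampling over 64-bit raw values: discard the few raw values
    // that would give some residues one preimage more than the others.
    public static ulong NextBelow(RandomNumberGenerator rng, ulong range)
    {
        if (range == 0)
        {
            throw new ArgumentOutOfRangeException(nameof(range));
        }

        // (2^64 - range) % range, which equals 2^64 % range.
        // Raw values below this threshold are rejected, so the number of
        // accepted values is an exact multiple of range.
        ulong threshold = unchecked(0UL - range) % range;

        byte[] buffer = new byte[sizeof(ulong)];
        while (true)
        {
            rng.GetBytes(buffer);
            ulong value = BitConverter.ToUInt64(buffer, 0);
            if (value >= threshold)
            {
                return value % range;
            }
        }
    }
}
```

For example, `UnbiasedRandom.NextBelow(rngCsp, 1000000000)` would return a value in the range 0 to 999,999,999 using the `rngCsp` field from the question. The loop rejects fewer than `range` of the 2^64 possible raw values, so for ranges like this it almost always finishes on the first iteration.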