C# Prime Generator, Maxxing out Bit Array

2019-04-15 23:21发布

问题:

(C#, prime generator) Heres some code a friend and I were poking around on:

public List<int> GetListToTop(int top)
{            
    top++;
    List<int> result = new List<int>();
    BitArray primes = new BitArray(top / 2);
    int root = (int)Math.Sqrt(top);
    for (int i = 3, count = 3; i <= root; i += 2, count++)
    {
        int n = i - count;
        if (!primes[n])
            for (int j = n + i; j < top / 2; j += i)
            {
                primes[j] = true;
            }
    }
    if (top >= 2)
        result.Add(2);            
    for (int i = 0, count = 3; i < primes.Length; i++, count++)
    {
        if (!primes[i])
        {
            int n = i + count;
            result.Add(n);
        }
    }

    return result;
}

On my dorky AMD x64 1800+ (dual core), for all primes below 1 billion in 34546.875ms. Problem seems to be storing more in the bit array. Trying to crank more than ~2billion is more than the bitarray wants to store. Any ideas on how to get around that?

回答1:

I would "swap" parts of the array out to disk. By that, I mean, divide your bit array into half-billion bit chunks and store them on disk.

The have only a few chunks in memory at any one time. With C# (or any other OO language), it should be easy to encapsulate the huge array inside this chunking class.

You'll pay for it with a slower generation time but I don't see any way around that until we get larger address spaces and 128-bit compilers.



回答2:

Or as an alternative approach to the one suggested by Pax, make use of the new Memory-Mapped File classes in .NET 4.0 and let the OS decide which chunks need to be in memory at any given time.

Note however that you'll want to try and optimise the algorithm to increase locality so that you do not needlessly end up swapping pages in and out of memory (trickier than this one sentence makes it sound).



回答3:

Use multiple BitArrays to increase the maximum size. If a number is to great bit-shift it and store the result in a bit-array for storing bits 33-64.

BitArray second = new BitArray(int.MaxValue);
long num = 23958923589;
if (num > int.MaxValue)
{
    int shifted = (int)num >> 32;
    second[shifted] = true;
}

long request = 0902305023;
if (request > int.MaxValue)
{
    int shifted = (int)request >> 32;
    return second[shifted];
}
else return first[request];

Of course it would be nice if BitArray would support size up to System.Numerics.BigInteger.
Swapping to disk will make your code really slow.
I have a 64-bit OS, and my BitArray is also limited to 32-bits.

PS: your prime number calculations looks wierd, mine looks like this:

for (int i = 2; i <= number; i++)
    if (primes[i])
        for (int scalar = i + i; scalar <= number; scalar += i)
        {
            primes[scalar] = false;
            yield return scalar;
        }


回答4:

The Sieve algorithm would be better performing. I could determine all the 32-bit primes (total about 105 million) for the int range in less than 4 minutes with that. Of course returning the list of primes is a different thing as the memory requirement there would be a little over 400 MB (1 int = 4 bytes). Using a for loop the numbers were printed to a file and then imported to a DB for more fun :) However for the 64 bit primes the program would need several modifications and perhaps require distributed execution over multiple nodes. Also refer to the following links

http://www.troubleshooters.com/codecorn/primenumbers/primenumbers.htm

http://en.wikipedia.org/wiki/Prime-counting_function