Fastest way to calculate primes in C#?

2019-03-13 06:44发布

I actually have an answer to my question but it is not parallelized so I am interested in ways to improve the algorithm. Anyway it might be useful as-is for some people.

int Until = 20000000;
BitArray PrimeBits = new BitArray(Until, true);

/*
 * Sieve of Eratosthenes
 * PrimeBits is a simple BitArray where all bit is an integer
 * and we mark composite numbers as false
 */

PrimeBits.Set(0, false); // You don't actually need this, just
PrimeBits.Set(1, false); // remindig you that 2 is the smallest prime

for (int P = 2; P < (int)Math.Sqrt(Until) + 1; P++)
    if (PrimeBits.Get(P))
        // These are going to be the multiples of P if it is a prime
        for (int PMultiply = P * 2; PMultiply < Until; PMultiply += P)
            PrimeBits.Set(PMultiply, false);

// We use this to store the actual prime numbers
List<int> Primes = new List<int>();

for (int i = 2; i < Until; i++)
    if (PrimeBits.Get(i))
        Primes.Add(i);

Maybe I could use multiple BitArrays and BitArray.And() them together?

8条回答
ゆ 、 Hurt°
2楼-- · 2019-03-13 07:15

Parallelisation aside, you don't want to be calculating sqrt(Until) on every iteration. You also can assume multiples of 2, 3 and 5 and only calculate for N%6 in {1,5} or N%30 in {1,7,11,13,17,19,23,29}.

You should be able to parallelize the factoring algorithm quite easily, since the Nth stage only depends on the sqrt(n)th result, so after a while there won't be any conflicts. But that's not a good algorithm, since it requires lots of division.

You should also be able to parallelize the sieve algorithms, if you have writer work packets which are guaranteed to complete before a read. Mostly the writers shouldn't conflict with the reader - at least once you've done a few entries, they should be working at least N above the reader, so you only need a synchronized read fairly occasionally (when N exceeds the last synchronized read value). You shouldn't need to synchronize the bool array across any number of writer threads, since write conflicts don't arise (at worst, more than one thread will write a true to the same place).

The main issue would be to ensure that any worker being waited on to write has completed. In C++ you'd use a compare-and-set to switch to the worker which is being waited for at any point. I'm not a C# wonk so don't know how to do it that language, but the Win32 InterlockedCompareExchange function should be available.

You also might try an actor based approach, since that way you can schedule the actors working with the lowest values, which may be easier to guarantee that you're reading valid parts of the the sieve without having to lock the bus on each increment of N.

Either way, you have to ensure that all workers have got above entry N before you read it, and the cost of doing that is where the trade-off between parallel and serial is made.

查看更多
Bombasti
3楼-- · 2019-03-13 07:15

Without profiling we cannot tell which bit of the program needs optimizing.

If you were in a large system, then one would use a profiler to find that the prime number generator is the part that needs optimizing.

Profiling a loop with a dozen or so instructions in it is not usually worth while - the overhead of the profiler is significant compared to the loop body, and about the only ways to improve a loop that small is to change the algorithm to do fewer iterations. So IME, once you've eliminated any expensive functions and have a known target of a few lines of simple code, you're better off changing the algorithm and timing an end-to-end run than trying to improve the code by instruction level profiling.

查看更多
登录 后发表回答