IE,
What am I doing wrong here? Does it have to to with lists, sequences and arrays and the way the limitations work?
So here is the setup: I'm trying to generate some primes. I see that there are a billion text files of a billion primes. The question isn't why...the question is how are the guys using python calculating all of the primes below 1,000,000 in milliseconds on this post...and what am I doing wrong with the following F# code?
let sieve_primes2 top_number =
let numbers = [ for i in 2 .. top_number do yield i ]
let sieve (n:int list) =
match n with
| [x] -> x,[]
| hd :: tl -> hd, List.choose(fun x -> if x%hd = 0 then None else Some(x)) tl
| _ -> failwith "Pernicious list error."
let rec sieve_prime (p:int list) (n:int list) =
match (sieve n) with
| i,[] -> i::p
| i,n' -> sieve_prime (i::p) n'
sieve_prime [1;0] numbers
With the timer on in FSI, I get 4.33 seconds worth of CPU for 100000... after that, it all just blows up.
There have been several good answers both as to general trial division algorithm using lists (@pad) and in choice of an array for a sieving data structure using the Sieve of Eratosthenes (SoE) (@John Palmer and @Jon Harrop). However, @pad's list algorithm isn't particularly fast and will "blow up" for larger sieving ranges and @John Palmer's array solution is somewhat more complex, uses more memory than necessary, and uses external mutable state so is not different than if the program were written in an imperative language such as C#.
EDIT_ADD: I've edited the below code (old code with line comments) modifying the sequence expression to avoid some function calls so as to reflect more of an "iterator style" and while it saved 20% of the speed it still doesn't come close to that of a true C# iterator which is about the same speed as the "roll your own enumerator" final F# code. I've modified the timing information below accordingly. END_EDIT
The following true SoE program only uses 64 KBytes of memory to sieve primes up to a million (due to only considering odd numbers and using the packed bit BitArray) and still is almost as fast as @John Palmer's program at about 40 milliseconds to sieve to one million on a i7 2700K (3.5 GHz), with only a few lines of code:
Almost all of the elapsed time is spent in the last two lines enumerating the found primes due to the inefficiency of the sequence run time library as well as the cost of enumerating itself at about 28 clock cycles per function call and return with about 16 function calls per iteration. This could be reduced to only a few function calls per iteration by rolling our own iterator, but the code is not as concise; note that in the following code there is no mutable state exposed other than the contents of the sieving array and the reference variable necessary for the iterator implementation using object expressions:
The above code takes about 8.5 milliseconds to sieve the primes to a million on the same machine due to greatly reducing the number of function calls per iteration to about three from about 16. This is about the same speed as C# code written in the same style. It's too bad that F#'s iterator style as I used in the first example doesn't automatically generate the IEnumerable boiler plate code as C# iterators do, but I guess that is the intention of sequences - just that they are so damned inefficient as to speed performance due to being implemented as sequence computation expressions.
Now, less than half of the time is expended in enumerating the prime results for a much better use of CPU time.
Your sieve function is slow because you tried to filter out composite numbers up to
top_number
. With Sieve of Eratosthenes, you only need to do so untilsqrt(top_number)
and remaining numbers are inherently prime. Suppose we havetop_number = 1,000,000
, your function does78498
rounds of filtering (the number of primes until1,000,000
) while the original sieve only does so168
times (the number of primes until1,000
).You can avoid generating even numbers except 2 which cannot be prime from the beginning. Moreover,
sieve
andsieve_prime
can be merged into a recursive function. And you could use lightweightList.filter
instead ofList.choose
.Incorporating above suggestions:
In my machine, the updated version is very fast and it completes within 0.6s for
top_number = 1,000,000
.You've implemented a different algorithm that goes through each possible value and uses
%
to determine if it needs to be removed. What you're supposed to be doing is stepping through with a fixed increment removing multiples. That would be asymptotically.You cannot step through lists efficiently because they don't support random access so use arrays.
Based on my code here: stackoverflow.com/a/8371684/124259
Gets the first 1 million primes in 22 milliseconds in fsi - a significant part is probably compiling the code at this point.