I am trying to write a simple sieve function to calculate prime numbers in clojure. I've seen this question about writing an efficient sieve function, but I am not to that point yet. Right now I am just trying to write a very simple (and slow) sieve. Here is what I have come up with:
(defn sieve [potentials primes]
(if-let [p (first potentials)]
(recur (filter #(not= (mod % p) 0) potentials) (conj primes p))
primes))
For small ranges it works fine, but causes a stack overflow for large ranges:
user=> (sieve (range 2 30) [])
[2 3 5 7 11 13 17 19 23 29]
user=> (sieve (range 2 15000) [])
java.lang.StackOverflowError (NO_SOURCE_FILE:0)
I thought that by using recur
this would be a non-stack-consuming looping construct? What am I missing?
Algorithmically the problem is that you continue filtering when there's no more purpose to it. Stopping as early as possible achieves quadratic reduction in recursion depth (
sqrt(n)
vs.n
):Runs OK for 16,000 (performing just 30 iterations instead of 1862), and for 160,000 too, on ideone. Even runs 5% faster without the
doall
.You're being hit by
filter
's laziness. Change(filter ...)
to(doall (filter ...))
in yourrecur
form and the problem should go away.A more in-depth explanation:
The call to
filter
returns a lazy seq, which materialises actual elements of the filtered seq as required. As written, your code stacksfilter
uponfilter
uponfilter
..., adding one more level offilter
ing at each iteration; at some point this blows up. The solution is to force the whole result at each iteration so that the next one will do its filtering on a fully realised seq and return a fully realised seq instead of adding an extra layer of lazy seq processing; that's whatdoall
does.