Recursive function causing a stack overflow

Posted 2019-01-03 03:29

I am trying to write a simple sieve function to calculate prime numbers in Clojure. I've seen this question about writing an efficient sieve function, but I am not at that point yet. Right now I am just trying to write a very simple (and slow) sieve. Here is what I have come up with:

(defn sieve [potentials primes]
  (if-let [p (first potentials)]
    (recur (filter #(not= (mod % p) 0) potentials) (conj primes p))
    primes))

It works fine for small ranges, but causes a stack overflow for large ones:

user=> (sieve (range 2 30) [])
[2 3 5 7 11 13 17 19 23 29]
user=> (sieve (range 2 15000) [])
java.lang.StackOverflowError (NO_SOURCE_FILE:0)

I thought that by using recur this would be a non-stack-consuming looping construct? What am I missing?

2 answers
太酷不给撩
#2 · 2019-01-03 03:50

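Algorithmically, the problem is that you keep filtering after it has stopped serving any purpose. Once p * p exceeds the largest remaining candidate, every remaining candidate is already prime. Stopping as early as possible achieves a quadratic reduction in recursion depth (sqrt(n) vs. n):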

(defn sieve [potentials primes]
  (if-let [p (first potentials)]
    (if (> (* p p) (last potentials))
      (concat primes potentials)
      (recur (filter (fn [n] (not= (mod n p) 0)) potentials)
             (conj primes p)))
    primes))

This runs OK for 16,000 (performing just 30 iterations instead of 1862), and for 160,000 too, on ideone. It even runs 5% faster without the doall suggested in the other answer.
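To check the iteration count for yourself, here is a rough sketch of the same function with a step counter threaded through. The name sieve-steps, the steps parameter and the returned map are my own additions for illustration; the filtering logic is unchanged.

```clojure
;; Hypothetical instrumented variant of the early-exit sieve above.
;; `steps` counts how many filtering passes were actually performed.
(defn sieve-steps [potentials primes steps]
  (if-let [p (first potentials)]
    (if (> (* p p) (last potentials))
      ;; Everything still left is prime, so stop filtering here.
      {:primes (concat primes potentials) :steps steps}
      (recur (filter (fn [n] (not= (mod n p) 0)) potentials)
             (conj primes p)
             (inc steps)))
    {:primes primes :steps steps}))

(:steps (sieve-steps (range 2 16000) [] 0))
;; => 30
```

Only 30 lazy filters are ever nested here, which is why the stack holds up even though nothing is forced with doall.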

欢心
#3 · 2019-01-03 03:59

You're being hit by filter's laziness. Change (filter ...) to (doall (filter ...)) in your recur form and the problem should go away.

A more in-depth explanation:

The call to filter returns a lazy seq, which materialises actual elements of the filtered seq as required. As written, your code stacks filter upon filter upon filter..., adding one more level of filtering at each iteration; at some point this blows up. The solution is to force the whole result at each iteration so that the next one will do its filtering on a fully realised seq and return a fully realised seq instead of adding an extra layer of lazy seq processing; that's what doall does.
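Applied to the sieve from the question, the change might look like the sketch below. The name sieve-eager is only there to keep it apart from the original; the one real difference is the doall wrapped around filter.

```clojure
;; Sketch of the fix: force each filtered seq before recurring, so that
;; every iteration works on a fully realised seq instead of stacking
;; another lazy filter on top of the previous ones.
(defn sieve-eager [potentials primes]
  (if-let [p (first potentials)]
    (recur (doall (filter #(not= (mod % p) 0) potentials))
           (conj primes p))
    primes))

(sieve-eager (range 2 30) [])
;; => [2 3 5 7 11 13 17 19 23 29]
```

With this version a larger range such as (range 2 16000) should complete without a StackOverflowError, although it still does one full pass over the remaining candidates per prime.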
