How to avoid stackoverflow in clojure recursive fu

2019-07-20 04:10发布

Here is an example:

;; Helper function for marking multiples of a number as 0
(def mark (fn [[x & xs] k m]
                 (if (= k m) 
                   (cons 0 (mark xs 1       m))
                   (cons x (mark xs (inc k) m))
                   )))

;; Sieve of Eratosthenes
(defn sieve
  [x & xs]
  (if (= x 0)
    (sieve xs)
    (cons x (sieve (mark xs 1 x)))
    ))

(take 10 (lazy-seq (sieve (iterate inc 2))))

It produces a StackOverflowError.

3条回答
狗以群分
2楼-- · 2019-07-20 04:52

There are a couple of issues here. First, as pointed out in the other answer, your mark and sieve functions don't have terminating conditions. It looks like they are designed to work with infinite sequences, but if you passed a finite-length sequence they'd keep going off the end.

The deeper problem here is that it looks like you're trying to have a function create a lazy infinite sequence by recursively calling itself. However, cons is not lazy in any way; it is a pure function call, so the recursive calls to mark and sieve are invoked immediately. Wrapping the outer-most call to sieve in lazy-seq only serves to defer the initial call; it does not make the entire sequence lazy. Instead, each call to cons must be wrapped in its own lazy sequence.

For instance:

(defn eager-iterate [f x]
  (cons x (eager-iterate f (f x))))

(take 3 (eager-iterate inc 0)) ; => StackOverflowError
(take 3 (lazy-seq (eager-iterate inc 0))) ; => Still a StackOverflowError

Compare this with the actual source code of iterate:

(defn iterate
  "Returns a lazy sequence of x, (f x), (f (f x)) etc. f must be free of side-effects"
  {:added "1.0"
   :static true}
  [f x] (cons x (lazy-seq (iterate f (f x)))))

Putting it together, here's an implementation of mark that works correctly for finite sequences and preserves laziness for infinite sequences. Fixing sieve is left as an exercise for the reader.

(defn mark [[x :as xs] k m]
  (lazy-seq
    (when (seq xs)
      (if (= k m)
        (cons 0 (mark (next xs) 1 m))
        (cons x (mark (next xs) (inc k) m))))))

(mark (range 4 14) 1 3)
; => (4 5 0 7 8 0 10 11 0 13)
(take 10 (mark (iterate inc 4) 1 3))
; => (4 5 0 7 8 0 10 11 0 13)
查看更多
贼婆χ
3楼-- · 2019-07-20 04:53

Thanks to @Alex's answer I managed to come up with a working lazy solution:

;; Helper function for marking mutiples of a number as 0
(defn mark [[x :as xs] k m]
  (lazy-seq
    (when-not (empty? xs)
      (if (= k m)
        (cons 0 (mark (rest xs) 1 m))
        (cons x (mark (rest xs) (inc k) m))))))


;; Sieve of Eratosthenes
(defn sieve
  [[x :as xs]]
  (lazy-seq
    (when-not (empty? xs)
      (if (= x 0)
        (sieve (rest xs))
        (cons x (sieve (mark (rest xs) 1 x)))))))

I was adviced by someone else to use rest instead of next.

查看更多
一纸荒年 Trace。
4楼-- · 2019-07-20 04:57

Need terminating conditions

The problem here is both your mark and sieve functions have no terminating conditions. There must be some set of inputs for which each function does not call itself, but returns an answer. Additionally, every set of (valid) inputs to these functions should eventually resolve to a non-recursive return value.

But even if you get it right...

I'll add that even if you succeed in creating the correct terminating conditions, there is still the possibility of having a stack overflow if the depth of the recursion in too large. This can be mitigated to some extent by increasing the JVM stack size, but this has it's limits.

A way around this for some functions is to use tail call optimization. Some recursive functions are tail recursive, meaning that all recursive calls to the function being defined within it's definition are in the tail call position (are the final function called in the definition body). For example, in your sieve function's (= x 0) case, sieve is the tail call, since the result of sieve doesn't get passed into any other function. However, in the case that (not (= x 0)), the result of calling sieve gets passed to cons, so this is not a tail call. When a function is fully tail recursive, it is possible to behind the scenes transform the function definition into a looping construct which avoids consuming the stack. In clojure this is possible by using recur in the function definition instead of the function name (there is also a loop construct which can sometimes be helpful). Again, because not all recursive functions are tail recursive, this isn't a panacea. But when they are it's good to know that you can do this.

查看更多
登录 后发表回答