Trampoline recursion stack overflow

2019-07-29 15:35发布

问题:

I have this recursive function sum which computes sum of all numbers that were passed to it.

function sum(num1, num2, ...nums) {
  if (nums.length === 0) { return num1 + num2; }
  return sum(num1 + num2, ...nums);
}

let xs = [];
for (let i = 0; i < 100; i++) { xs.push(i); }
console.log(sum(...xs));

xs = [];
for (let i = 0; i < 10000; i++) { xs.push(i); }
console.log(sum(...xs));

It works fine if only 'few' numbers are passed to it but overflows call stack otherwise. So I have tried to modify it a bit and use trampoline so that it can accept more arguments.

function _sum(num1, num2, ...nums) {
  if (nums.length === 0) { return num1 + num2; }
  return () => _sum(num1 + num2, ...nums);
}

const trampoline = fn => (...args) => {
  let res = fn(...args);
  while (typeof res === 'function') { res = res(); }
  return res;
}

const sum = trampoline(_sum);

let xs = [];
for (let i = 0; i < 10000; i++) { xs.push(i); }
console.log(sum(...xs));

xs = [];
for (let i = 0; i < 100000; i++) { xs.push(i); }
console.log(sum(...xs));

While the first version isn't able to handle 10000 numbers, the second is. But if I pass 100000 numbers to the second version I'm getting call stack overflow error again.

I would say that 100000 is not really that big of a number (might be wrong here) and don't see any runaway closures that might have caused the memory leak.

Does anyone know what is wrong with it?

回答1:

Browsers have practical limits on the number of arguments a function can take

You can change the sum signature to accept an array rather than a varying number of arguments, and use destructuring to keep the syntax/readability similar to what you have. This "fixes" the stackoverflow error, but is increadibly slow :D

function _sum([num1, num2, ...nums]) { /* ... */ }

I.e.:, if you're running in to problems with maximum argument counts, your recursive/trampoline approach is probably going to be too slow to work with...



回答2:

The other answer points out the limitation on number of function arguments, but I wanted to remark on your trampoline implementation. The long computation we're running may want to return a function. If you use typeof res === 'function', it's not longer possible to compute a function as a return value!

Instead, encode your trampoline variants with some sort of unique identifiers

const bounce = (f, ...args) =>
  ({ tag: bounce, f: f, args: args })

const done = (value) =>
  ({ tag: done, value: value })

const trampoline = t =>
{ while (t && t.tag === bounce)
    t = t.f (...t.args)
  if (t && t.tag === done)
    return t.value
  else
    throw Error (`unsupported trampoline type: ${t.tag}`)
}

Before we hop on, let's first get an example function to fix

const none =
  Symbol ()

const badsum = ([ n1, n2 = none, ...rest ]) =>
  n2 === none
    ? n1
    : badsum ([ n1 + n2, ...rest ])

We'll throw a range of numbers at it to see it work

const range = n =>
  Array.from
    ( Array (n + 1)
    , (_, n) => n
    )

console.log (range (10))
// [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]

console.log (badsum (range (10)))
// 55

But can it handle the big leagues?

console.log (badsum (range (1000)))
// 500500

console.log (badsum (range (20000)))
// RangeError: Maximum call stack size exceeded

See the results in your browser so far

const none =
  Symbol ()

const badsum = ([ n1, n2 = none, ...rest ]) =>
  n2 === none
    ? n1
    : badsum ([ n1 + n2, ...rest ])

const range = n =>
  Array.from
    ( Array (n + 1)
    , (_, n) => n
    )

console.log (range (10))
// [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]

console.log (badsum (range (1000)))
// 500500

console.log (badsum (range (20000)))
// RangeError: Maximum call stack size exceeded

Somewhere between 10000 and 20000 our badsum function unsurprisingly causes a stack overflow.

Besides renaming the function to goodsum we only have to encode the return types using our trampoline's variants

const goodsum = ([ n1, n2 = none, ...rest ]) =>
  n2 === none
    ? n1
    ? done (n1)
    : goodsum ([ n1 + n2, ...rest ])
    : bounce (goodsum, [ n1 + n2, ...rest ])

console.log (trampoline (goodsum (range (1000))))
// 500500

console.log (trampoline (goodsum (range (20000))))
// 200010000
// No more stack overflow!

You can see the results of this program in your browser here. Now we can see that neither recursion nor the trampoline are at fault for this program being slow. Don't worry though, we'll fix that later.

const bounce = (f, ...args) =>
  ({ tag: bounce, f: f, args: args })
  
const done = (value) =>
  ({ tag: done, value: value })
  
const trampoline = t =>
{ while (t && t.tag === bounce)
    t = t.f (...t.args)
  if (t && t.tag === done)
    return t.value
  else
    throw Error (`unsupported trampoline type: ${t.tag}`)
}

const none =
  Symbol ()

const range = n =>
  Array.from
    ( Array (n + 1)
    , (_, n) => n
    )

const goodsum = ([ n1, n2 = none, ...rest ]) =>
  n2 === none
    ? done (n1)
    : bounce (goodsum, [ n1 + n2, ...rest ])

console.log (trampoline (goodsum (range (1000))))
// 500500

console.log (trampoline (goodsum (range (20000))))
// 200010000
// No more stack overflow!

The extra call to trampoline can get annoying, and when you look at goodsum alone, it's not immediately apparent what done and bounce are doing there, unless maybe this was a very common convention in many of your programs.

We can better encode our looping intentions with a generic loop function. A loop is given a function that is recalled whenever the function calls recur. It looks like a recursive call, but really recur is constructing a value that loop handles in a stack-safe way.

The function we give to loop can have any number of parameters, and with default values. This is also convenient because we can now avoid the expensive ... destructuring and spreading by simply using an index parameter i initialized to 0. The caller of the function does not have the ability to access these variables outside of the loop call

The last advantage here is that the reader of goodsum can clearly see the loop encoding and the explicit done tag is no longer necessary. The user of the function does not need to worry about calling trampoline either as it's already taken care of for us in loop

const goodsum = (ns = []) =>
  loop ((sum = 0, i = 0) =>
    i >= ns.length
      ? sum
      : recur (sum + ns[i], i + 1))

console.log (goodsum (range (1000)))
// 500500

console.log (goodsum (range (20000)))
// 200010000

console.log (goodsum (range (999999)))
// 499999500000

Here's our loop and recur pair now. This time we expand upon our { tag: ... } convention using a tagging module

const recur = (...values) =>
  tag (recur, { values })

const loop = f =>
{ let acc = f ()
  while (is (recur, acc))
    acc = f (...acc.values)
  return acc
}

const T =
  Symbol ()

const tag = (t, x) =>
  Object.assign (x, { [T]: t })

const is = (t, x) =>
  t && x[T] === t

Run it in your browser to verify the results

const T =
  Symbol ()

const tag = (t, x) =>
  Object.assign (x, { [T]: t })

const is = (t, x) =>
  t && x[T] === t

const recur = (...values) =>
  tag (recur, { values })

const loop = f =>
{ let acc = f ()
  while (is (recur, acc))
    acc = f (...acc.values)
  return acc
}

const range = n =>
  Array.from
    ( Array (n + 1)
    , (_, n) => n
    )

const goodsum = (ns = []) =>
  loop ((sum = 0, i = 0) =>
    i >= ns.length
      ? sum
      : recur (sum + ns[i], i + 1))

console.log (goodsum (range (1000)))
// 500500

console.log (goodsum (range (20000)))
// 200010000

console.log (goodsum (range (999999)))
// 499999500000

extra

My brain has been stuck in anamorphism gear for a few months and I was curious if it was possible to implement a stack-safe unfold using the loop function introduced above

Below, we look at an example program which generates the entire sequence of sums up to n. Think of it as showing the work to arrive at the answer for the goodsum program above. The total sum up to n is the last element in the array.

This is a good use case for unfold. We could write this using loop directly, but the point of this was to stretch the limits of unfold so here goes

const sumseq = (n = 0) =>
  unfold
    ( (loop, done, [ m, sum ]) =>
        m > n
          ? done ()
          : loop (sum, [ m + 1, sum + m ])
    , [ 1, 0 ]
    )

console.log (sumseq (10))
// [ 0,   1,   3,   6,   10,  15,  21,  28,  36, 45 ]
//   +1 ↗ +2 ↗ +3 ↗ +4 ↗ +5 ↗ +6 ↗ +7 ↗ +8 ↗ +9 ↗  ...

If we used an unsafe unfold implementation, we could blow the stack

// direct recursion, stack-unsafe!
const unfold = (f, initState) =>
  f ( (x, nextState) => [ x, ...unfold (f, nextState) ]
    , () => []
    , initState
    )

console.log (sumseq (20000))
// RangeError: Maximum call stack size exceeded

After playing with it a little bit, it is indeed possible to encode unfold using our stack-safe loop. Cleaning up the ... spread syntax using a push effect makes things a lot quicker too

const push = (xs, x) =>
  (xs .push (x), xs)

const unfold = (f, init) =>
  loop ((acc = [], state = init) =>
    f ( (x, nextState) => recur (push (acc, x), nextState)
      , () => acc
      , state
      ))

With a stack-safe unfold, our sumseq function works a treat now

console.time ('sumseq')
const result = sumseq (20000)
console.timeEnd ('sumseq')

console.log (result)
// sumseq: 23 ms
// [ 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ..., 199990000 ]

Verify the result in your browser below

const recur = (...values) =>
  tag (recur, { values })

const loop = f =>
{ let acc = f ()
  while (is (recur, acc))
    acc = f (...acc.values)
  return acc
}

const T =
  Symbol ()

const tag = (t, x) =>
  Object.assign (x, { [T]: t })

const is = (t, x) =>
  t && x[T] === t

const push = (xs, x) =>
  (xs .push (x), xs)

const unfold = (f, init) =>
  loop ((acc = [], state = init) =>
    f ( (x, nextState) => recur (push (acc, x), nextState)
      , () => acc
      , state
      ))

const sumseq = (n = 0) =>
  unfold
    ( (loop, done, [ m, sum ]) =>
        m > n
          ? done ()
          : loop (sum, [ m + 1, sum + m ])
    , [ 1, 0 ]
    )

console.time ('sumseq')
const result = sumseq (20000)
console.timeEnd ('sumseq')

console.log (result)
// sumseq: 23 ms
// [ 0, 1, 3, 6, 10, 15, 21, 28, 36, 45, ..., 199990000 ]



回答3:

The other answer already explained the issue with your code. This answer demonstrates that trampolines are sufficiently fast for most array based computations and offer a higher level of abstraction:

// trampoline

const loop = f => {
  let acc = f();

  while (acc && acc.type === recur)
    acc = f(...acc.args);

  return acc;
};


const recur = (...args) =>
  ({type: recur, args});


// sum

const sum = xs => {
  const len = xs.length;

  return loop(
    (acc = 0, i = 0) =>
      i === len
        ? acc
        : recur(acc + xs[i], i + 1));
};


// and run...

const xs = Array(1e5)
  .fill(0)
  .map((x, i) => i);


console.log(sum(xs));

If a trampoline based computation causes performance problems, then you can still replace it with a bare loop.