I am learning Lisp from the book "The Land of Lisp" by Conrad Barski. Now I have hit my first stumbling block, where the author says:
Calling yourself in this way is not only allowed in Lisp, but is often
strongly encouraged
after showing the following example function to count the items in a list:
(defun my-length (list)
(if list
(1+ (my-length (cdr list)))
0))
When I call this function my-length
with a list containing a million items, I get a stack overflow error. So either you never expect to have a list that long in Lisp (so maybe my use case is useless) or there is another way to count items in such a long list. Can you maybe shine some light on this? (I am using GNU CLISP on Windows, by the way).
Creating recursive functions to operate on recursive datastructures is indeed good for in lisp. And a list (in lisp) is defined as a recursive datastructure, so you should be ok.
However, as you have experienced, if traversing a datastructure a million items deep using recursion, will also allocate a million frames on the stack, and you can expect a stack overflow unless you specifically ask your runtime environment to allocate huge amount of stack-space (I have no idea if or how you could do this in gnu clisp...).
First of all, I think this shows that the list-datastructure isn't really a the best for everything, and in this case another structure might be better (however, you might not have come to vectors in your lisp-book yet ;-)
Another thing is that for large recursions such as this to be effective, the compiler should optimise tail recursions and convert them into iterations. I don't know if clisp has this functionality, but you would need to change your function to actually be optimisable. (If "tail recursion optimisation" makes no sense, please let me know and I can dig up some references)
For other ways of iterating, take a look at:
- http://www.lispworks.com/documentation/HyperSpec/Body/c_iterat.htm
- http://www.lispworks.com/documentation/HyperSpec/Body/06_a.htm
Or other datastructures:
- http://www.lispworks.com/documentation/HyperSpec/Body/t_vector.htm
TCO (tail call optimization) in CLISP using the example from Chris Taylor:
[1]> (defun helper (acc list)
(if list
(helper (1+ acc) (cdr list))
acc))
(defun my-length (list)
(helper 0 list))
HELPER
Now compile it:
[2]> (compile 'helper)
MY-LENGTH
[3]> (my-length (loop repeat 100000 collect t))
*** - Program stack overflow. RESET
Now, above does not work. Let's set the debug level low. This allows the compiler to do TCO.
[4]> (proclaim '(optimize (debug 1)))
NIL
Compile again.
[5]> (compile 'helper)
HELPER ;
NIL ;
NIL
[6]> (my-length (loop repeat 100000 collect t))
100000
[7]>
Works.
Allowing the Common Lisp compiler to do TCO is most often controlled by the debug level. With a high debug level, the compiler generates code which uses a stack frame for each function call. This way each call can be traced and will be seen in a backtrace. With a lower debug level the compiler may replace tail calls with jumps in the compiled code. These calls then will not be seen in a backtrace - which usually makes debugging harder.
You're looking for tail recursion. At the moment your function is defined like
(defun my-length (list)
(if list
(1+ (my-length (cdr list)))
0))
Notice that after my-length
has called itself, it needs to add one to the result before passing that value to its calling function. This need to modify the value before returning it means that you need to allocate a new stack frame for every call, the the space used is proportional to the length of the list. This is what causes a stack overflow on long lists.
You can re-write it to use a helper function
(defun helper (acc list)
(if list
(helper (1+ acc) (cdr list))
acc))
(defun my-length (list)
(helper 0 list))
The helper function takes two parameters, an accumulation parameter acc
, which stores the length of the list so far, and a list list
which is the list we're computing the length of.
The important point is that helper
is written tail recursively, which means that calling itself is the last thing it does. This means you don't need to allocate a new stack frame every time the function calls itself - since the final result will just be passed all the way back up the chain of stack frames anyway, you can get away with overwriting the old stack frame with a new one, so your recursive function can operate in constant space.
This form of program transformation - from a recursive (but non-tail-recursive) definition to an equivalent definition using a tail-recursive helper function is an important idiom in functional programming - one that it's worth spending a bit of time understanding.
(DEFUN nrelem(l)
(if (null l)
0
(+ (nrelem (rest l)) 1)
))