Why not implement `let` in terms of lexically clos

2019-06-26 08:45发布

问题:

I have been working with lisp-family languages for several years and feel like I have a pretty good grasp on them. I'm now writing my own lisp (as is the fashion, of course), but almost entirely avoiding re-implementing the same patterns that Scheme, Common Lisp and friends have used. One particular thing that I always found odd was all the variants of let (letrec, flet, labels, let*...).

Say in a "no legacy carried over from the past" implementation of a lisp, I'd like to just be able to write things like:

(let ((x 7)
      (y 8)
      (z (* x y)))
  (str "x * y * z = " (* x y z)))

And similarly I'd like to be able to write:

(let ((odd? (lambda (n) (if (zero? n) false (even? (dec n)))))
      (even? (lambda (n) (if (zero? n) true (odd? (dec n)))))
  (odd? 7))

I could implement both of those variants quite efficiently by just using define inside the body of a lambda. Excuse the clojure-scheme-hybrid code:

(defmacro let (bindings & body)
  `((lambda ()
      ,@(map (lambda (name-value) (cons 'define name-value)) bindings)
      ,@body)))

Because this just introduces the bindings within the closed environment in the body of the lambda, mutual recursion can work and variables can depend on the existence of previous definitions. Similarly, because a lambda closes the environment in which it is created, expanding the (define x expression) (define y expression) works as expected.

So why complicate matters by having numerous explicit forms? Am I overlooking something? I have implement my own let as I demonstrate above and it seems to work in all cases exactly as expected. This also reduces the cost overhead of nested function applications, such as in the case of let*.

回答1:

Internal definitions are defined (har har) by R5RS to use letrec, and by R6RS and R7RS to use letrec*. The behaviour you're describing is exactly what letrec* is.

However, there are cases where you want to use the outer bindings, and you don't want the inner bindings to shadow them during their definitions. In this case, let and let* are more appropriate than letrec and letrec*.

What do I mean by this? Here's one way that let provides outer scoping that letrec doesn't:

(let ((x 1)
      (y 2))
  (let ((x (+ x y))
        (y (- x y)))
    (format #t "x = ~a, y = ~a~%" x y)))

Here, in the (+ x y) and (- x y) expressions, we are using the outer bindings of x and y. Thus the inner x will be 3, and the inner y will be -1.

Using let* is similar except that the bindings are sequential:

(let ((x 1)
      (y 2))
  (let* ((x (+ x y))
         (y (- x y)))
    (format #t "x = ~a, y = ~a~%" x y)))

Here, the inner x is evaluated the same as for the let case, but the inner y's definition will use the inner x instead of the outer x (but it still uses the outer y, as the inner y hasn't been bound yet). Thus the inner x will be 3, and the inner y will be 1.

When using letrec and letrec*, all the outer bindings will be shadowed by the inner bindings of the same name, and you do not have access to the outer x or y in those cases. This property is what allows their use for self-referential functions and/or data structures.

letrec and letrec* are similar, except that for letrec, the values are all evaluated first, then bound to the variables at the end, simultaneously; whereas for letrec*, the values for each variable are evaluated and bound sequentially from left to right.


To demonstrate the four let types, I wrote a little Racket macro that allows you to test the behaviour of each:

#lang racket
(define-syntax test-let
  (syntax-rules ()
    ((_ let)
     (let ((x "outer x")
           (y "outer y")
           (p (lambda (x y label)
                (printf "~a: x = ~s, y = ~s~%" label x y))))
       (let ((before (p x y "before"))
             (x (begin
                  (p x y "during x")
                  "inner x"))
             (between (p x y "between"))
             (y (begin
                  (p x y "during y")
                  "inner y"))
             (after (p x y "after")))
         (p x y "body"))))))

And the test results:

> (test-let let)
before: x = "outer x", y = "outer y"
during x: x = "outer x", y = "outer y"
between: x = "outer x", y = "outer y"
during y: x = "outer x", y = "outer y"
after: x = "outer x", y = "outer y"
body: x = "inner x", y = "inner y"

> (test-let let*)
before: x = "outer x", y = "outer y"
during x: x = "outer x", y = "outer y"
between: x = "inner x", y = "outer y"
during y: x = "inner x", y = "outer y"
after: x = "inner x", y = "inner y"
body: x = "inner x", y = "inner y"

> (require rnrs/base-6)
> (test-let letrec)
before: x = #<undefined>, y = #<undefined>
during x: x = #<undefined>, y = #<undefined>
between: x = #<undefined>, y = #<undefined>
during y: x = #<undefined>, y = #<undefined>
after: x = #<undefined>, y = #<undefined>
body: x = "inner x", y = "inner y"

> (require rnrs/base-6)
> (test-let letrec*)
before: x = #<undefined>, y = #<undefined>
during x: x = #<undefined>, y = #<undefined>
between: x = "inner x", y = #<undefined>
during y: x = "inner x", y = #<undefined>
after: x = "inner x", y = "inner y"
body: x = "inner x", y = "inner y"

Hopefully this will make the differences between the let types pretty obvious. :-)



回答2:

define inside a procedure (or let. since it's the same) is a letrec until R5RS and a letrec* from R6RS and the current R7RS. You can do everything with letrec* except overshadow your own variables (they become undefined).

(let ((x 10))
  (letrec ((x x))
     x)) ;; returns some undefined value

becomes

(let ((x 10))
  (let ((x 'undefined))
     (let ((tmpx x)) ;; notice x is already shadowed by the new binding
        (set! x tmpx)
        x))) ;; returns some undefined value

The reason for that is because the evaluation of the variables needs to be done while the variable already is defined so that it can recurse. Your mutually recursive odd? depends on it. While two lets in the same way sets the new x to the old x since they never exist at the same time.

As you probably know (let ((x 10) (y 20)) ...) is nothing more than syntax sugar for an anonymous procedure call ((lambda (x y) ...) 10 20), let* is nested let (let* ((x 10) (y 20)) ...) => (let ((x 10)) (let ((y 20)) ... ))). A named let is a letrec and a letrec* is almost nested letrec (All the variables are made in the same let to uninitialized values so that all the levels can reference to them)

If you want to bind 10 variables and can do it in one closure it's the cheapest while doing it with define introduces 20 procedure calls and 10 mutations. It's probably nothing a modern interpreter can manage today but Scheme didn't perform that good back when it was invented and would do worse if it only had letrec*



标签: scheme lisp