Understanding how to implement once-only lisp macr

2020-02-17 04:14发布

问题:

In Peter Seibel's book "Practical Common Lisp", we can find the definition of the very complicated macro once-only (see the bottom of page http://www.gigamonkeys.com/book/macros-defining-your-own.html).

I'm reading this macro definition for the 10th times in last 3 weeks and cannot understand how it works. :( Worse, I cannot develop this macro on my own, even though I understand its purpose and how to use it.

I'm especially interested in systematic "derivation" of this notoriously hard macro, step by step! Any help?

回答1:

Are you looking at this:

(defmacro once-only ((&rest names) &body body)
  (let ((gensyms (loop for n in names collect (gensym))))
    `(let (,@(loop for g in gensyms collect `(,g (gensym))))
      `(let (,,@(loop for g in gensyms for n in names collect ``(,,g ,,n)))
        ,(let (,@(loop for n in names for g in gensyms collect `(,n ,g)))
           ,@body)))))

It's not that complicated, but it does have a nested backquote, and multiple levels which are similar to each other, leading to easy confusion, even for experienced Lisp coders.

This is a macro which is used by macros for writing their expansions: a macro which writes parts of the bodies of macros.

There is a plain let in the body of the macro itself, then a once-backquoted generated let which will live inside the body of the macro which usess once-only. Finally, there is a doubly backquoted let which will appear in the macro expansion of that macro, in the code site where the macro is used by the user.

The two rounds of generating gensyms are necessary because once-only is a macro itself, and so it has to be hygienic for its own sake; so it generates a bunch of gensyms for itself in the outermost let. But also, the purpose of once-only is to simplify the writing of another hygienic macro. So it generates gensyms for that macro also.

In a nutshell, once-only needs to create a macro-expansion which requires some local variables whose values are gensyms. Those local variables will be used to insert the gensyms into another macro expansion to make it hygienic. And those local variables have to themselves be hygienic since they are a macro expansion, so they are also gensyms.

If you're writing a plain macro, you have local variables which hold gensyms, e.g.:

;; silly example
(defmacro repeat-times (count-form &body forms)
  (let ((counter-sym (gensym)))
    `(loop for ,counter-sym below ,count-form do ,@forms)))

In the process of writing the macro, you have invented a symbol, counter-sym. This variable is defined in plain view. You, the human, have chosen it in such a way that it does not clash with anything in the lexical scope. The lexical scope in question is that of your macro. We don't have to worry about counter-sym accidentally capturing references inside count-form or forms because forms are just data which is going into a piece of code which will end up inserted in some remote lexical scope (the site where the macro is used). We do have to worry about not confusing counter-sym with another variable inside our macro. For instance, we cannot give our local variable the name count-form. Why? Because that name is one of our function arguments; we would shadow it, creating a programming error.

Now if you want a macro to help you write that macro, then the machine has to do the same job as you. When it is writing code, it has to invent a variable name, and it has to be careful about what name it invents.

However, the code-writing machine, unlike you, does not see the surrounding scope. It cannot simply look at what variables are there and choose ones which do not clash. The machine is just a function which takes some arguments (pieces of unevaluated code) and produces a piece of code that is then blindly substituted into a scope after that machine has done its job.

Therefore, the machine has to choose the names extra wisely. In fact, to be completely bullet proof, it has to be paranoid and use symbols which are completely unique: gensyms.

So continuing with the example, suppose we have a robot which will write this macro body for us. That robot can be a macro, repeat-times-writing-robot:

(defmacro repeat-times (count-form &body forms)
  (repeat-times-writing-robot count-form forms))  ;; macro call

What might the robot macro look like?

(defmacro repeat-times-writing-robot (count-form forms)
  (let ((counter-sym-sym (gensym)))     ;; robot's gensym
    `(let ((,counter-sym-sym (gensym))) ;; the ultimate gensym for the loop
      `(loop for ,,counter-sym-sym below ,,count-form do ,@,forms))))

You can see how this has some of the features of once-only: the double nesting and the two levels of (gensym). If you can understand this, then the leap to once-only is small.

Of course, if we just wanted a robot to write repeat-times, we would make it a function, and then that function wouldn't have to worry about inventing variables: it is not a macro and so it doesn't need hygiene:

 ;; i.e. regular code refactoring: a piece of code is moved into a helper function
 (defun repeat-times-writing-robot (count-form forms)
   (let ((counter-sym (gensym)))
     `(loop for ,counter-sym below ,count-form do ,@forms)))

 ;; ... and then called:
(defmacro repeat-times (count-form &body forms)
  (repeat-times-writing-robot count-form forms))  ;; just a function now

But once-only cannot be a function because its job is to invent variables on behalf of its boss, the macro which uses it, and a function cannot introduce variables into its caller.



回答2:

An alternative to the once-only macro from Practical Common Lisp is derived in Let Over Lambda (see the 'Once Only' section in the third chapter).



回答3:

Kaz explained it beautifully and extensively.

However, if you wouldn't care much about the double-hygiene problem, you might find this one easier to understand:

(defmacro once-only ((&rest symbols) &body body)
  ;; copy-symbol may reuse the original symbol name
  (let ((uninterned-symbols (mapcar 'copy-symbol symbols)))
    ;; For the final macro expansion:
    ;; Evaluate the forms in the original bound symbols into fresh bindings
    ``(let (,,@(mapcar #'(lambda (uninterned-symbol symbol)
                           ``(,',uninterned-symbol ,,symbol))
                       uninterned-symbols symbols))
        ;; For the macro that is using us:
        ;; Bind the original symbols to the fresh symbols
        ,(let (,@(mapcar #'(lambda (symbol uninterned-symbol)
                             `(,symbol ',uninterned-symbol))
                         symbols uninterned-symbols))
           ,@body))))

The first let is backquoted twice, because it'll be part of the final expansion. The purpose is to evaluate the forms in the original bound symbols into fresh bindings.

The second let is backquoted once, because it'll be part of the user of once-only. The purpose is to rebind the original symbols to the fresh symbols, since their forms will have been evaluated and bound to them in the final expansion.

If the rebinding of the original symbols was before the final macro expansion, the final macro expansion would refer to the uninterned symbols instead of the original forms.

An implementation of with-slots that uses once-only is an example that requires double-hygiene:

(defmacro with-slots ((&rest slots) obj &body body)
  (once-only (obj)
    `(symbol-macrolet (,@(mapcar #'(lambda (slot)
                                     `(,slot (slot-value ,obj ',slot)))
                                 slots))
       ,@body)))

;;; Interaction in a REPL    
> (let ((*gensym-counter* 1)
        (*print-circle* t)
        (*print-level* 10))
    (pprint (macroexpand `(with-slots (a) (make-object-1)
                            ,(macroexpand `(with-slots (b) (make-object-2)
                                             body))))))

;;; With the double-hygienic once-only
(let ((#1=#:g2 (make-object-1)))
  (symbol-macrolet ((a (slot-value #1# 'a)))
    (let ((#2=#:g1 (make-object-2)))
      (symbol-macrolet ((b (slot-value #2# 'b)))
        body))))

;;; With this version of once-only
(let ((#1=#:obj (make-object-1)))
  (symbol-macrolet ((a (slot-value #1# 'a)))
    (let ((#1# (make-object-2)))
      (symbol-macrolet ((b (slot-value #1# 'b)))
        body))))

The second expansion shows that the inner let is shadowing the binding to the variable #:obj of the outter let. Thus, accessing a within the inner with-slots would actually access the second object.

Note that in this example, the outter macro-expansion gets a gensym named g2 and the inner g1. In normal evaluation or compilation, it would be the opposite, as forms are walked from the outter to the inner.