I created a small macro metaprogramming library that implements basic useful constructs such as REPEAT(times, x)
, IF(value, true, false)
, tuples, and more.
Most of my implementations work by overloading macros based upon their variadic argument count or through a counter:
// Example:
#define REPEAT_0(x)
#define REPEAT_1(x) x REPEAT_0(x)
#define REPEAT_2(x) x REPEAT_1(x)
#define REPEAT_3(x) x REPEAT_2(x)
// ...
// (these defines are generated using an external script)
// ...
#define REPEAT(count, x) CAT(REPEAT_, count)(x)
This works fine, but I've recently come across an extremely interesting implementation of macro recursion by Paul Fultz.
Up until the deferred expression section I had no trouble understanding his article.
I am, however, having a lot of trouble understanding the use of DEFER
and OBSTRUCT
properly.
Paul implements a very elegant version of REPEAT
that does not require script-generated defines like this:
#define EAT(...)
#define EXPAND(...) __VA_ARGS__
#define WHEN(c) IF(c)(EXPAND, EAT)
#define REPEAT(count, macro, ...) \
WHEN(count) \
( \
OBSTRUCT(REPEAT_INDIRECT) () \
( \
DEC(count), macro, __VA_ARGS__ \
) \
OBSTRUCT(macro) \
( \
DEC(count), __VA_ARGS__ \
) \
)
#define REPEAT_INDIRECT() REPEAT
//An example of using this macro
#define M(i, _) i
EVAL(REPEAT(8, M, ~)) // 0 1 2 3 4 5 6 7
DEFER
, OBSTRUCT
and other utilities are implemented as such:
#define EMPTY()
#define DEFER(id) id EMPTY()
#define OBSTRUCT(...) __VA_ARGS__ DEFER(EMPTY)()
#define EXPAND(...) __VA_ARGS__
#define A() 123
A() // Expands to 123
DEFER(A)() // Expands to A () because it requires one more scan to fully expand
EXPAND(DEFER(A)()) // Expands to 123, because the EXPAND macro forces another scan
When the preprocessor expands a macro, the result is "painted" until the next scan - it will not expand recursively unless an additional scan occurs. Is this correct?
Does the
EXPAND(...)
macro force an additional scan? If so, does this scan allow macros to expand recursively? What's the difference btweenEXPAND(...)
andDEFER(id)
?- Does
DEFER
force two additional scans?
- Does
What about the
OBSTRUCT(...)
macro? Does it force two additional scans?Now - why is
OBSTRUCT
required in the recursive implementation ofREPEAT
? Why wouldn'tDEFER
orEXPAND
work here?
The use of macros like
DEFER
, and complicated C macrology in general, depends on understanding how the C preprocessor actually expands macro expressions. It doesn't just attempt to reduce all expression trees the way a conventional programming language does, but rather, it works on a linear token stream, and has an implicit "cursor" at the point where it's currently examining the stream for possible replacements. Within any given "stack frame" of the expansion process, the cursor never moves backwards, and once a token has been passed in the stream it is not examined again.Walking through the first example of
DEFER
's operation:The cursor moved past
A
during the "rescan" ofDEFER
's body, after the arguments had been substituted, which is the second and last attempt to expand that set of tokens. Once the cursor has moved pastA
it does not return to it during that expansion sequence, and since the "rescan" is at the top level there is no following expansion sequence.Now consider the same expression, but wrapped in a call to
EXPAND
:Because argument lists are expanded in a stacked context, and the cursor position is restored to the position in front of the original call for the rescan pass, placing a macro invocation in any macro's argument list - even one that actually does nothing, like
EXPAND
- gives it a "free" extra run over by the expansion cursor, by resetting the cursor's position in the stream an extra time for an extra rescan pass, and therefore giving each freshly constructed call expression (i.e. the pushing together of a macro name and a parenthesized argument list) an extra chance at being recognised by the expander. So allEVAL
does is give you 363 (3^5+3^4+3^3+3^2+3, someone check my math) free rescan passes.So, addressing the questions in light of this:
DEFER
is rather to ensure that the recursive invocation doesn't get generated on the macro's expansion pass, but instead is ...deferred... until an outer rescan step, at which point it won't get painted blue because we're no longer within that named macro. This is whyREPEAT_INDIRECT
is function-like; so that it can be prevented from expanding into anything mentioning the name ofREPEAT
, as long as we're still within the body ofREPEAT
. It requires at least one further free pass after the originatingREPEAT
completes to expand away the spacingEMPTY
tokens.EXPAND
forces an additional expansion pass. Any macro invocation grants one extra expansion pass to whatever expression was passed in its argument list.DEFER
is that you don't pass it a whole expression, just the "function" part; it inserts a spacer between the function and its argument list that costs one expansion pass to remove.EXPAND
andDEFER
is thatDEFER
imposes the need for an extra pass, before the function you pass it gets expanded; andEXPAND
provides that extra pass. So they are each other's inverse (and applied together, as in the first example, are equivalent to a call using neither).OBSTRUCT
costs two expansion passes before the function you pass it gets expanded. It does this byDEFER
ing the expansion of theEMPTY()
spacer by one (EMPTY EMPTY() ()
), burning the first cursor reset on getting rid of the nestedEMPTY
.OBSTRUCT
is needed aroundREPEAT_INDIRECT
becauseWHEN
expands (on true) to a call toEXPAND
, which will burn away one layer of indirection while we're still within the call toREPEAT
. If there was only one layer (aDEFER
), the nestedREPEAT
would be generated while we're still inREPEAT
's context, causing it to be painted blue and killing the recursion right there. Using two layers withOBSTRUCT
means that the nestedREPEAT
won't be generated until we reach the rescan pass of whatever macro call is outsideREPEAT
, at which point we can safely generate the name again without it being painted blue, because it's been popped from the no-expand stack.So, this method of recursion works by using an external source of a large number of rescan passes (
EVAL
), and deferring the expansion of a macro's name within itself by at least one rescan pass so it happens further down the call stack. The first expansion of the body ofREPEAT
happens during the rescan ofEVAL[363]
, the recursive invocation is deferred until the rescan ofEVAL[362]
, the nested expansion deferred... and so on. It's not true recursion, as it can't form an infinite loop, but instead relies on an external source of stack frames to burn through.