I tried to understand macros in C that use the preprocessor concatenation operator ##, but I realized that I have a problem with tokens. I thought it was easy, but in practice it is not.
So concatenation is for joining two tokens to create a new token, e.g. concatenating ( and ), or int and *.
I tried
#define foo(x,y) x ## y
foo(x,y)
Whenever I give it some arguments, I always get an error saying that pasting the two arguments does not give a valid preprocessing token.
For instance, why does foo(1,aa) result in 1aa (which type of token is it, and why is it valid?), while foo(int,*) gives an error?
Is there a way to know which tokens are valid, or is there a good reference that would help clarify this in my mind? (I have already searched Google and SO.)
What am I missing?
I will be grateful.
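Preprocessing token is defined by the C language grammar; see section 6.4 of the current standard:
    preprocessing-token:
        header-name
        identifier
        pp-number
        character-constant
        string-literal
        punctuator
        each non-white-space character that cannot be one of the above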
The meaning of each of those terms is defined elsewhere in the grammar. Most are self-explanatory; identifier means anything that is a valid variable name (or would be if it weren't a keyword), and pp-number includes integer and floating-point constants.
In Standard C, the result of pasting two preprocessing tokens must be another valid preprocessing token. Historically, some preprocessors have allowed other pastes (which is equivalent to not pasting!), but this leads to confusion when people compile their code with a different compiler.
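To make the pasting rule concrete, here is a minimal sketch of pastes that do form a single valid preprocessing token, with the ones that do not left in a comment. The names PASTE, STR, STR_, my_var and the paste_* functions are hypothetical, chosen only for illustration.

```c
/* PASTE, STR and STR_ are hypothetical helper names, not standard macros. */
#define PASTE(a, b) a ## b
#define STR_(x) #x
#define STR(x) STR_(x)   /* expands x first, then stringifies the result */

static int my_var = 10;

/* identifier + identifier -> the identifier my_var */
static int paste_identifier(void) { return PASTE(my, _var); }

/* pp-number + pp-number -> the pp-number 42 */
static int paste_number(void) { return PASTE(4, 2); }

/* punctuator + punctuator -> the punctuator << */
static int paste_operator(void) { return 1 PASTE(<, <) 3; }

/* 1 and aa paste into the single pp-number 1aa. It is not a valid
   constant, so here it is only stringified and never reaches the
   compiler proper as a number. */
static const char *paste_1aa(void) { return STR(PASTE(1, aa)); }

/* Neither of these forms a single token, so a conforming
   preprocessor is expected to reject them:
   PASTE(int, *)
   PASTE(-, 3)
*/
```

With GCC or Clang, moving either of the last two pastes out of the comment should produce a diagnostic along the lines of the one quoted in the question.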
Since it seems to be a point of confusion: the string 1aa is a valid preprocessing token. It is an instance of pp-number, whose definition is (§6.4.8 of the current C standard):
    pp-number:
        digit
        . digit
        pp-number digit
        pp-number identifier-nondigit
        pp-number e sign
        pp-number E sign
        pp-number p sign
        pp-number P sign
        pp-number .
In other words, a pp-number starts with a digit or a . followed by a digit, and after that it can contain any sequence of digits, periods, "identifier-nondigits" (that is, letters, underscores, and other things which can be part of an identifier), or the letters e or p (either upper or lower case) followed by a plus or minus sign.
That means that, for example, 0x1e+2 is a valid pp-number, while 0x1f+1 is not (it is three tokens). In a valid program, every pp-number which survives the preprocessing phases must satisfy the syntax of some numeric constant representation, which means that a program which includes the text 0x1e+2 will be considered invalid. The moral, if there is one, is that you should use whitespace generously; it has no cost.
The intention of pp-number is to include everything which might eventually be a number in some future version of C. (Remember that numbers can be followed by alphabetic suffixes indicating types and signedness, such as 27LU.)
However, int* is not a valid preprocessing token. It is two tokens (as is -3), and so it cannot be formed with the token concatenation operator.
Another odd consequence of the token-pasting rule is that it is impossible to generate the valid token ... through token concatenation, because .. is not a valid token. (a##b##c must be evaluated in some order, so even if all three preprocessor macros expand to ., there must be an attempt to create the token .., which will fail in most compilers, although I believe Visual Studio accepts it.)
Finally, the comment symbols /* and // are not tokens; comments are replaced with whitespace before the separation of the program text into tokens. So you cannot produce a comment with token-pasting either (at least, not in a compliant compiler).
Preprocessor token concatenation is for generating new tokens, but it is not capable of pasting arbitrary language constructs together (see, for example, the gcc documentation).
So a macro that attempts to make a pointer out of a type by pasting the type name and * is invalid, as int* is two tokens, not one.
The example from the above-mentioned link, however, shows the generation of a new token: there, pasting quit and _command yields the token quit_command, which has not existed before but has been generated through token concatenation.
Note that a macro that simply expands to TYPE followed by *, without ##, is valid and actually generates a pointer type out of TYPE, e.g. int* out of int.
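The three cases in this answer can be put side by side in a short sketch. All names here (BAD_PTR_TO, PTR_TO, COMMAND, struct command, quit_command, help_command, read_through) are hypothetical and chosen for illustration; COMMAND is loosely modelled on the example in the gcc documentation.

```c
/* Expected to be rejected by a conforming preprocessor when used:
   int and * do not paste into one token.
   #define BAD_PTR_TO(TYPE) TYPE ## *
*/

/* Valid: no pasting is needed, the expansion is simply two tokens. */
#define PTR_TO(TYPE) TYPE *

/* Valid: NAME ## _command builds one new identifier,
   and #NAME turns the argument into a string literal. */
#define COMMAND(NAME) { #NAME, NAME ## _command }

struct command {
    const char *name;
    int (*function)(void);
};

static int quit_command(void) { return 0; }
static int help_command(void) { return 1; }

static struct command commands[] = {
    COMMAND(quit),   /* expands to { "quit", quit_command } */
    COMMAND(help),   /* expands to { "help", help_command } */
};

/* PTR_TO(int) expands to int *, so p is an ordinary pointer to int. */
static int read_through(PTR_TO(int) p) { return *p; }
```

The point of the comparison is that ## is only needed when a new single token such as quit_command has to be built; a pointer type is a sequence of tokens, so plain juxtaposition in the macro body is enough.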