I am reading up on C preprocessor (CPP) macro expansion and wanted to understand how expansion works when the (optional) token-string is not provided. I found that gcc v4.8.4 does this:
$ cat zz.c
#define B
(B)
|B|
$ gcc -E zz.c
# 1 "zz.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "zz.c"
()
| |
Can anyone explain why the expansion is zero spaces in one instance and one in the other?
edit: see hvd's answer about gcc's preprocessor implementation
This may be to differentiate between the bitwise and logical OR operators.
The token sequence | | (two bitwise OR operators) is different from the single token || (one logical OR operator).
Since they are different operators with different functions, the preprocessor has to add whitespace to avoid changing the intended meaning of the statement.
The output of gcc -E intentionally does not match the exact rules specified by the C standard. The C standard does not describe any particular way the preprocessor result should be visible, and does not even require that such a way exist.
The only time some sort of preprocessor output is required to be visible is when the # operator is used. And if you use this, you can see that there isn't any space.
flaming.toaster's answer rightly points out that the reason the gcc -E output inserts a space is to prevent the two consecutive | tokens from being parsed as a single || token. A program in which two | tokens end up adjacent like this is required to give a diagnostic for the syntax error, and the space is there to make sure the compiler still has enough information to actually generate the error.
The C preprocessor operates on "tokens", and whenever there is a possibility of ambiguity or of changing the meaning, it adds whitespace in order to preserve the meaning.
Consider your example, (B): there is no ambiguity or change of meaning whether or not a space is added between ( and ), irrespective of the value of the macro B.
But that is not the case with |B|. Depending on the macro B, this could be either || or |something|. So the preprocessor is forced to add whitespace in order to keep to C's lexical rules.
The same behaviour can be seen with any other pair of tokens that would alter the meaning if they were joined, for the same reason.
However, this applies only to the preprocessor that complies with C's lexical rules. GCC also supports an old preprocessor mode, the traditional preprocessor, which does not add any extra whitespace. For example, if you run the preprocessor in traditional mode (GCC's -traditional-cpp option), then |B| produces || (without the whitespace).