cpp expansion of macro with no token-string

Posted 2019-04-26 19:21

I am reading about CPP macro expansion and want to understand what happens when the (optional) token-string is omitted. I found that gcc v4.8.4 does this:

$ cat zz.c
#define B
(B)
|B|
$ gcc -E zz.c
# 1 "zz.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "zz.c"

()
| |

Can anyone explain why the expansion produces no space in one case and one space in the other?

3 Answers
神经病院院长
#2 · 2019-04-26 19:37

edit: see hvd's answer about gcc's preprocessor implementation

This may be to differentiate between the bitwise and logical OR operators.

This sample:

if (x | 4) printf("true\n"); // Bitwise OR: evaluates to x with bit 2 set

Is different from:

if (x || 4) printf("true\n"); // Logical OR: evaluates to 1

Since they are different operators with different functions, it is necessary for the preprocessor to add whitespace to avoid changing the intended meaning of the statement.

Luminary・发光体
#3 · 2019-04-26 19:40

The output of gcc -E intentionally does not match the exact rules specified by the C standard. The C standard does not describe any particular way in which the preprocessor result should be visible, and does not even require that such a way exist.

The only time some form of preprocessor output is required to be visible is when the # operator is used. If you use it to stringify the expanded |B|, you can see that there isn't any space between the two | characters.
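As a minimal sketch of that check: the two-level STR/XSTR idiom below forces the argument to be macro-expanded before # stringifies it. The names STR, XSTR, expanded_spelling and expansion_has_no_space are my own and purely illustrative.

```c
#include <stdio.h>
#include <string.h>

#define EMPTY
#define STR(x)  #x        /* stringify the argument as it is spelled */
#define XSTR(x) STR(x)    /* macro-expand the argument first, then stringify */

/* Illustrative helper: the spelling of |EMPTY| after macro expansion. */
const char *expanded_spelling(void) { return XSTR(|EMPTY|); }

/* Illustrative helper: nonzero if the expansion is spelled with no space. */
int expansion_has_no_space(void) {
    return strcmp(expanded_spelling(), "||") == 0;
}

/* Illustrative helper: prints the spelling in brackets so a space would show. */
void show_spelling(void) { printf("[%s]\n", expanded_spelling()); }
```

Calling show_spelling() should print [||], since the standard only turns whitespace that actually appears between the argument's tokens into a space, and there is none here.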

flaming.toaster's answer rightly points out that the reason the gcc -E output inserts a space is to prevent the two consecutive |s from being parsed as a single || token. The following program is required to give a diagnostic for the syntax error:

#define EMPTY
int main() { return 0 |EMPTY| 0; }

and the space is there to make sure the compiler still has enough information to actually generate the error.

女痞
#4 · 2019-04-26 19:43

The C preprocessor operates on "tokens". Whenever printing two tokens next to each other could change their meaning or make them ambiguous, it adds whitespace between them to preserve the meaning.

Consider your example,

(B)

Here nothing is ambiguous and the meaning does not change whether or not a space is added between ( and ), regardless of what B expands to.

But it's not the case with

|B|

Depending on the definition of B, this could become either || or |something|. So the preprocessor is forced to add whitespace in order to respect C's lexical rules.

The same behaviour can be seen with any other token that could alter the meaning. For example,

#define B +
B+

would produce

+ +

as opposed to

++

for the same reason.
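To illustrate the token-level view, here is a small sketch showing that the two + tokens stay separate inside the compiler as well, and only become ++ when they are pasted explicitly with ##. The PASTE macro and both function names are hypothetical, added just for this illustration.

```c
#define B +
#define PASTE(a, b) a##b   /* hypothetical helper: explicit token pasting */

/* B+x expands to the three tokens + + x: two unary pluses, so x is unchanged. */
int via_expansion(int x) { return B+x; }

/* PASTE(+, +) builds a single ++ token, so this is a real pre-increment. */
int via_pasting(int x) { return PASTE(+, +)x; }
```

With these definitions via_expansion(5) returns 5 while via_pasting(5) returns 6, which is the difference the extra space in the -E output is protecting.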

However, this applies only to the preprocessor that complies with the C lexical rules. GCC also supports an older preprocessor, the traditional preprocessor, which doesn't add any extra whitespace. For example, if you run the preprocessor in traditional mode:

gcc -E -traditional-cpp file.c

then

#define B 

(B)
|B|

produces (without the whitespace)

()
||