How to get Xcode 8 C preprocessor to ignore // com

Posted 2019-05-01 15:03

The C preprocessor (cpp) seems like it should handle this code correctly:

#define A 1 // hello there

int foo[A];

I would expect cpp to replace A with 1.

What happens is that A is replaced with 1 // hello there, which results in the following output from cpp -std=c99 test.c:

# 1 "test.c"

int foo[1 // hello there];

That is not valid C, so it fails to compile.

How can I get cpp to perform the proper replacement?

Note on compiler: I'm using cpp from the latest Xcode (8.2.1, Dec 2016) on a Mac, so I doubt it's due to an outdated compiler.

3 answers
做个烂人
Reply #2 · 2019-05-01 15:22

Somewhat to my surprise, I can reproduce the problem on my Mac (macOS Sierra 10.12.2; Apple LLVM version 8.0.0 (clang-800.0.42.1)) using /usr/bin/cpp, which is the Xcode cpp, but not using GNU cpp (which I invoke using just cpp).

Workarounds include:

/usr/bin/gcc -E -std=c99 test.c

This uses the clang wrapper gcc to run the C preprocessor, and it handles the // comment correctly. You could add a -v option to see what it runs; I didn't see it running cpp per se (it runs clang -cc1 -E with lots of other information).

You can also use:

clang -E -std=c99 test.c

It's effectively the same thing.

You could also install GCC and use that instead of the Xcode tools. There are questions with answers about how to get that done (but it isn't for the faint of heart).

看我几分像从前
Reply #3 · 2019-05-01 15:31

From the C11 specification (emphasis added):

5.1.1.2 Translation phases

The precedence among the syntax rules of translation is specified by the following phases (note 6).

  1. [...] multibyte characters are mapped [...] to the source character set [...] Trigraph sequences are replaced [...]

  2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines [...]

  3. The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). [...] Each comment is replaced by one space character. [...]

  4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. [...]

where note 6) states:

Implementations shall behave as if these separate phases occur, even though many are typically folded together in practice. Source files, translation units, and translated translation units need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation. The description is conceptual only, and does not specify any particular implementation.

Hence, an implementation conforming to the C11 specification is not required to have a separate preprocessor, which means that the cpp command can do whatever it wants. The compiler driver, in turn, is allowed to perform phases 1 through 3 by any means it wants. So the correct way to get the output after preprocessing is to invoke the compiler driver with cc -E.
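The phase ordering quoted above can also be checked from inside a program. What follows is only a minimal sketch, assuming the file is compiled through the compiler driver in C99 mode or later (for example cc -std=c99); the names STR_, STR, expansion_of_A, foo_length and check_expansion are invented for this illustration. Since the // comment is turned into a space in phase 3, before the #define is executed in phase 4, the replacement list of A should contain only 1:

```c
#include <assert.h>
#include <string.h>

#define A 1 // hello there

/* Two-level stringification so that A is macro-expanded before # is
   applied. STR_ and STR are illustrative names, not from the question. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Only compiles if A expands to a valid integer constant expression. */
int foo[A];

/* Returns the text that A expands to, as a string literal. */
const char *expansion_of_A(void) {
    return STR(A);
}

/* Returns the number of elements in foo. */
size_t foo_length(void) {
    return sizeof foo / sizeof foo[0];
}

/* Aborts if the comment leaked into the macro expansion. */
void check_expansion(void) {
    assert(strcmp(expansion_of_A(), "1") == 0);
    assert(foo_length() == 1);
}
```

Calling check_expansion() from a main function should pass silently when the comment is stripped before macro expansion. If a tool leaves the comment text in the expansion, as in the question, the declaration of foo would not compile in the first place.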

ゆ 、 Hurt°
Reply #4 · 2019-05-01 15:34

Note that // is not a valid C90 comment. It was introduced in C99, so make sure your compiler and preprocessor know they're to use the C99 standard. In many compilers that's -std=c99. (The question has since been edited to make that clear.)


Next, I don't believe the preprocessor cares about comments. Section 6.10 of the C99 spec shows the grammar of preprocessor directives, and nowhere does it mention comments...

The ANSI C standard makes it clear comments are supposed to be replaced in 2.1.1.2 "Translation Phases" phase 3 (5.1.1.2 in C99). (Drawing from this other answer).

  3. The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.
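The "each comment is replaced by one space character" rule in phase 3 can be observed with the # stringification operator. This is a small sketch, assuming a standard-conforming compiler; the names STR, comment_as_space, slashes_in_literal and check_phase3 are made up for the example. It also shows that // inside a string literal is part of a single preprocessing token and does not start a comment:

```c
#include <assert.h>
#include <string.h>

/* Single-level stringification: turns the argument's spelling into a
   string literal. STR is an illustrative name. */
#define STR(x) #x

/* The comment between x and y becomes one space in phase 3, so the
   two tokens stay separate and the result is "x y", not "xy". */
const char *comment_as_space(void) {
    return STR(x/* removed in phase 3 */y);
}

/* A string literal is one preprocessing token, so the // inside it
   is ordinary text and survives preprocessing. */
const char *slashes_in_literal(void) {
    return "1 // hello there";
}

/* Aborts if either behavior differs from what phase 3 describes. */
void check_phase3(void) {
    assert(strcmp(comment_as_space(), "x y") == 0);
    assert(strstr(slashes_in_literal(), "//") != NULL);
}
```

Calling check_phase3() from a main function should pass silently on a conforming implementation.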

Older tools might not have followed that either because they predate any C standard or they had bugs or they interpreted the standard differently. They've likely retained those bugs/quirks for backwards compatibility. Testing with clang -E -std=c99 vs /usr/bin/cpp -std=c99 confirms this. They behave differently despite being the same compiler under the hood.

$ /usr/bin/cpp --version
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

$ clang --version
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

$ ls -l /usr/bin/cpp
-rwxr-xr-x 1 root wheel 18240 Dec 10 01:04 /usr/bin/cpp
$ ls -l /usr/bin/clang
-rwxr-xr-x 1 root wheel 18240 Dec 10 01:04 /usr/bin/clang


$ /usr/bin/cpp -std=c99 test.c
# 1 "test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 330 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "test.c" 2


int foo[1 // hello there];

$ /usr/bin/clang -E -std=c99 test.c
# 1 "test.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 331 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "test.c" 2


int foo[1];

I suspect invoking clang as /usr/bin/cpp is causing bug/quirk compatibility with the original behavior of cpp established back when the behavior was unclear.

I guess the lesson here is to use cc -E rather than cpp to ensure consistent behavior.
