What's special about R and L in the C++ preprocessor?

Posted 2019-02-02 22:30

I ran the following code through the Visual Studio 2013 preprocessor. The output surprises me.

Contents of hello.cpp:

#define A(j) #j

A(A?)
A(B?)
A(C?)
A(D?)
A(E?)
A(F?)
A(G?)
A(H?)
A(I?)
A(J?)
A(K?)
A(L?)
A(M?)
A(N?)
A(O?)
A(P?)
A(Q?)
A(R?)
A(S?)
A(T?)
A(U?)
A(V?)
A(W?)
A(X?)
A(Y?)
A(Z?)

The command:

cl /P hello.cpp

hello.i contains:

#line 1 "hello.cpp"



"A?"
"B?"
"C?"
"D?"
"E?"
"F?"
"G?"
"H?"
"I?"
"J?"
"K?"
"L"
"M?"
"N?"
"O?"
"P?"
"Q?"
"R"
"S?"
"T?"
"U?"
"V?"
"W?"
"X?"
"Y?"
"Z?"

I ran into this while trying to call A(L?p:q), which resulted in "Lp:q" which is not good for me.

Is this proper, well-defined C++? What's special about L and R in C++? If the file has the .c extension, L and R are treated identically to the rest of the alphabet. Is this related to C++11? It must be a new feature, since older versions of MSVS don't treat L and R in a special way.

And what can I do to stop MSVS 2013 from treating L and R in this special way?

2 Answers
兄弟一词,经得起流年.
Reply #2 · 2019-02-02 23:00

This appears to be a bug in the MSVC preprocessor. The good news is that depending on how picky you are with your output, you can work around the issue by putting a space after the R or L.

A(L ?p:q), // "L ?p:q"
ゆ 、 Hurt°
Reply #3 · 2019-02-02 23:13

Update

It looks like the bug report was marked as a duplicate of another report, which has an update that says:

A fix for this issue has been checked into the compiler sources. The fix should show up in the next major release of Visual C++.

Original

As remyabel pointed out, this is a reported bug. Neither gcc nor clang produces this result. According to the Visual Studio documentation, the stringizing operator # performs the following replacements (emphasis mine going forward):

White space preceding the first token of the actual argument and following the last token of the actual argument is ignored. Any white space between the tokens in the actual argument is reduced to a single white space in the resulting string literal. Thus, if a comment occurs between two tokens in the actual argument, it is reduced to a single white space. The resulting string literal is automatically concatenated with any adjacent string literals from which it is separated only by white space.

Further, if a character contained in the argument usually requires an escape sequence when used in a string literal (for example, the quotation mark (") or backslash (\) character), the necessary escape backslash is automatically inserted before the character.

which corresponds with the C++ draft standard section 16.3.2 The # operator which says:

If, in the replacement list, a parameter is immediately preceded by a # preprocessing token, both are replaced by a single character string literal preprocessing token that contains the spelling of the preprocessing token sequence for the corresponding argument. Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal. White space before the first preprocessing token and after the last preprocessing token comprising the argument is deleted. Otherwise, the original spelling of each preprocessing token in the argument is retained in the character string literal, except for special handling for producing the spelling of string literals and character literals: a \ character is inserted before each " and \ character of a character literal or string literal (including the delimiting " characters).

The only thing that relates R and L with respect to C++11 is that they have special meaning with string literals, but I don't see how that should affect this case.

It also looks like L\ and R\ produce the same issue.

They do document one non-compliant issue and it says:

Visual C++ does not behave correctly when the # (stringize) operator is used with strings that include escape sequences. In this situation, the compiler will generate Compiler Error C2017.

which does not cover this case.
