String literal concatenation fails when prefixed s

Posted 2020-04-11 03:06

In MSVS2013, which I believe to be C++11 compliant, the compiler doesn't like the following:

LPCTSTR str = _T("boo " "hoo");

which translates to:

wchar_t const * str = L"boo " "hoo";

According to cppreference.com (which I know is not definitive, but it's the only reference I have at the moment):

  • String literals placed side-by-side are concatenated during compilation. That is, "Hello," " world!" yields the (single) string "Hello, world!".
    • If the two strings have the same encoding prefix (or neither has one), the resulting string will have the same encoding prefix (or no prefix).
    • If one of the strings has an encoding prefix and the other doesn't, the one that doesn't will be considered to have the same encoding prefix as the other.
    • If a UTF-8 string literal and a wide string literal are side by side, the program is ill-formed.
    • Any other combination of encoding prefixes may or may not be supported by the implementation. The result of such a concatenation is implementation-defined.

The emphasis is my own.
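To make the quoted rules concrete, here is a minimal sketch of how I read them, assuming a C++11-conforming compiler. The function names are made up for illustration and are not from any library.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

// Neither literal has a prefix: the result is an ordinary narrow string.
const char* narrow_joined() { return "Hello," " world!"; }

// Both literals carry the same prefix: the result keeps that prefix.
const wchar_t* wide_joined() { return L"Hello," L" world!"; }

// Only one literal has a prefix: per the quoted rule, the unprefixed
// literal is treated as if it had the same prefix (wide here).
const wchar_t* mixed_joined() { return L"Hello," " world!"; }

// A UTF-8 literal next to a wide literal is ill-formed, so this line
// is left commented out:
// const wchar_t* bad = u8"a" L"b";

// Hypothetical helper that checks the three well-formed cases at run time.
void check_rules() {
    assert(std::strcmp(narrow_joined(), "Hello, world!") == 0);
    assert(std::wcscmp(wide_joined(), L"Hello, world!") == 0);
    assert(std::wcscmp(mixed_joined(), L"Hello, world!") == 0);
}
```

If the quoted text is accurate, `mixed_joined` is the case my `_T("boo " "hoo")` line falls under.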

Can anyone confirm if this is in the standard as indicated by cppreference?

EDIT

By doesn't like, I mean I get the following error:

error C2308: concatenating mismatched strings

2 Answers
祖国的老花朵
#2 · 2020-04-11 03:22

The 2003 ISO C++ standard, section 2.13.4p3, says:

In translation phase 6 (2.1), adjacent narrow string literals are concatenated and adjacent wide string literals are concatenated. If a narrow string literal token is adjacent to a wide string literal token, the behavior is undefined. Characters in concatenated strings are kept distinct.

The 2011 standard, section 2.14.5p13, says:

In translation phase 6 (2.2), adjacent string literals are concatenated. If both string literals have the same encoding-prefix, the resulting concatenated string literal has that encoding-prefix. If one string literal has no encoding-prefix, it is treated as a string literal of the same encoding-prefix as the other operand. If a UTF-8 string literal token is adjacent to a wide string literal token, the program is ill-formed. Any other concatenations are conditionally supported with implementation-defined behavior.

So the sequence L"boo " "hoo" has undefined behavior in C++03 but is well defined and equivalent to L"boo hoo" in C++11.
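As a rough way to check the C++11 behavior on a given compiler, something like the following sketch could be used. It assumes the compiler is in C++11 mode or later; `boo_hoo` and `check_boo_hoo` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cwchar>

// Under the C++11 rule the unprefixed "hoo" is treated as a wide literal,
// so the concatenation has the same array type as L"boo hoo".
static_assert(sizeof(L"boo " "hoo") == sizeof(L"boo hoo"),
              "mixed concatenation should yield a wide literal of the same size");

// Returns the concatenated literal from the question.
const wchar_t* boo_hoo() { return L"boo " "hoo"; }

// Run-time comparison of the contents, not just the size.
void check_boo_hoo() {
    assert(std::wcscmp(boo_hoo(), L"boo hoo") == 0);
}
```

A compiler that rejects this translation unit outright, rather than merely warning, is not following the 2011 wording for this construct.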

I can't tell from the information you've given us whether MSVS2013 conforms to C++11. You say it "doesn't like" the construct, but if the dislike is expressed as a non-fatal warning and the semantics are as specified in the 2011 standard, then it could be conforming.

Can you update the question to show the diagnostic message?

Anthone
#3 · 2020-04-11 03:38

From N3797, §2.14.5/13 [lex.string]

In translation phase 6 (2.2), adjacent string literals are concatenated. If both string literals have the same encoding-prefix, the resulting concatenated string literal has that encoding-prefix. If one string literal has no encoding-prefix, it is treated as a string literal of the same encoding-prefix as the other operand.

The table following that even lists an example that's the same as what you've shown:

// Source         Means
L"a" "b"          L"ab"

So I'd say your code is well-formed and this is a Visual Studio bug.
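Until the compiler accepts the mixed form, a likely workaround is to give every piece its own prefix, so the compiler only ever sees matching literals. The sketch below uses a hypothetical `WIDE_T` macro as a stand-in for `_T` in a Unicode build (the real macro lives in `<tchar.h>`). The function names are also made up, and I have not verified this against MSVS2013 itself.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical stand-in for _T in a Unicode build: pastes L onto the
// first token of its argument only.
#define WIDE_T(x) L##x

// Expands to L"boo " "hoo": the mixed form from the question.
// This is the line a compiler reporting C2308 would reject.
const wchar_t* original_form() { return WIDE_T("boo " "hoo"); }

// Expands to L"boo " L"hoo": both literals are wide, so no mixing occurs.
const wchar_t* workaround_form() { return WIDE_T("boo ") WIDE_T("hoo"); }

// On a conforming C++11 compiler both forms yield the same string.
void check_workaround() {
    assert(std::wcscmp(original_form(), workaround_form()) == 0);
    assert(std::wcscmp(workaround_form(), L"boo hoo") == 0);
}
```

With the real macro, that would mean writing `_T("boo ") _T("hoo")` instead of `_T("boo " "hoo")`.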
