Include )" in raw string literal without terminating it

Posted 2020-02-18 03:05

Question:

The two characters )" terminate the raw string literal in the example below.
The sequence )" could appear in my text at some point, and I want the string to continue even if this sequence is found within it.

R"(  
    Some Text)"  
)";       // ^^

How can I include the sequence )" within the string literal without terminating it?

Answer 1:

Raw string literals let you specify an almost arbitrary* delimiter:

//choose ### as the delimiter so only )###" ends the string
R"###(  
    Some Text)"  
)###";  

*The exact rules are: "any member of the basic source character set except: space, the left parenthesis (, the right parenthesis ), the backslash \, and the control characters representing horizontal tab, vertical tab, form feed, and newline" (N3936 §2.14.5 [lex.string] grammar) and "at most 16 characters" (§2.14.5/2)



Answer 2:

Escaping won't help because this is a raw literal, but the syntax is designed to let you mark the start and end unambiguously by adding a short arbitrary delimiter such as aha.

R"aha(  
    Some Text)"  
)aha";

By the way, note the order of ) and " at the end, which is the opposite of your example.


As for the formal rules: at first sight, when studying the standard, it might seem as if escaping works the same in raw string literals as in ordinary literals. One knows that it doesn't, so how is that possible when the rules note no exception? When raw string literals were introduced in C++11, the standard added a step that undoes the earlier translation phases inside a raw string (for example, the handling of universal-character-names), to wit:

C++11 §2.5/3

Between the initial and final double quote characters of the raw string, any transformations performed in phases 1 and 2 (trigraphs, universal-character-names, and line splicing) are reverted; this reversion shall apply before any d-char, r-char, or delimiting parenthesis is identified.

This takes care of Unicode character specifications (the universal-character-names like \u0042), which although they look and act like escapes are formally, in C++, not escape sequences.

The true formal escapes are handled, or rather not handled, by a custom grammar rule for the content of a raw string literal. Namely, in C++11 §2.14.5 the raw-string grammar entity is defined as

" d-char-sequence_opt ( r-char-sequence_opt ) d-char-sequence_opt "

where an r-char-sequence is defined as a sequence of r-char, each of which is

any member of the source character set, except a right parenthesis ) followed by the initial d-char-sequence [like aha above] (which may be empty) followed by a double quote "


Essentially, the rules above mean that not only can you not use escapes directly in raw strings (which is much of the point, and a benefit rather than a drawback), you can't use universal-character-names directly either.

Here's how to do it indirectly:

#include <iostream>
using namespace std;

auto main() -> int
{
    cout << "Ordinary string with a '\u0042' character.\n";
    cout << R"(Raw string without a '\u0042' character, and no \n either.)" "\n";
    cout << R"(Raw string without a '\u0042' character, i.e. no ')" "\u0042" R"(' character.)" "\n";
}

Output:

Ordinary string with a 'B' character.
Raw string without a '\u0042' character, and no \n either.
Raw string without a '\u0042' character, i.e. no 'B' character.


Answer 3:

You can use a custom delimiter:

R"aaa(  
    Some Text)"  
)aaa"; 

Here aaa will be your string delimiter.