C++11 introduced raw string literals, which are useful for representing quoted strings and literals with lots of special symbols, such as Windows file paths, regular expressions, etc.
std::string path = R"(C:\teamwork\new_project\project1)"; // no tab nor newline!
std::string quoted = R"("quoted string")";
std::string expression = R"([\w]+[ ]+)";
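To make the effect on escapes concrete, here is a minimal sketch comparing a raw literal with the ordinary literal that spells the same characters. The function names and the delimiter name "re" are made up for illustration only.

```cpp
#include <string>

// Raw literal: backslashes are taken verbatim, nothing is escaped.
inline std::string raw_path() { return R"(C:\teamwork\new_project\project1)"; }

// Ordinary literal spelling the same characters: every backslash is doubled.
inline std::string escaped_path() { return "C:\\teamwork\\new_project\\project1"; }

// A custom delimiter (the arbitrary name "re") lets the content itself
// contain the sequence )" without ending the literal early.
inline std::string with_delimiter() { return R"re(a)"b)re"; }
```

Both path functions should return the same string, and with_delimiter() should hold the four characters a)"b.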
These raw string literals can also be combined with encoding prefixes (u8, u, U, or L). But when no encoding prefix is specified, does the file encoding matter? Let's suppose that I have this code:
auto message = R"(Pick up a card)"; // raw string 1
auto cards = R"(
Yes, it matters, even to compile your source. You will need to use something like -finput-charset=UTF-16 to compile if you are using gcc (the same thing should apply to VS).
But IMHO, there is something more fundamental to take into account in your code. For example, std::string is a container of char, which is 1 byte large. If you are dealing with UTF-16, for instance, each code unit needs 2 bytes, so (short of a by-hand conversion) you will need at least a wchar_t (std::wstring) or, to be safer, a char16_t in C++11.
So, to use Unicode you will need a container for it and a compiling environment prepared to handle your Unicode-encoded sources.
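As a rough illustration of why the container matters, the sketch below counts code units for one character stored as UTF-16 and as UTF-32. The code point U+1F0A1 and the function names are my own choices for the example, and the character is written as a universal character name so the result does not depend on the source file encoding.

```cpp
#include <cstddef>
#include <string>

// U+1F0A1 lies outside the Basic Multilingual Plane, so UTF-16 needs a
// surrogate pair (two char16_t code units) to store it.
inline std::size_t utf16_units() { return std::u16string(u"\U0001F0A1").size(); }

// UTF-32 stores every code point in a single char32_t code unit.
inline std::size_t utf32_units() { return std::u32string(U"\U0001F0A1").size(); }
```

Here utf16_units() should give 2 and utf32_units() should give 1, which is why a container of 1-byte char cannot hold UTF-16 data without some conversion.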
Raw string literals change how escapes are dealt with but do not change how encodings are handled. Raw string literals still convert their contents from the source encoding to produce a string in the appropriate execution encoding.
The type of a string literal and the appropriate execution encoding are determined entirely by the prefix.
R alone always produces a char string in the narrow execution encoding. If the source is UTF-16 (and the compiler supports UTF-16 as the source encoding) then the compiler will convert the string literal contents from UTF-16 to the narrow execution encoding.