I didn't know that C and C++ allow multicharacter literals: not 'c' (of type int in C and char in C++), but 'tralivali' (of type int!)
enum
{
ActionLeft = 'left',
ActionRight = 'right',
ActionForward = 'forward',
ActionBackward = 'backward'
};
Standard says:
C99 6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."
I found they are widely used in the C4 engine. But I suppose they are not safe when we are talking about platform-independent serialization. They can also be confusing because they look like strings. So what is the scope of usage for multicharacter literals; are they useful for something? Are they in C++ just for compatibility with C code? Are they considered a bad feature, like the goto statement, or not?
Multicharacter literals allow one to specify int values via the equivalent representation in characters. They are useful for enums, FourCC codes and tags, and non-type template parameters. With a multicharacter literal, a FourCC code can be typed directly into the source, which is handy.
The implementation in gcc is described at https://gcc.gnu.org/onlinedocs/cpp/Implementation-defined-behavior.html . Note that the value is truncated to the size of the type int, so 'efgh' == 'abcdefgh' if your ints are 4 chars wide, although gcc will issue a warning on the literal that overflows.
Unfortunately, gcc will issue a warning on all multi-character literals if -pedantic is passed, as their behavior is implementation-defined. As you can see above, it is possible for the equality of two multi-character literals to change if you switch implementations.
I don't know how extensively this is used, but "implementation-defined" is a big red flag to me. As far as I know, this could mean that the implementation could choose to ignore your character designations and just assign normal incrementing values if it wanted. It may do something "nicer", but you can't rely on that behavior across compilers (or even compiler versions). At least "goto" has predictable (if undesirable) behavior...
That's my 2c, anyway.
Edit: on "implementation-defined":
From Bjarne Stroustrup's C++ Glossary:
also...
I believe this means the comment is correct: it should at least compile, although anything beyond that is not specified. Note the advice in the definition, also.
Four character literals, I've seen and used. They map to 4 bytes = one 32 bit word. It's very useful for debugging purposes as said above. They can be used in a switch/case statement with ints, which is nice.
This (4 chars) is pretty standard (i.e., supported by GCC and VC++ at least), although the results (the actual values compiled) may vary from one implementation to another.
But over 4 chars? I wouldn't use them.
UPDATE: From the C4 page: "For our simple actions, we'll just provide an enumeration of some values, which is done in C4 by specifying four-character constants". So they are using four-character literals, as was my case.
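To make the four-character limit concrete, here is a minimal sketch of the evaluation rule that GCC documents: each character shifts the accumulated value left by one byte and is then OR-ed in. The helper gcc_style_value is hypothetical (it is not a GCC API), and the sketch assumes 8-bit chars, a 32-bit int and an ASCII execution character set. Other compilers are free to compute something else.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of any compiler: emulates the rule GCC
// documents for multi-character constants, assuming 8-bit chars and a
// 32-bit int. Each character shifts the accumulated value left by 8 bits
// and is OR-ed in, so bits pushed past 32 are lost and only the last four
// characters survive.
constexpr std::uint32_t gcc_style_value(const char* s, std::uint32_t acc = 0) {
    return *s == '\0'
               ? acc
               : gcc_style_value(s + 1, (acc << 8) | static_cast<unsigned char>(*s));
}

inline void demo_truncation() {
    // Under ASCII, 'a' is 0x61 and ends up in the most significant byte.
    assert(gcc_style_value("abcd") == 0x61626364u);
    // The leading "abcd" is shifted out entirely, as with 'abcdefgh'.
    assert(gcc_style_value("abcdefgh") == gcc_style_value("efgh"));
}
```

Under these assumptions anything longer than four characters silently collides with its last four characters, which is a good reason to stop at four.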
It makes it easier to pick out values in a memory dump.
Example:
vs.
a memory dump after the following statement:
might look like:
in the first case, vs:
using multicharacter literals. (Of course, whether it says 'stop' or 'pots' depends on byte ordering.)
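The byte-ordering remark can be checked with a small sketch. It does not use the literal itself: the constant 0x73746F70 is the value GCC's documented rule would give 'stop' under ASCII, which is an assumption here, and bytes_in_memory is a hypothetical helper that returns the bytes in the order a memory dump would show them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed value of 'stop' under GCC's documented rule with ASCII:
// 's' = 0x73, 't' = 0x74, 'o' = 0x6F, 'p' = 0x70, first character on top.
constexpr std::uint32_t kStop = 0x73746F70u;

// Hypothetical helper: copies the object representation of `value` into a
// string, i.e. the bytes in the order they sit in memory.
inline std::string bytes_in_memory(std::uint32_t value) {
    char raw[sizeof value];
    std::memcpy(raw, &value, sizeof value);
    return std::string(raw, sizeof raw);
}

inline void demo_dump() {
    const std::string seen = bytes_in_memory(kStop);
    // Big-endian machines show "stop"; little-endian ones show "pots".
    assert(seen == "stop" || seen == "pots");
}
```

Either way the tag is readable in a hex editor, which is the whole point of the trick.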
In C++14 specification draft N4527 section 2.13.3, entry 2:
Previous answers to your question pertained mostly to real machines that do support multicharacter literals. Specifically, on platforms where int is 4 bytes, a four-byte multicharacter literal is fine and can be used for convenience, as per Ferrucio's mem dump example. But, as there is no guarantee that this will ever work or work the same way on other platforms, multicharacter literals should be avoided in portable programs.
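For portable code, one possible alternative is to compute the tag from ordinary characters, so that the value is fixed by the source text instead of by the compiler's multicharacter rule. The following is only a sketch: fourcc, write_tag, describe and the enumerator names are all hypothetical, and it assumes 8-bit chars.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs exactly four characters (plus the terminator)
// with the first character in the most significant byte.
constexpr std::uint32_t fourcc(const char (&s)[5]) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(s[0])) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(s[1])) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(s[2])) << 8) |
            static_cast<std::uint32_t>(static_cast<unsigned char>(s[3]));
}

// Illustrative four-character tags; the names are made up for this sketch.
enum Action : std::uint32_t {
    ActionLeft = fourcc("left"),
    ActionStop = fourcc("stop")
};

// Writes the tag as four bytes in a fixed order, independent of the host's
// byte order, so the serialized form reads the same everywhere.
inline void write_tag(std::uint32_t code, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(code >> 24);
    out[1] = static_cast<unsigned char>(code >> 16);
    out[2] = static_cast<unsigned char>(code >> 8);
    out[3] = static_cast<unsigned char>(code);
}

// The tags are still plain integers, so they work in switch/case.
inline const char* describe(std::uint32_t code) {
    switch (code) {
        case ActionLeft: return "left";
        case ActionStop: return "stop";
        default:         return "unknown";
    }
}
```

Passing a string that is not exactly four characters long fails to compile, because the parameter only binds to a char array of size 5.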