GCC supports -fshort-wchar, which switches wchar_t from four bytes to two.
What is the best way to detect the size of wchar_t at compile time, so I can map it correctly to the appropriate UTF-16 or UTF-32 type?
At least until C++0x is released and gives us stable UTF-16 and UTF-32 character types (char16_t and char32_t).
#if ?what_goes_here?
typedef wchar_t Utf32;
typedef unsigned short Utf16;
#else
typedef wchar_t Utf16;
typedef unsigned int Utf32;
#endif
You can use the macros __WCHAR_MAX__ and __WCHAR_TYPE__, which are predefined by GCC. You can check their values with echo "" | gcc -E - -dM
Because the value of __WCHAR_TYPE__ can vary from int to short unsigned int or long int, the best test is, IMHO, to check whether __WCHAR_MAX__ is above 2^16.
#if __WCHAR_MAX__ > 0x10000
typedef ...
#endif
template<int>
struct blah;
template<>
struct blah<4> {
typedef wchar_t Utf32;
typedef unsigned short Utf16;
};
template<>
struct blah<2> {
typedef wchar_t Utf16;
typedef unsigned int Utf32;
};
typedef blah<sizeof(wchar_t)>::Utf16 Utf16;
typedef blah<sizeof(wchar_t)>::Utf32 Utf32;
You can use the standard macro WCHAR_MAX:
#include <wchar.h>
#if WCHAR_MAX > 0xFFFFu
// ...
#endif
The WCHAR_MAX macro is defined by the ISO C and ISO C++ standards (see ISO/IEC 9899, 7.18.3 "Limits of other integer types", and ISO/IEC 14882, C.2), so you can use it safely on almost all compilers.
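To make the WCHAR_MAX check concrete, here is a minimal sketch that fills in the typedefs from the question. It assumes that <stdint.h> is available and uses uint16_t and uint32_t in place of unsigned short and unsigned int (that substitution is mine, not part of the answer above), so the non-wchar_t type has a known width.

```cpp
#include <stdint.h>  // uint16_t, uint32_t (assumed available; C99 header)
#include <wchar.h>   // WCHAR_MAX

#if WCHAR_MAX > 0xFFFFu
// wchar_t can hold values above 16 bits: treat it as the UTF-32 unit.
typedef wchar_t  Utf32;
typedef uint16_t Utf16;
#else
// wchar_t tops out at 0xFFFF (e.g. GCC with -fshort-wchar): treat it as the UTF-16 unit.
typedef wchar_t  Utf16;
typedef uint32_t Utf32;
#endif
```

Because the test runs in the preprocessor, the same header should work in both C and C++ translation units.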
As Luther Blissett said, wchar_t exists independently from Unicode - they are two different things.
If you really mean UTF-16, be aware that some Unicode characters map to two 16-bit words, a surrogate pair (U+10000..U+10FFFF, although these are rarely used in Western countries/languages).
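To illustrate the two-word case, here is a rough sketch of how a code point is split into UTF-16 units. The function names (utf16_length, high_surrogate, low_surrogate, encode_utf16) are hypothetical helpers made up for this example, and the code assumes a C++11 compiler for constexpr.

```cpp
#include <stdint.h>

// Number of UTF-16 code units needed for a code point: 1, 2, or 0 if invalid.
constexpr int utf16_length(uint32_t cp) {
    return (cp >= 0xD800u && cp <= 0xDFFFu) ? 0   // lone surrogates are not characters
         : (cp < 0x10000u)                  ? 1   // Basic Multilingual Plane
         : (cp <= 0x10FFFFu)                ? 2   // supplementary planes
                                            : 0;  // beyond the Unicode range
}

// Leading (high) surrogate for a code point in U+10000..U+10FFFF.
constexpr uint16_t high_surrogate(uint32_t cp) {
    return static_cast<uint16_t>(0xD800u + ((cp - 0x10000u) >> 10));
}

// Trailing (low) surrogate for a code point in U+10000..U+10FFFF.
constexpr uint16_t low_surrogate(uint32_t cp) {
    return static_cast<uint16_t>(0xDC00u + ((cp - 0x10000u) & 0x3FFu));
}

// Writes the UTF-16 form of cp into out and returns the number of units written.
// Returns 0 and leaves out untouched when cp is not a valid code point.
inline int encode_utf16(uint32_t cp, uint16_t out[2]) {
    const int n = utf16_length(cp);
    if (n == 1) {
        out[0] = static_cast<uint16_t>(cp);
    } else if (n == 2) {
        out[0] = high_surrogate(cp);
        out[1] = low_surrogate(cp);
    }
    return n;
}
```

For example, U+1F600 needs two units, 0xD83D followed by 0xDE00, so a 16-bit wchar_t cannot hold it in a single element.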
The size depends on the compiler flag -fshort-wchar:
$ g++ -E -dD -fshort-wchar -xc++ /dev/null | grep WCHAR
#define __WCHAR_TYPE__ short unsigned int
#define __WCHAR_MAX__ 0xffff
#define __WCHAR_MIN__ 0
#define __WCHAR_UNSIGNED__ 1
#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
#define __SIZEOF_WCHAR_T__ 2
#define __ARM_SIZEOF_WCHAR_T 4
$ g++ -E -dD -xc++ /dev/null | grep WCHAR
#define __WCHAR_TYPE__ int
#define __WCHAR_MAX__ 2147483647
#define __WCHAR_MIN__ (-__WCHAR_MAX__ - 1)
#define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
#define __SIZEOF_WCHAR_T__ 4
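The __SIZEOF_WCHAR_T__ macro in both dumps gives the size directly, so it can also drive the typedefs from the question. This is only a sketch: it assumes GCC or a compiler that predefines the same macro, and it uses uint16_t and uint32_t from <stdint.h> rather than unsigned short and unsigned int (my substitution, to pin the widths).

```cpp
#include <stdint.h>  // uint16_t, uint32_t (assumed available)

#if !defined(__SIZEOF_WCHAR_T__)
#error "__SIZEOF_WCHAR_T__ is not predefined; this sketch assumes GCC or a compatible compiler"
#elif __SIZEOF_WCHAR_T__ == 4
// Default on most GCC targets: wchar_t is 4 bytes.
typedef wchar_t  Utf32;
typedef uint16_t Utf16;
#elif __SIZEOF_WCHAR_T__ == 2
// With -fshort-wchar: wchar_t is 2 bytes.
typedef wchar_t  Utf16;
typedef uint32_t Utf32;
#else
#error "Unexpected wchar_t size; neither 2 nor 4 bytes"
#endif
```

Unlike a range check on the maximum value, the exact comparison rejects any size other than 2 or 4 at compile time instead of silently picking a branch.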