Just what the title says.
$ ./configure --help | grep -i ucs
--enable-unicode[=ucs[24]]
Searching the official documentation, I found this:
sys.maxunicode: An integer giving the largest supported code point for a Unicode character. The value of this depends on the configuration option that specifies whether Unicode characters are stored as UCS-2 or UCS-4.
What is not clear here is which value(s) correspond to UCS-2 and which to UCS-4.
The code is expected to work on Python 2.6+.
sysconfig reports the Unicode character size from Python's build configuration variables.
The build flags can be queried like this.
Python 2.7:
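As a rough sketch, assuming the relevant configuration variable is Py_UNICODE_SIZE (the name is not given above, so treat it as an assumption): a value of 2 would indicate a UCS-2 build and 4 a UCS-4 build. On Python 3.3 and later the variable may simply come back as None.

```python
import sysconfig  # top-level module, available from Python 2.7 on

# Py_UNICODE_SIZE is an assumed variable name:
# 2 would mean a narrow (UCS-2) build, 4 a wide (UCS-4) build.
unicode_size = sysconfig.get_config_var('Py_UNICODE_SIZE')
print(unicode_size)
```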
Python 2.6:
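On 2.6 there is no top-level sysconfig module, so a sketch under the same Py_UNICODE_SIZE assumption would go through distutils.sysconfig instead. The import guard is only there because distutils no longer ships with recent Python 3 releases.

```python
try:
    # Python 2.6 route: distutils carries its own sysconfig module.
    from distutils import sysconfig as distutils_sysconfig
except ImportError:
    # distutils is gone from recent Python 3 releases.
    distutils_sysconfig = None

unicode_size = None
if distutils_sysconfig is not None:
    # Assumed variable name; 2 = UCS-2 build, 4 = UCS-4 build.
    unicode_size = distutils_sysconfig.get_config_var('Py_UNICODE_SIZE')
print(unicode_size)
```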
Another way is to create a Unicode array and look at the itemsize:
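A minimal sketch of the array approach; the variable name bytes_per_char is only illustrative. On Python 2 the result reflects the narrow/wide build. On Python 3.3+ the 'u' typecode follows the platform's wchar_t instead, so the number no longer says anything about the build.

```python
import array

# 'u' is the unicode-character typecode. On Python 2 its item size is
# 2 bytes on a narrow (UCS-2) build and 4 bytes on a wide (UCS-4) build.
bytes_per_char = array.array('u').itemsize
print(bytes_per_char)
```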
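According to the array docs, the 'u' typecode corresponds to Python's unicode character: 2 bytes on narrow Unicode builds and 4 bytes on wide builds.
Note that the distinction between narrow and wide Unicode builds is dropped from Python 3.3 onward, see PEP 393. The 'u' typecode for array is deprecated since 3.3 and scheduled for removal in Python 4.0.
When built with --enable-unicode=ucs4, sys.maxunicode is 1114111.
When built with --enable-unicode=ucs2, sys.maxunicode is 65535.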
It's 0xFFFF (or 65535) for UCS-2, and 0x10FFFF (or 1114111) for UCS-4:
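One way to turn those two values into a check, as a sketch only: the helper name unicode_build and the two constants are made up for illustration, not part of any library.

```python
import sys

UCS2_MAX = 0xFFFF    # 65535, narrow build
UCS4_MAX = 0x10FFFF  # 1114111, wide build


def unicode_build(max_code_point=None):
    """Return 'UCS-2' or 'UCS-4' for a sys.maxunicode value (hypothetical helper)."""
    if max_code_point is None:
        max_code_point = sys.maxunicode
    if max_code_point == UCS2_MAX:
        return 'UCS-2'
    if max_code_point == UCS4_MAX:
        return 'UCS-4'
    raise ValueError('unexpected sys.maxunicode: %r' % (max_code_point,))


print(unicode_build())
```

This should run unchanged on 2.6+. On Python 3.3 and later sys.maxunicode is always 1114111, so the check only distinguishes anything on older interpreters.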
The maximum character in UCS-4 mode is defined by the maximum value representable in UTF-16.
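To see where 0x10FFFF comes from, here is a small worked example that decodes the largest possible UTF-16 surrogate pair. The function and constant names are illustrative only.

```python
# Largest high (lead) and low (trail) surrogate code units in UTF-16.
HIGH_SURROGATE_MAX = 0xDBFF
LOW_SURROGATE_MAX = 0xDFFF


def decode_surrogate_pair(high, low):
    """Combine a UTF-16 surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


max_utf16_code_point = decode_surrogate_pair(HIGH_SURROGATE_MAX, LOW_SURROGATE_MAX)
print(hex(max_utf16_code_point))  # 0x10ffff
```

Each surrogate contributes 10 bits, so the pair covers 0x10000 through 0x10000 + 0xFFFFF, which tops out at 0x10FFFF (1114111).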
I had this same issue once. I documented it for myself on my wiki at
http://arcoleo.org/dsawiki/Wiki.jsp?page=Python%20UTF%20-%20UCS2%20or%20UCS4
I wrote:
65535 is UCS-2: