Python unicode codepoint to unicode character

Posted 2019-02-04 17:55

I'm trying to write Chinese, Russian, and various other non-English character sets out to a flat file for testing purposes. I'm stuck on how to turn a Unicode hexadecimal or decimal value into its corresponding character.

For example in Python, if you had a hard coded set of characters like абвгдежзийкл you would assign value = u"абвгдежзийкл" and no problem.

If, however, you had a single decimal or hexadecimal value like 1081 / 0439 stored in a variable and you wanted to print it as its corresponding actual character (and not just output 0x439), how would this be done? The Unicode decimal/hex value above refers to й.

3 answers
We Are One
#2 · 2019-02-04 18:26

Python 2: Use unichr():

>>> print(unichr(1081))
й

Python 3: Use chr():

>>> print(chr(1081))
й
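
Since the original goal was to write test characters to a flat file, here is a minimal Python 3 sketch of that step. It assumes UTF-8 is an acceptable output encoding; the function name `write_codepoints` and the file name are hypothetical and only for illustration.

```python
import os
import tempfile

def write_codepoints(path, codepoints):
    """Write the characters for the given integer code points to a UTF-8 file."""
    text = "".join(chr(cp) for cp in codepoints)
    # An explicit encoding avoids depending on the platform default
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text

# Hypothetical usage: 1081 (decimal) and 0x0439 (hex) are the same code point
path = os.path.join(tempfile.gettempdir(), "codepoints_demo.txt")
written = write_codepoints(path, [1081, 0x0439, 0x897F])
```

Passing the encoding explicitly matters here because the default file encoding varies by platform and may not be able to represent Cyrillic or Chinese characters.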
走好不送
#3 · 2019-02-04 18:41

So the answer to the question is:

  1. convert the hexadecimal string to an integer with int(hex_value, 16)
  2. then get the corresponding character with chr().

To sum up:

>>> print(chr(int('0x897F', 16)))
西
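
If the hex values arrive in mixed notations, the two steps can be wrapped in a small helper. This is only a sketch for Python 3: the name `hex_to_char` is made up for illustration, and stripping a "U+" prefix is an extra convenience that the answer above does not cover.

```python
def hex_to_char(hex_value):
    """Return the character for a hexadecimal code point given as a string."""
    cleaned = hex_value.strip()
    # Accept the common "U+0439" notation in addition to "0439" and "0x439"
    if cleaned.upper().startswith("U+"):
        cleaned = cleaned[2:]
    # int(..., 16) accepts the digits with or without a leading "0x"
    return chr(int(cleaned, 16))

print(hex_to_char("0439"))    # й
print(hex_to_char("0x897F"))  # 西
print(hex_to_char("U+0439"))  # й
```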
疯言疯语
#4 · 2019-02-04 18:47

If you run into this error while trying to convert your hex value with unichr:

ValueError: unichr() arg not in range(0x10000) (narrow Python build)

it means you are on a narrow Python 2 build, where unichr() only accepts code points up to 0xFFFF. You can get around the error by doing something like:

>>> n = int('0001f600', 16)
>>> s = '\\U{:0>8X}'.format(n)
>>> s
'\\U0001F600'
>>> char = s.decode('unicode-escape')
>>> print(char)
😀
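
For comparison, here is a sketch of the same conversion in Python 3, where chr() accepts the full range up to 0x10FFFF, so the workaround is not needed. The escape-string route is included only to show how the technique above would be ported; the variable names are illustrative.

```python
n = int("0001f600", 16)

# Python 3: chr() covers every code point up to 0x10FFFF
direct = chr(n)

# Same escape-string technique, ported to Python 3:
# str has no decode(), so go through bytes first
escaped = "\\U{:0>8X}".format(n)
via_escape = escaped.encode("ascii").decode("unicode-escape")

print(direct == via_escape)  # True
```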