Convert a symbol to its 4-digit Unicode escape representation

Posted 2019-04-30 00:10

1) How can I convert a symbol to its 4-digit Unicode escape representation in Python 2.7, e.g. "¥" to "\u00a5"?

2) How can I convert a Unicode escape back to the symbol itself on a Windows 7/8 platform, e.g. "\u00a5" to "¥"?

2 answers
Explosion°爆炸
#2 · 2019-04-30 00:45

1) Does it have to be a \u escape, or will \x do? If \x is acceptable, use the unicode_escape codec, which writes characters below U+0100 as \xNN. Otherwise, you can convert with the function below:

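# -*- coding: utf-8 -*-
def four_digit_escape(string):
    # Keep printable ASCII as-is; write every other character as \uXXXX.
    return u''.join(char if 32 <= ord(char) <= 126 else u'\\u%04x' % ord(char) for char in string)

symbol = u"hello ¥"
print symbol.encode('unicode_escape')  # hello \xa5
print four_digit_escape(symbol)        # hello \u00a5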
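
To make the difference between the codec and the hand-written function concrete, here is a minimal sketch that should run on both Python 2.7 and Python 3. The names codec_form and manual_form are mine, not part of the answer, and the yen sign is written as an escape so the file's encoding does not matter. Code points above U+FFFF do not fit in four hex digits and are not handled here.

```python
from __future__ import print_function, unicode_literals


def four_digit_escape(text):
    # Keep printable ASCII unchanged; write every other character
    # as a backslash, the letter u, and four hex digits.
    return "".join(
        ch if 32 <= ord(ch) <= 126 else "\\u%04x" % ord(ch)
        for ch in text
    )


symbol = "hello \u00a5"  # "hello" followed by the yen sign

# The codec returns bytes; decode as ASCII to compare it as text.
codec_form = symbol.encode("unicode_escape").decode("ascii")
manual_form = four_digit_escape(symbol)

print(codec_form)   # hello \xa5
print(manual_form)  # hello \u00a5
```

The codec picks the shortest escape it can, so it only produces \uXXXX for characters at U+0100 and above; the manual version always uses the four-digit form.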

2) For the reverse direction, decode with the same unicode_escape codec:

encoded_symbol = '\\u00a5'
print encoded_symbol
print encoded_symbol.decode('unicode_escape')
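
As a rough check that decoding accepts both escape styles, the sketch below decodes a byte string containing a \u escape and a \x escape and then encodes the result again. The variable names are illustrative only. It uses repr for display so it does not depend on what the console can print.

```python
from __future__ import print_function

# A byte string holding the literal escape sequences, not the symbol itself.
escaped = b"\\u00a5 and \\xa5"

# Both escape styles decode to the same character, U+00A5.
decoded = escaped.decode("unicode_escape")
print(repr(decoded))

# Encoding again gives the codec's preferred (shortest) escape form.
round_trip = decoded.encode("unicode_escape")
print(round_trip)
```

Printing the decoded character directly on a Windows console can still raise UnicodeEncodeError if the active code page has no yen sign, which is a console limitation rather than a problem with the codec.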
祖国的老花朵
#3 · 2019-04-30 00:56

The most reliable way I found to do this in Python is to first decode the byte string into unicode, take the ord of the resulting character, and plug that into a format string. It looks like this:

"\\u%04x" % ord("¥".decode("utf-8"))

There is also the built-in unichr, but it goes the other way (from a code point to a character), and on my system its result is displayed in a different escape form than the OP wanted. So the solution above is the most platform-independent approach I can think of.
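
Spelled out step by step, the approach might look like the sketch below. It starts from explicit UTF-8 bytes instead of a literal symbol so it does not depend on the source file's encoding, and it falls back to chr where unichr does not exist (Python 3). The names raw, char, escape_text and to_char are assumptions for illustration.

```python
from __future__ import print_function

raw = b"\xc2\xa5"            # UTF-8 bytes of the yen sign
char = raw.decode("utf-8")   # a one-character text string
escape_text = "\\u%04x" % ord(char)
print(escape_text)           # \u00a5

# unichr (Python 2) or chr (Python 3) maps the code point back to the character.
try:
    to_char = unichr
except NameError:
    to_char = chr

# Drop the leading backslash and "u", then parse the hex digits.
restored = to_char(int(escape_text[2:], 16))
print(restored == char)      # True
```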
