How to convert a string to unicode/byte string in

Posted 2019-09-09 09:59

I know this works:

a = u"\u65b9\u6cd5\uff0c\u5220\u9664\u5b58\u50a8\u5728"
print(a) # 方法，删除存储在

But suppose I have a string from a JSON file that has no "u" prefix (a = "\u65b9\u6cd5\uff0c\u5220\u9664\u5b58\u50a8\u5728"). In Python 2 I know how to handle it (print unicode(a, encoding='unicode_escape') # Prints 方法，删除存储在). How do I do the same in Python 3?

Similarly, if it's a byte string loaded from a file, how do I convert it?

print("好的".encode("utf-8"))  # b'\xe5\xa5\xbd\xe7\x9a\x84'
# how to convert this?
b = '\xe5\xa5\xbd\xe7\x9a\x84'  # 好的

Tags: python python-3.x unicode encode codec

1 answer

ゆ、 Hurt°

Reply #2 · 2019-09-09 10:24

If I understand correctly, the file contains the literal text \u65b9\u6cd5\uff0c\u5220\u9664\u5b58\u50a8\u5728 (so it's plain ASCII, with backslash escapes that spell out the Unicode ordinals the same way you would write them in a Python str literal). If so, there are two ways to handle this:

1. Read the file in binary mode, then call mystr = mybytes.decode('unicode-escape') to convert from bytes to str, interpreting the escapes.
2. Read the file in text mode, and use the codecs module for the "text -> text" conversion, e.g. decodedstr = codecs.decode(encodedstr, 'unicode-escape'). In Python 3, bytes-to-bytes and text-to-text codecs are supported only by the codecs module functions; bytes.decode is purely for bytes to text and str.encode is purely for text to bytes. In Py2, calling str.encode or unicode.decode was usually a mistake, and removing those dangerous methods makes it easier to understand which direction a conversion is supposed to go.
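
A minimal sketch of both approaches in Python 3. It assumes the data really is plain ASCII text containing backslash escapes; the variable names are only illustrative, and the raw string stands in for what would actually be read from the file. The last part is not covered by the answer above: it shows one common way to handle the second case from the question, assuming the str holds UTF-8 bytes that ended up as individual code points (Latin-1 maps each of them back to the original byte).

```python
import codecs

# Stand-in for the file contents: ASCII backslash escapes, not real CJK characters.
raw_text = r"\u65b9\u6cd5\uff0c\u5220\u9664\u5b58\u50a8\u5728"

# Way 1: start from bytes, as if the file had been opened in binary mode ("rb").
raw_bytes = raw_text.encode("ascii")
from_bytes = raw_bytes.decode("unicode-escape")

# Way 2: start from str, as if the file had been opened in text mode,
# and let the codecs module do the text -> text conversion.
from_text = codecs.decode(raw_text, "unicode-escape")

print(from_bytes)
print(from_text)

# Second case from the question (an assumption, not part of the answer above):
# a str whose code points are really UTF-8 byte values.
mangled = "\xe5\xa5\xbd\xe7\x9a\x84"
fixed = mangled.encode("latin-1").decode("utf-8")
print(fixed)
```

If the text genuinely comes from a JSON document, json.loads already interprets \uXXXX escapes inside JSON strings, so parsing it as JSON may make this step unnecessary.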
