How to convert string containing unicode escape \u

Posted 2019-02-20 17:48

I have been trying to get this working since this morning.

My sample.txt:

choice = \u9078\u629e

Code:

with open('sample.txt', encoding='utf-8') as f:
    for line in f:
        print(line)
        print("選択" in line)
        print(line.encode('utf-8').decode('utf-8'))
        print(line.encode().decode('utf-8'))
        print(line.encode('utf-8').decode())
        print(line.encode().decode('unicode-escape').encode("latin-1").decode('utf-8')) # as suggested.

Output:
choice = \u9078\u629e
False
choice = \u9078\u629e
choice = \u9078\u629e
choice = \u9078\u629e
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 9-10: ordinal not in range(256)

When I type the same string as a literal in the IPython qtconsole, it works:

In [29]: "choice = \u9078\u629e"
Out[29]: 'choice = 選択'

So the question is: how can I read a text file containing Unicode-escaped text like \u9078\u629e (I don't know exactly what it's called) and convert it to the actual characters, like 選択?

Tags: python python-3.x unicode python-unicode

1 answer

别忘想泡老子

#2 · 2019-02-20 18:28

If you read it from a file, just specify the encoding when opening it:

with open('test.txt', encoding='unicode-escape') as f:
    a = f.read()
print(a)

# choice = 選択

with test.txt containing:

choice = \u9078\u629e
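
To try the file-based approach without creating test.txt by hand, here is a minimal self-contained sketch. The temporary directory and the `path` variable are only illustrative stand-ins, and the sketch assumes the file contains nothing but ASCII text and \u escapes, as in the question.

```python
import os
import tempfile

# Hypothetical stand-in for test.txt: a temporary file holding the
# literal backslash-u sequences, exactly as they would appear on disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="ascii") as f:
        f.write("choice = \\u9078\\u629e\n")

    # Let the codec interpret the escapes while the file is being read.
    with open(path, encoding="unicode-escape") as f:
        text = f.read()

print(text)
```

Note that the unicode-escape codec treats any raw non-ASCII bytes as Latin-1, so this is probably only safe for files that are pure ASCII apart from the escapes.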

If you already have the text in a string, you can convert it like this (this works as long as the rest of the string is plain ASCII):

a = "choice = \\u9078\\u629e"
a.encode().decode('unicode-escape')
# 'choice = 選択'
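
The encode/decode round trip above can mangle a string that already contains real non-ASCII characters next to the escapes, because unicode-escape reads the encoded bytes as Latin-1. As a hedged alternative, not part of the original answer, the sketch below replaces only four-digit \uXXXX sequences with a regular expression. The helper name `decode_u_escapes` is made up for illustration, and it does not handle surrogate pairs, \UXXXXXXXX escapes, or escaped backslashes.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace each \\uXXXX sequence with its character; leave the rest alone.

    Hypothetical helper for illustration only.
    """
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# A string mixing real characters with escaped ones.
mixed = "選択 = \\u9078\\u629e"
converted = decode_u_escapes(mixed)

# The round trip from the answer, shown for comparison: the literal
# characters at the start do not survive it.
round_trip = mixed.encode().decode("unicode-escape")

print(converted)
print(converted == round_trip)
```

For input that is pure ASCII plus escapes, both approaches should give the same result, so the simpler one-liner is usually enough.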
