Question:

I don't know whether this is trivial or not, but I need to convert a Unicode string to an ASCII string, and I would rather not have all those escape characters around. Is it possible to do an "approximate" conversion, mapping each character to a similar-looking ASCII character?

For example: Gavin O’Connor gets converted to Gavin O\x92Connor, but I'd really like it to be just converted to Gavin O'Connor. Is this possible? Did anyone write some util to do it, or do I have to manually replace all chars?

Thank you very much! Marco

Answer 1:

Use the Unidecode package to transliterate the string.

>>> import unidecode
>>> unidecode.unidecode(u'Gavin O’Connor')
"Gavin O'Connor"

Answer 2:

b = str(a.encode('utf-8').decode('ascii', 'ignore'))

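should run without errors. Note that it drops every non-ASCII character instead of approximating it, so Gavin O’Connor becomes Gavin OConnor.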
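To see what the 'ignore' handler actually does, here is a small sketch comparing it with 'replace'. The variable names are only illustrative and are not part of the answer above.

```python
# Compare how the built-in codec error handlers treat a non-ASCII character.
text = u"Gavin O\u2019Connor"  # contains U+2019, the curly apostrophe

# The route from the answer: every UTF-8 byte of U+2019 is non-ASCII and is skipped.
via_utf8 = text.encode("utf-8").decode("ascii", "ignore")

# Encoding straight to ASCII with 'ignore' gives the same result.
dropped = text.encode("ascii", "ignore").decode("ascii")

# 'replace' keeps a visible placeholder where a character was lost.
marked = text.encode("ascii", "replace").decode("ascii")

print(via_utf8)  # Gavin OConnor
print(dropped)   # Gavin OConnor
print(marked)    # Gavin O?Connor
```

Neither handler approximates anything; they only decide whether the loss is silent or visible.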

Answer 3:

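import unicodedata

unicode_string = u"Gavin O’Connor"
# NFKD splits accented letters into a base letter plus combining marks;
# encoding with 'ignore' then drops everything that is not ASCII.
print(unicodedata.normalize('NFKD', unicode_string).encode('ascii', 'ignore').decode('ascii'))

Output:

Gavin OConnor

This turns accented letters such as é into e, but the curly apostrophe (U+2019) has no decomposition, so it is dropped rather than converted to '.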

Here's the document that describes the normalization forms: http://unicode.org/reports/tr15/
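One way to check in advance which characters normalization can rewrite is to look at their decomposition mappings. This is a minimal sketch using the standard unicodedata module; the three sample characters are arbitrary choices for illustration.

```python
import unicodedata

samples = {
    "e with acute": u"\u00e9",
    "fi ligature": u"\ufb01",
    "right single quote": u"\u2019",
}

# An empty string means the character has no decomposition, so NFKD leaves it
# unchanged and a later encode('ascii', 'ignore') simply drops it.
decompositions = {name: unicodedata.decomposition(ch) for name, ch in samples.items()}

for name, mapping in decompositions.items():
    print(name, repr(mapping))
```

Characters that report an empty mapping need an explicit replacement if they should survive the conversion.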

Answer 4:

There is a technique to strip accents from characters, but other characters need to be directly replaced. Check this article: http://effbot.org/zone/unicode-convert.htm
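The two steps described here, direct replacement followed by accent stripping, could be combined roughly as follows. This is a sketch and not the code from the linked article: the PUNCTUATION table and the to_ascii name are hypothetical, and the table is far from complete.

```python
import unicodedata

# Hypothetical replacement table for characters that have no decomposition.
PUNCTUATION = {
    u"\u2018": u"'",   # left single quotation mark
    u"\u2019": u"'",   # right single quotation mark
    u"\u201c": u'"',   # left double quotation mark
    u"\u201d": u'"',   # right double quotation mark
    u"\u2013": u"-",   # en dash
    u"\u2014": u"-",   # em dash
}

def to_ascii(text):
    """Approximate text in ASCII: replace known punctuation, then strip accents."""
    for src, dst in PUNCTUATION.items():
        text = text.replace(src, dst)
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks and any remaining non-ASCII characters are dropped here.
    return decomposed.encode("ascii", "ignore").decode("ascii")

result = to_ascii(u"Caf\u00e9 \u201cGavin O\u2019Connor\u201d")
print(result)  # Cafe "Gavin O'Connor"
```

Anything missing from the table and lacking a decomposition is still lost silently, so the table has to grow with the input data.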

Answer 5:

Try simple character replacement

str1 = "“I am the greatest”, said Gavin O’Connor"
print(str1)
print(str1.replace("’", "'").replace("“","\"").replace("”","\""))

PS: add # -*- coding: utf-8 -*- to the top of your .py file if you get an error (Python 2 requires this declaration when the source file contains non-ASCII literals).
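When the list of replacements grows, chained replace calls get hard to read. A possible alternative is a single translate call with a mapping from code points to replacement strings; the QUOTES table below is an illustrative assumption, not an exhaustive list.

```python
# Map Unicode code points to their ASCII stand-ins.
QUOTES = {
    0x2018: u"'",
    0x2019: u"'",
    0x201C: u'"',
    0x201D: u'"',
}

str1 = u"\u201cI am the greatest\u201d, said Gavin O\u2019Connor"
cleaned = str1.translate(QUOTES)
print(cleaned)  # "I am the greatest", said Gavin O'Connor
```

Writing the characters as \u escapes also avoids the source-encoding problem mentioned in the PS.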
