Unicode symbols in output file in Python 3.6.1

Posted 2019-08-01 14:33

I need to log connection errors to log.txt. My Windows installation uses the Russian locale. My code:

    # e is the requests.ConnectionError raised on Windows when the server is not available
    # Take the part of the error I need and convert it to str
    e_warning = str(e.args[0].reason)
    # Find the text I need in the string with "re"
    e_lst = re.findall(r'>:\s(.+)', e_warning)
    # Join the list that "re" returns back into a string
    e_str = ''.join(e_lst)
    # Convert the string to bytes
    e_str_unicode = codecs.encode(e_str, 'utf-8')
    # Decode the bytes back to a string
    e_str_utf = codecs.decode(e_str_unicode, encoding='utf-8')
    # Show the message in a warning window
    messagebox.showerror(title='Connection error', message=e_str)
    with codecs.open('log.txt', 'a', encoding='utf-8') as log:
        log.write(strftime(str("%H:%M:%S %Y-%m-%d") + str(e_str_unicode) + '\n'))

If I use "e_str_utf" in the last line, I get:

UnicodeEncodeError: 'locale' codec can't encode character '\u041f' in position 72: Illegal byte sequence

That makes sense: position 72 is the first Russian letter. If I use "e_str_unicode" in the last line there is no error, but in the log file I see:

15:25:18 2017-04-28b'Failed to establish a new connection: [WinError 10060] \xd0\x9f\xd0\xbe\xd0\xbf\xd1\x8b\xd1\x82\xd0\xba\xd0\xb0 \xd1\x83\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xb8\xd1\x82\xd1\x8c \xd1\x81\xd0\xbe\xd0\xb5\xd0\xb4\xd0\xb8\xd0\xbd\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb5 \xd0\xb1\xd1\x8b\xd0\xbb\xd0\xb0 \xd0\xb1\xd0\xb5\xd0\xb7\xd1\x83\xd1\x81\xd0\xbf\xd0\xb5\xd1\x88\xd0\xbd\xd0\xbe\xd0\xb9, \xd1\x82.\xd0\xba. \xd0\xbe\xd1\x82 \xd0\xb4\xd1\x80\xd1\x83\xd0\xb3\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xba\xd0\xbe\xd0\xbc\xd0\xbf\xd1\x8c\xd1\x8e\xd1\x82\xd0\xb5\xd1\x80\xd0\xb0 \xd0\xb7\xd0\xb0 \xd1\x82\xd1\x80\xd0\xb5\xd0\xb1\xd1\x83\xd0\xb5\xd0\xbc\xd0\xbe\xd0\xb5 \xd0\xb2\xd1\x80\xd0\xb5\xd0\xbc\xd1\x8f \xd0\xbd\xd0\xb5 \xd0\xbf\xd0\xbe\xd0\xbb\xd1\x83\xd1\x87\xd0\xb5\xd0\xbd \xd0\xbd\xd1\x83\xd0\xb6\xd0\xbd\xd1\x8b\xd0\xb9 \xd0\xbe\xd1\x82\xd0\xba\xd0\xbb\xd0\xb8\xd0\xba, \xd0\xb8\xd0\xbb\xd0\xb8 \xd0\xb1\xd1\x8b\xd0\xbb\xd0\xbe \xd1\x80\xd0\xb0\xd0\xb7\xd0\xbe\xd1\x80\xd0\xb2\xd0\xb0\xd0\xbd\xd0\xbe \xd1\x83\xd0\xb6\xd0\xb5 \xd1\x83\xd1\x81\xd1\x82\xd0\xb0\xd0\xbd\xd0\xbe\xd0\xb2\xd0\xbb\xd0\xb5\xd0\xbd\xd0\xbd\xd0\xbe\xd0\xb5 \xd1\x81\xd0\xbe\xd0\xb5\xd0\xb4\xd0\xb8\xd0\xbd\xd0\xb5\xd0\xbd\xd0\xb8\xd0\xb5 \xd0\xb8\xd0\xb7-\xd0\xb7\xd0\xb0 \xd0\xbd\xd0\xb5\xd0\xb2\xd0\xb5\xd1\x80\xd0\xbd\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xbe\xd1\x82\xd0\xba\xd0\xbb\xd0\xb8\xd0\xba\xd0\xb0 \xd1\x83\xd0\xb6\xd0\xb5 \xd0\xbf\xd0\xbe\xd0\xb4\xd0\xba\xd0\xbb\xd1\x8e\xd1\x87\xd0\xb5\xd0\xbd\xd0\xbd\xd0\xbe\xd0\xb3\xd0\xbe \xd0\xba\xd0\xbe\xd0\xbc\xd0\xbf\xd1\x8c\xd1\x8e\xd1\x82\xd0\xb5\xd1\x80\xd0\xb0'

As far as I understand, encoding='utf-8' in

    with codecs.open('log.txt', 'a', encoding='utf-8') as log:

should save the Unicode text as UTF-8 bytes in my file, but for some reason it ignores the encoding setting. Why?

1 answer
你好瞎i
#2 · 2019-08-01 15:05

First: why use codecs.open('log.txt', 'a', encoding='utf-8')? In Python 3 the built-in open() already accepts an encoding argument.

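Second: this is not right: strftime(str("%H:%M:%S %Y-%m-%d") + str(e_str_unicode) + '\n'). Only the format string should be passed to strftime; otherwise the whole message goes through the locale codec, which is where the UnicodeEncodeError comes from. Also, e_str_unicode is a bytes object, so str(e_str_unicode) produces the b'...' text you see in the log. It should be strftime("%H:%M:%S %Y-%m-%d") + e_str + '\n'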
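A minimal sketch of why the log ended up with b'...' text. The sample string is just the first Russian word from the log in the question, and the variable names are illustrative, not taken from the original code. It compares calling str() on a bytes object with decoding it:

```python
# Illustrative sample: the first Russian word from the log in the question.
text = "Попытка"

encoded = text.encode("utf-8")      # bytes, like e_str_unicode in the question
as_repr = str(encoded)              # the repr of the bytes, not the original text
decoded = encoded.decode("utf-8")   # the original str again

print(as_repr)           # ASCII-only text that starts with b'\xd0\x9f
print(decoded == text)   # True
```

So writing str(some_bytes) to a text file stores the escaped representation, regardless of the encoding the file was opened with.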

Here is a short example of how to do it:

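    from time import strftime

    # Read some text (it may contain non-ASCII characters) and echo it
    text = input()
    print(text)

    # The built-in open() handles the encoding; write str, not bytes
    with open('log.txt', 'a', encoding='utf-8') as log:
        message = strftime("%H:%M:%S %Y-%m-%d") + '=>' + text + '\n'
        log.write(message)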
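The same idea could be wrapped in a small helper for the error-logging case from the question. This is only a sketch: append_log is a hypothetical name, the message is a shortened sample of the error text, and a temporary directory stands in for the real log location.

```python
import os
import tempfile
from time import strftime


def append_log(path, message):
    """Append a timestamped line to a UTF-8 log file (hypothetical helper)."""
    # Only the format string goes to strftime; the message is joined afterwards.
    line = strftime("%H:%M:%S %Y-%m-%d") + " => " + message + "\n"
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
    return line


# Usage with a temporary file and a shortened sample message.
log_path = os.path.join(tempfile.mkdtemp(), "log.txt")
written = append_log(log_path, "[WinError 10060] Попытка установить соединение")

# Read the file back with the same encoding to check the round trip.
with open(log_path, encoding="utf-8") as log:
    content = log.read()
```

In the question's code, the call would presumably be append_log('log.txt', e_str), passing the str value and not the encoded bytes.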