Redirecting Python's stdout to a file fails

Posted 2020-02-10 05:46

I have a Python script that connects to the Twitter Firehose and sends data downstream for processing. It was working fine before, but now I'm trying to get only the text body. (This is not a question about how I should extract data from Twitter or how to encode/decode ASCII characters.) When I launch my script directly like this:

python -u fetch_script.py

It works just fine, and I can see messages arriving on the screen. For example:

root@domU-xx-xx-xx-xx:/usr/local/streaming# python -u fetch_script.py 
Cuz I'm checking you out >on Facebook<
RT @SearchlightNV: #BarryLies                

1 answer
beautiful°
#2 · 2020-02-10 06:43

Since nobody's jumped in yet, here's my shot. Python 2 sets sys.stdout.encoding when stdout is a console, but leaves it as None when stdout is redirected to a file, so printing a unicode string falls back to the ASCII codec. This script shows the problem:

import sys

msg = {'text':u'\2026'}
sys.stderr.write('default encoding: %s\n' % sys.stdout.encoding)
print msg['text']

Running it with stdout redirected shows the error:

$ python bad.py >/tmp/xxx
default encoding: None
Traceback (most recent call last):
  File "bad.py", line 5, in <module>
    print msg['text']
UnicodeEncodeError: 'ascii' codec can't encode character u'\x82' in position 0: ordinal not in range(128)
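
The redirect itself is not required to see the failure. As a minimal sketch, encoding the character named in the traceback with the ASCII codec by hand raises the same exception. The variable names here are only illustrative and are not part of the original script.

```python
# u'\x82' is the character the traceback above complains about.
text = u'\x82'

try:
    # With no encoding on stdout, Python 2's print effectively does this.
    text.encode('ascii')
    codec = None
except UnicodeEncodeError as exc:
    # exc.encoding names the codec that rejected the character.
    codec = exc.encoding

print('codec that failed: %s' % codec)
```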

Add the encoding explicitly, falling back to UTF-8 when stdout has none:

import sys

msg = {'text':u'\2026'}
sys.stderr.write('default encoding: %s\n' % sys.stdout.encoding)
encoding = sys.stdout.encoding or 'utf-8'
print msg['text'].encode(encoding)

and the problem is solved

$ python good.py >/tmp/xxx
default encoding: None
$ cat /tmp/xxx
6
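
If the script writes in more than one place, the same fallback could be wrapped in a small helper. This is only a sketch: encode_for is a made-up name, not something from the original script, and UTF-8 as the fallback is an assumption that should match whatever reads the file later. It is written to run on both Python 2 and Python 3, and it uses an in-memory byte buffer as a stand-in for a redirected stdout.

```python
import io


def encode_for(stream, text, fallback='utf-8'):
    """Encode text with the stream's encoding, or with a fallback.

    Hypothetical helper; the 'utf-8' default is an assumption.
    """
    # A redirected stdout on Python 2 has encoding None, and a binary
    # buffer has no encoding attribute at all; both use the fallback.
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding)


# io.BytesIO stands in for a redirected stdout: it has no encoding.
redirected = io.BytesIO()
redirected.write(encode_for(redirected, u'\u2026') + b'\n')
result = redirected.getvalue()
```

On Python 2, the print line in the script would then become something like sys.stdout.write(encode_for(sys.stdout, msg['text']) + '\n').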