Errors when trying to save command line output to

Posted 2019-02-20 00:16

I was running a Python tool and trying to save its output to a file. When I don't redirect the output, the tool runs perfectly fine. But when I try to save the output to a file, it throws the following error and aborts the program:

  File "./androdiff.py", line 118, in <module>
    main(options, arguments)
  File "./androdiff.py", line 94, in main
    ddm.show()
  File "./elsim/elsim/elsim_dalvik.py", line 772, in show
    self.eld.show()
  File "./elsim/elsim/elsim.py", line 435, in show
    i.show()
  File "./elsim/elsim/elsim_dalvik.py", line 688, in show
    print hex(self.bb.bb.start + self.offset), self.pos_instruction, self.ins.get_name(), self.ins.show_buff( self.bb.bb.start + self.offset )
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0111' in position 35: ordinal not in range(128)

I've tried command | less, command > output and command | tee output; all of them throw this error.

Please help me resolve the issue.

Thanks!

2 Answers
[Account banned]
#2 · 2019-02-20 01:15

Set the PYTHONIOENCODING environment variable explicitly when Python cannot determine the character encoding of stdout automatically, e.g. when the output is redirected to a file:

$ PYTHONIOENCODING=utf-8 python app.py > file

Don't hardcode the character encoding in your scripts if the output may go to a terminal; print Unicode strings instead and let users configure their environment.
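
To see the effect of PYTHONIOENCODING without touching the real tool, here is a minimal sketch that runs a one-line child program with its stdout piped (so it is not a terminal). The child program and the helper name run_child are illustrative assumptions, not part of androdiff; the sketch is written to run on Python 3 as well as Python 2.

```python
import os
import subprocess
import sys

# Hypothetical child program: prints the character from the traceback (U+0111).
CHILD = "print(u'\\u0111')"


def run_child(io_encoding):
    """Run CHILD with stdout piped and PYTHONIOENCODING set to io_encoding."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


# Forcing ASCII reproduces the failure; forcing UTF-8 lets the print succeed.
ascii_result = run_child("ascii")
utf8_result = run_child("utf-8")
```

With "ascii" the child should exit with a non-zero status and a UnicodeEncodeError on stderr; with "utf-8" it should exit cleanly and emit the UTF-8 bytes for the character.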

干净又极端
#3 · 2019-02-20 01:16

Encode the string explicitly before you print it:

# Build the line as one unicode string, then encode it once.
fields = (hex(self.bb.bb.start + self.offset), self.pos_instruction,
          self.ins.get_name(), self.ins.show_buff(self.bb.bb.start + self.offset))
print u' '.join(unicode(f) for f in fields).encode('utf-8')

Printing to the terminal works without this because Python detects the encoding the terminal uses (UTF-8 in your case) and encodes your string accordingly.

When you redirect the output to a file instead, Python 2 has no information about which encoding to use and falls back to ASCII, which causes your error.
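
The difference between the two cases can be reproduced without any redirection. The following is only a sketch of what each kind of output stream has to do with the character from the traceback; the variable names are illustrative.

```python
import sys

# U+0111 is the character named in the traceback.
text = u"\u0111"

# An ASCII-only stream has to do this with the string, and it fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# A UTF-8 stream does this instead: the character becomes two bytes.
utf8_bytes = text.encode("utf-8")

# Inspect what Python decided for the current stdout.
print("stdout is a terminal: %s" % sys.stdout.isatty())
print("stdout encoding: %s" % sys.stdout.encoding)
```

Running the script once directly and once with "> file" shows how the reported stdout encoding depends on where the output goes.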

As a general rule of thumb, always encode your strings before printing so that print works in all environments.

The best method may be to define your own print function for this:

def myprint(unicodestr):
    # Encode explicitly so the output does not depend on the stdout encoding.
    print unicodestr.encode('utf-8')

If you want to avoid the above and make printing with UTF-8 encoding the default, you can wrap sys.stdout:

import sys
import codecs
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

Beware of this approach! Some third-party libraries may depend on the default encoding being ASCII and break. Note that this whole mess has been resolved in Python 3, where strings are Unicode and output typically defaults to UTF-8.
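
The codecs.getwriter wrapper can be tried safely on an in-memory buffer before replacing sys.stdout for real. This is a minimal sketch, assuming an io.BytesIO object as a stand-in for a redirected, bytes-only stdout; the names raw, writer and written are illustrative.

```python
import codecs
import io

# Stand-in for a redirected stdout: a stream that only accepts bytes.
raw = io.BytesIO()

# Same kind of wrapper as above, applied to the buffer instead of sys.stdout.
writer = codecs.getwriter("utf-8")(raw)

# Unicode goes in, UTF-8 encoded bytes come out on the underlying stream.
writer.write(u"\u0111\n")

written = raw.getvalue()
```

Because only the local buffer is wrapped, nothing else in the process sees a changed stdout, which sidesteps the third-party library concern while experimenting.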
