I've got a Python script that outputs Unicode to the console, and I'd like to redirect it to a file. Apparently, the redirect process in Python involves converting the output to a string, so I get errors about inability to decode Unicode characters.
So then, is there any way to perform a redirect into a file encoded in UTF-8?
When printing to the console, Python looks at sys.stdout.encoding to determine the encoding to use to encode unicode objects before printing. When redirecting output to a file, sys.stdout.encoding is None, so Python 2 defaults to the ascii encoding. (In contrast, Python 3 uses the locale's preferred encoding, which is utf-8 on most modern systems.) This often leads to an exception (a UnicodeEncodeError) when printing unicode.

You can avoid the error by explicitly encoding the unicode yourself before printing:
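A minimal sketch of the explicit-encoding approach, not the answer's original snippet. The helper name `write_utf8` and the in-memory `buf` stream are hypothetical, chosen only for illustration; the code is written so it runs on Python 3, where encoded bytes have to go to a binary stream.

```python
import io


def write_utf8(text, stream):
    """Encode text as UTF-8 and write the bytes to a binary stream (hypothetical helper)."""
    data = text.encode("utf-8")
    stream.write(data)
    return data


# An in-memory binary stream stands in for a redirected stdout here.
buf = io.BytesIO()
write_utf8(u"caf\u00e9 \u2603\n", buf)
```

In Python 3 the real target would be `sys.stdout.buffer`; in Python 2 the equivalent is simply printing `text.encode('utf-8')`.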
Or you could redefine sys.stdout so all output is encoded in utf-8:
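One way this could look is to wrap the underlying stream with `codecs.getwriter`. This is a sketch under assumptions rather than the answer's original code: `utf8_writer` and `sink` are hypothetical names, and an in-memory stream is used so the example does not actually replace `sys.stdout`.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so text written to it is encoded as UTF-8 (hypothetical helper)."""
    return codecs.getwriter("utf-8")(binary_stream)


# Demonstrate on an in-memory stream instead of the real stdout.
sink = io.BytesIO()
out = utf8_writer(sink)
out.write(u"\u00fcber\n")
```

To apply it to real output, you would assign `sys.stdout = utf8_writer(sys.stdout.buffer)` in Python 3, or `sys.stdout = utf8_writer(sys.stdout)` in Python 2, before any printing happens.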
Under Linux, you can use tee and redirect stderr to /dev/null. This way, you also don't need to modify your Python script.
Set the environment variable PYTHONIOENCODING to the encoding you want before redirecting a Python script to a file. Then you won't have to modify the original script. Make sure to write Unicode strings as well, otherwise PYTHONIOENCODING will have no effect. If you write byte strings, the bytes are sent as-is to the terminal (or redirected file).

This should do the job.
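The PYTHONIOENCODING approach can be tried out with a small sketch like the one below. It is an illustration under assumptions, not part of the original answers: `run_redirected`, `out_path`, and the one-line child script are hypothetical. It launches a child interpreter with the variable set and its stdout redirected to a file, then reads back the raw bytes.

```python
import os
import subprocess
import sys
import tempfile


def run_redirected(code, path, encoding="utf-8"):
    """Run a Python snippet with stdout redirected to `path` and PYTHONIOENCODING set."""
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    with open(path, "wb") as out:
        subprocess.check_call([sys.executable, "-c", code], stdout=out, env=env)


# The child prints a Unicode string; the raw string keeps the escape for the child to interpret.
out_path = os.path.join(tempfile.mkdtemp(), "out.txt")
run_redirected(r"print(u'\u20ac 5')", out_path)

with open(out_path, "rb") as f:
    redirected_bytes = f.read()
```

From a POSIX shell, the same effect comes from prefixing the command, for example `PYTHONIOENCODING=utf-8 python script.py > out.txt`, where `script.py` and `out.txt` are placeholder names.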