Python 3 UnicodeEncodeError: 'ascii' codec

Posted 2019-04-26 18:10

Question:

I've just started to learn Python, but I have already run into trouble.
I have a simple script with just one command:

#!/usr/bin/env python3
print("Příliš žluťoučký kůň úpěl ďábelské ódy.") # Text in Czech 

When I try to run this script:

python3 hello.py 

I get this message:

Traceback (most recent call last):
  File "hello.py", line 2, in <module>
    print("P\u0159\xedli\u0161 \u017elu\u0165ou\u010dk\xfd k\u016fn \xfap\u011bl \u010f\xe1belsk\xe9 \xf3dy.")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

I am using Kubuntu 16.04 and Python 3.5.2. Running export PYTHONIOENCODING=utf-8 made it work, but only temporarily: the next time I opened bash, I got the same error.

According to https://docs.python.org/3/howto/unicode.html#the-string-type the default encoding for Python source code is UTF-8.
So I have the source file saved in UTF-8 and Konsole is set to UTF-8, but I still get the error!
Even if I add

# -*- coding: utf-8 -*-

to the beginning of the file, it makes no difference.

Another weird thing: when I run it with python instead of python3, it works. How can it work in Python 2.7.12 but not in 3.5.2?

Any ideas for solving this permanently? Thank you.

Answer 1:

Thanks to Mark Tolen and Alastair McCormack for suggesting where the problem may be. The problem was really in the locale settings.
When I ran locale, the output was:

LANG=C
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC=cs_CZ.UTF-8
LC_TIME=cs_CZ.UTF-8
LC_COLLATE=cs_CZ.UTF-8
LC_MONETARY=cs_CZ.UTF-8
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT=cs_CZ.UTF-8
LC_IDENTIFICATION="C"
LC_ALL=

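The "C" value is the default POSIX locale, and its character map is plain ASCII. That is where the problem was: Python takes its output encoding from LC_CTYPE, and running locale charmap gave me ANSI_X3.4-1968, which is the formal name for ASCII and cannot represent non-ASCII characters such as the Czech ones in my script.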
I fixed this by following the Ubuntu documentation.
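
To see why that charmap produces the error, here is a minimal sketch that encodes the same sentence by hand. The helper name try_encode is made up for illustration; the point is only that Python treats ANSI_X3.4-1968 as its ascii codec, which is why the traceback mentions 'ascii'. Nothing non-ASCII is printed, so the sketch should run even under the C locale.

```python
import codecs

text = "Příliš žluťoučký kůň úpěl ďábelské ódy."

# Python resolves the charmap name reported by `locale charmap` to its ASCII codec.
codec_name = codecs.lookup("ANSI_X3.4-1968").name
print(codec_name)


def try_encode(s, encoding):
    """Return the encoded bytes, or the UnicodeEncodeError if encoding fails.

    Hypothetical helper, used here only to compare two encodings.
    """
    try:
        return s.encode(encoding)
    except UnicodeEncodeError as exc:
        return exc


ascii_result = try_encode(text, "ANSI_X3.4-1968")  # what print() attempts under LC_CTYPE=C
utf8_result = try_encode(text, "UTF-8")            # what print() attempts under cs_CZ.UTF-8

# Print only ASCII-safe summaries of the two outcomes.
print(type(ascii_result).__name__)
print(utf8_result.decode("UTF-8") == text)
```

The first call fails the same way print() did in the question, while the second round-trips cleanly.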

I added these lines to /etc/default/locale:

LANGUAGE=cs_CZ.UTF-8
LC_ALL=cs_CZ.UTF-8

Then you have to restart your session (log out and in) to apply these settings.

Running locale now returns this output:

LANG=C
LANGUAGE=cs
LC_CTYPE="cs_CZ.UTF-8"
LC_NUMERIC="cs_CZ.UTF-8"
LC_TIME="cs_CZ.UTF-8"
LC_COLLATE="cs_CZ.UTF-8"
LC_MONETARY="cs_CZ.UTF-8"
LC_MESSAGES="cs_CZ.UTF-8"
LC_PAPER="cs_CZ.UTF-8"
LC_NAME="cs_CZ.UTF-8"
LC_ADDRESS="cs_CZ.UTF-8"
LC_TELEPHONE="cs_CZ.UTF-8"
LC_MEASUREMENT="cs_CZ.UTF-8"
LC_IDENTIFICATION="cs_CZ.UTF-8"
LC_ALL=cs_CZ.UTF-8

and running locale charmap returns:

UTF-8
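
The same check can be made from Python itself. The following is a rough sketch, not part of the original fix: the function name child_stdout_encoding is hypothetical, and it simply starts a child interpreter with a modified environment and asks which encoding that interpreter chose for stdout. It also shows why PYTHONIOENCODING helped in the question: the variable overrides the locale-derived choice, but only for processes that inherit it.

```python
import os
import subprocess
import sys


def child_stdout_encoding(env_overrides):
    """Return sys.stdout.encoding as reported by a child interpreter.

    Hypothetical helper: env_overrides is a dict of environment
    variables layered on top of the current environment.
    """
    env = dict(os.environ)
    env.update(env_overrides)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # Encoding names are plain ASCII, so decoding them this way is safe.
    return output.decode("ascii").strip()


# Whatever the current session's locale gives us.
current = child_stdout_encoding({})
print("current session:", current)

# PYTHONIOENCODING takes precedence over the locale for that process.
forced = child_stdout_encoding({"PYTHONIOENCODING": "utf-8"})
print("with PYTHONIOENCODING:", forced)
```

On the Python 3.5 setup described here, the first line should change from ANSI_X3.4-1968 to UTF-8 once the new locale is active. Newer Python versions may report UTF-8 even under the C locale, so the first value depends on the interpreter as well as the session.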