Python 3 print() function with Farsi/Arabic charac

The output will depend basically on which platform&terminal you run your code. Let's examine the below snippet for different windows terminals running either with 2.x or 3.x:

# -*- coding: utf-8 -*-
import sys

def case1(text):
    print(text)

def case2(text):
    print(text.encode("utf-8"))

def case3(text):
    sys.stdout.buffer.write(text.encode("utf-8"))

if __name__ == "__main__":
    text = "چرا کار نمیکنی؟"

    for case in [case1, case2, case3]:
        try:
            print("Running {0}".format(case.__name__))
            case(text)
        except Exception as e:
            print(e)

        print('-'*80)

Results

Python 2.x

Sublime Text 3 3122

    Running case1
    'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
    --------------------------------------------------------------------------------
    Running case2
    b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda\xa9\xd9\x86\xdb\x8c\xd8\x9f'
    --------------------------------------------------------------------------------
    Running case3
    چرا کار نمیکنی؟--------------------------------------------------------------------------------

ConEmu v151205

    Running case1
    ┌åÏ▒Ïº ┌®ÏºÏ▒ ┘å┘à█î┌®┘å█îÏƒ
    --------------------------------------------------------------------------------
    Running case2
    'ascii' codec can't decode byte 0xda in position 0: ordinal not in range(128)
    --------------------------------------------------------------------------------
    Running case3
    'file' object has no attribute 'buffer'
    --------------------------------------------------------------------------------

Windows Command Prompt

    Running case1
    ┌åÏ▒Ïº ┌®ÏºÏ▒ ┘å┘à█î┌®┘å█îÏƒ
    --------------------------------------------------------------------------------

    Running case2
    'ascii' codec can't decode byte 0xda in position 0: ordinal not in range(128)
    --------------------------------------------------------------------------------

    Running case3
    'file' object has no attribute 'buffer'
    --------------------------------------------------------------------------------

Python 3.x

Sublime Text 3 3122

    Running case1
    'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
    --------------------------------------------------------------------------------
    Running case2
    b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda\xa9\xd9\x86\xdb\x8c\xd8\x9f'
    --------------------------------------------------------------------------------
    Running case3
    چرا کار نمیکنی؟--------------------------------------------------------------------------------

ConEmu v151205

    Running case1
    'charmap' codec can't encode characters in position 0-2: character maps to <undefined>
    --------------------------------------------------------------------------------
    Running case2
    b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda\xa9\xd9\x86\xdb\x8c\xd8\x9f'
    --------------------------------------------------------------------------------
    Running case3
    ┌åÏ▒Ïº ┌®ÏºÏ▒ ┘å┘à█î┌®┘å█îÏƒ--------------------------------------------------------------------------------

Windows Command Prompt

    Running case1
    'charmap' codec can't encode characters in position 0-2: character maps to <unde
    fined>
    --------------------------------------------------------------------------------

    Running case2
    b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda
    \xa9\xd9\x86\xdb\x8c\xd8\x9f'
    --------------------------------------------------------------------------------

    Running case3
    ┌åÏ▒Ïº ┌®ÏºÏ▒ ┘å┘à█î┌®┘å█îÏƒ----------------------------------------------------
    ----------------------------

As you can see just using sublime text3 terminal (case3) worked alright. The other terminals didn't support persian. The main point here is, it depends which terminal & platform you're using.

Solution (ConEmu specific)

Modern terminals like ConEmu allows you to work with UTF8-Encoding as explained here, so, let's try:

chcp 65001 & cmd

And then running again the script against 2.x & 3.x:

Python2.x

Running case1
��را کار نمیکنی؟[Errno 0] Error
--------------------------------------------------------------------------------
Running case2
'ascii' codec can't decode byte 0xda in position 0: ordinal not in range(128)
--------------------------------------------------------------------------------
Running case3
'file' object has no attribute 'buffer'
--------------------------------------------------------------------------------

Python3.x

Running case1
چرا کار نمیکنی؟
--------------------------------------------------------------------------------
Running case2
b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda\xa9\xd9\x86\xdb\x8c\xd8\x9f'
--------------------------------------------------------------------------------
Running case3
چرا کار نمیکنی؟--------------------------------------------------------------------------------

As you can see, now the output was succesfull with python3 case1 (print). So... moral of a fable... learn more about your tools and how to configure them properly for your use-cases ;-)

回答3:

I can't reproduce the problem. Here is my script p.py:

text = "چرا کار نمیکنی؟"
print(text)

And the result of python3 p.py:

چرا کار نمیکنی؟

Are you sure you're using python 3 ? With python2 p.py:

SyntaxError: Non-ASCII character '\xda' in file p.py on line 1, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

回答4:

And if you do the text.encode("utf-8")-part, it will show as b'\xda\x86\xd8\xb1\xd8\xa7 \xda\xa9\xd8\xa7\xd8\xb1 \xd9\x86\xd9\x85\xdb\x8c\xda\xa9\xd9\x86\xdb\x8c\xd8\x9f' (at my machine).

EDIT Sorry for the edit, but I can't comment (because not enough reputation)

Even on python 2.7, the print(text) does work. Check out this link here, which I just generated.