My code looks like this:
# -*- coding: utf-8 -*-
print ["asdf", "中文"]
print ["中文"]
print "中文"
The output in the Eclipse console is very strange:
['asdf', '\xe4\xb8\xad\xe6\x96\x87']
['\xe4\xb8\xad\xe6\x96\x87']
中文
My first question is: why did the last line get the correct output, and the others didn't?
And my second question is: how do I fix the wrong ones, so that they print the real characters instead of the escape codes that begin with "\x"?
Thank you guys!!
why did the last line get the correct output, and the others didn't?
When you print foo, what gets printed out is str(foo).
However, if foo is a list, str(foo) uses repr(bar) for each element bar, not str(bar).
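The str of a string is the string itself; the repr of a string is the string inside quotes, with non-printable and non-ASCII bytes escaped, which is where the \xe4\xb8\xad\xe6\x96\x87 in your output comes from.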
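To see the str/repr split without any encoding involved, here is a minimal sketch. The Word class is hypothetical and exists only so that its str and repr are visibly different; it is not part of the original code.

```python
class Word(object):
    """Hypothetical class whose str and repr differ visibly."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return "str(" + self.text + ")"

    def __repr__(self):
        return "repr(" + self.text + ")"


w = Word("x")
alone = str(w)       # what print uses for a single object
in_list = str([w])   # what print uses for a list: repr of each element

print(alone)
print(in_list)
```

The list version goes through __repr__ even though str was called on the list, which is the same thing that happens to the byte strings in the question.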
how do I correct the wrong ones
If you want to print the str of every element in a list, you have to do that explicitly. For example:
print '[' + ', '.join(["asdf", "中文"]) + ']'
There have been sporadic proposals to change this behavior, so str on a sequence calls str on its members. PEP 3140 is the rejected proposal. This thread from 2009 explains the design rationale behind rejecting it.
But primarily, it's either so that these two don't print the same thing:
a = 'foo, bar'
b = 'foo'
c = 'bar'
print [a]
print [b, c]
Or, paraphrasing Ned Batchelder: repr is always for geeks; str is for humans when possible, but printing lists with their brackets and commas is already for geeks.
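The ambiguity in the 'foo, bar' example above can be checked directly. This is only a sketch: str_based is a hypothetical helper that imitates what list printing would look like if it used str on each element, roughly what the rejected proposal asked for.

```python
a = 'foo, bar'
b = 'foo'
c = 'bar'


def str_based(items):
    # Hypothetical formatting: brackets and commas, but str() per element.
    return '[' + ', '.join(str(x) for x in items) + ']'


# Actual behavior: repr per element keeps the two lists distinguishable.
with_repr = (str([a]), str([b, c]))

# Imagined behavior: str per element makes them collide.
with_str = (str_based([a]), str_based([b, c]))

print(with_repr[0])
print(with_repr[1])
print(with_str[0])
print(with_str[1])
```

With repr, the quotes show where each element starts and ends; with str, a one-element list and a two-element list come out identical.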
The first two are using the __repr__ of the strings; the last one is using the __str__ method.
You could use:
print ", ".join(["asdf", "中文"])
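One caveat with a bare join is that it raises TypeError if any element is not a string. A small sketch of a more forgiving variant follows; join_as_str is a hypothetical helper name, and it assumes the elements are byte strings (as in the question) or that the code runs on Python 3, since calling str() on a Python 2 unicode object with non-ASCII characters would fail.

```python
# -*- coding: utf-8 -*-

def join_as_str(items, sep=", "):
    # Convert each element with str() first so non-string items are accepted.
    return sep.join(str(item) for item in items)


joined = join_as_str(["asdf", "中文", 42])
print(joined)
```

Wrap the result in '[' and ']' yourself if you want the output to still look like a list.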