Question:

I need to compare a unicode string coming from a utf-8 file with a constant defined in the Python script.

I'm using Python 2.7.6 on Linux.

If I run the script below within Spyder (a Python editor), it works, but if I invoke it from a terminal, the second test fails. Do I need to import or define something in the terminal before invoking the script?

Script ("pythonscript.py"):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv

some_french_deps = []
idata_raw = csv.DictReader(open("utf8_encoded_data.csv", 'rb'), delimiter=";")
for rec in idata_raw:
    depname = unicode(rec['DEP'],'utf-8')
    some_french_deps.append(depname)

test1 = "Tarn"
test2 = "Rhône-Alpes"
if test1==some_french_deps[0]:
  print "Tarn test passed"
else:
  print "Tarn test failed"
if test2==some_french_deps[2]:
  print "Rhône-Alpes test passed"
else:
  print "Rhône-Alpes test failed"

utf8_encoded_data.csv:

DEP
Tarn
Lozère
Rhône-Alpes
Aude

Run output from Spyder editor:

Tarn test passed
Rhône-Alpes test passed

Run output from terminal:

$ ./pythonscript.py 
Tarn test passed
./pythonscript.py:20: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if test2==some_french_deps[2]:
Rhône-Alpes test failed

Answer 1:

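You are comparing a byte string (type str) with a unicode value. When the two types are compared, Python 2 implicitly converts the byte string by decoding it with the default encoding. Spyder has changed that default from ASCII to UTF-8, and because your byte string literals are encoded as UTF-8, the comparison succeeds under Spyder. In the terminal the default is still ASCII, so decoding "Rhône-Alpes" fails, Python emits the UnicodeWarning, and the two values are treated as unequal. "Tarn" contains only ASCII characters, which is why that test passes in both environments.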
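
To make the implicit conversion visible, here is a minimal sketch that performs the decoding step explicitly. The helper name decodes_equal is hypothetical, and the byte values are written with escapes so the snippet does not depend on how the file itself is saved; it only imitates what Python 2 does during the comparison, it is not the interpreter's actual code path.

```python
# -*- coding: utf-8 -*-
# The bytes a plain "Rhône-Alpes" literal holds in a UTF-8 encoded source file.
raw = b"Rh\xc3\xb4ne-Alpes"
# The unicode value produced by unicode(rec['DEP'], 'utf-8') in the question.
expected = u"Rh\xf4ne-Alpes"


def decodes_equal(byte_value, unicode_value, encoding):
    """Hypothetical helper: decode the bytes with `encoding`, then compare."""
    try:
        return byte_value.decode(encoding) == unicode_value
    except UnicodeDecodeError:
        # Python 2 issues a UnicodeWarning at this point and
        # treats the two values as unequal.
        return False


ascii_result = decodes_equal(raw, expected, "ascii")     # default in the terminal
utf8_result = decodes_equal(raw, expected, "utf-8")      # default changed by Spyder
tarn_result = decodes_equal(b"Tarn", u"Tarn", "ascii")   # ASCII-only text

print(ascii_result)
print(utf8_result)
print(tarn_result)
```

The first result is False and the other two are True, which matches the terminal and Spyder outputs in the question.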

The solution is to stop using byte strings for your two test values and use unicode literals instead:

test1 = u"Tarn"
test2 = u"Rhône-Alpes"
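
As a rough end-to-end illustration of the same idea (decode at the file boundary, compare unicode with unicode), the sketch below writes a copy of the sample data to a temporary directory and reads it back. It is an assumption-laden variant, not the script from the question: it swaps csv.DictReader for io.open with an explicit encoding, which is enough here only because the file has a single column, and the temporary path is made up for the example.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Assumption: recreate the sample file in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "utf8_encoded_data.csv")
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"DEP\nTarn\nLoz\xe8re\nRh\xf4ne-Alpes\nAude\n")

# io.open decodes while reading, so every line is already a unicode value.
with io.open(path, "r", encoding="utf-8") as f:
    lines = f.read().splitlines()

# Skip the "DEP" header row; the file has one column, so no splitting is needed.
some_french_deps = lines[1:]

# Unicode literals: no implicit byte-string conversion is involved.
test1 = u"Tarn"
test2 = u"Rh\xf4ne-Alpes"

print(test1 == some_french_deps[0])
print(test2 == some_french_deps[2])
```

Because both sides of each comparison are unicode, the result no longer depends on the interpreter's default encoding.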

Changing the system default encoding is, in my opinion, a terrible idea. Your code should use Unicode correctly instead of relying on implicit conversions; changing the rules of those implicit conversions only adds confusion and does not make the task any easier.