I need to import data from a CSV file in my project, and I need an object like DictReader but with full UTF-8 support. Does anyone know of a module or app that does this?
As the answer to this post said:
You can see my example code below. I'm using your CSV file (see comments).
Output:
You can see that the 'Ñ' is correctly encoded.
Your data is NOT encoded in UTF-8. It is (mostly) encoded in cp1252. The data appears to include Spanish names. The most prevalent non-ASCII character is '\xd1' (i.e. Latin capital letter N with tilde) -- this is the character that caused the exception.
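To illustrate the point, here is a minimal sketch (written for Python 3, which is an assumption; the thread itself is about Python 2.7) showing that a lone byte 0xD1 decodes to 'Ñ' under cp1252 but is rejected as UTF-8. The sample value and the helper name `try_decode` are made up for illustration and are not taken from the asker's file.

```python
# Hypothetical sample value: the name MUÑOZ as cp1252 bytes.
raw = b"MU\xd1OZ"

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

as_utf8 = try_decode(raw, "utf-8")      # fails: 0xD1 must be followed by a continuation byte
as_cp1252 = try_decode(raw, "cp1252")   # succeeds: 0xD1 maps to U+00D1
print(repr(as_utf8), repr(as_cp1252))
```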
One of the non-ASCII characters in the file is '\x8d'. It is NOT in cp1252. It appears where the letter A should appear in the name VASQUEZ. Of the others, '\x94' (curly double quote in cp1252) appears in the middle of a name. The remaining ones may also represent errors.
I suggest that you run this little code fragment to print lines with suspicious characters in them:
and fix up the data.
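A check of this kind could look roughly like the sketch below. It is not the fragment from the original answer; it assumes Python 3, a single-byte encoding such as cp1252, and the hypothetical helper names `decodes` and `suspicious_lines`. For each line it lists the bytes at or above 0x80, and separately those that the chosen encoding does not define at all.

```python
def decodes(byte, encoding):
    """True if this single byte is defined in the given single-byte encoding."""
    try:
        byte.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

def suspicious_lines(lines, encoding="cp1252"):
    """Yield (line_number, non_ascii_bytes, undefined_bytes) for each flagged line."""
    for lineno, line in enumerate(lines, 1):
        # Iterating over bytes yields ints in Python 3; collect the distinct high bytes.
        high = sorted({bytes([b]) for b in line if b >= 0x80})
        if high:
            undefined = [b for b in high if not decodes(b, encoding)]
            yield lineno, high, undefined

# Hypothetical sample lines, standing in for a file opened in binary mode.
sample = [b"id,name\n", b"1,MU\xd1OZ\n", b"2,V\x8dSQUEZ\n"]
hits = list(suspicious_lines(sample))
for hit in hits:
    print(hit)
```

For a real file, pass a handle opened with `open(path, "rb")` instead of the sample list. Lines whose third element is non-empty contain bytes that cp1252 cannot represent, so they are the first candidates for manual repair.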
Then you need a csv DictReader with full and generalised decoding support. Full means decoding the fieldnames aka dict keys as well as the data. Generalised means no hardcoding of the encoding.
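As a rough sketch of those two requirements, assuming Python 3 rather than the Python 2.7 used in this thread: in Python 3 the csv module works on text, so decoding can be pushed into a text wrapper around the binary stream, which covers the header row (the dict keys) as well as the data. The function name `unicode_dict_reader` and the sample bytes are hypothetical. In Python 2, where csv reads byte strings, the keys and values would instead have to be decoded after parsing.

```python
import csv
import io

def unicode_dict_reader(binary_stream, encoding, **csv_kwargs):
    """Return a DictReader whose keys and values are decoded with `encoding`."""
    # The encoding is a parameter, not hardcoded. newline="" leaves line-ending
    # handling to the csv module, as its documentation recommends.
    text = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    return csv.DictReader(text, **csv_kwargs)

# Hypothetical cp1252-encoded sample with a non-ASCII header and value.
sample = b"NOMBRE,A\xd1O\r\nMU\xd1OZ,2011\r\n"
rows = list(unicode_dict_reader(io.BytesIO(sample), "cp1252"))
print(rows)
```

Extra keyword arguments such as `delimiter=";"` are passed straight through to `csv.DictReader`.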
import csv
Output:
and here is what you get with your sample file (first data row only, Python 2.7.1, Windows 7):