I'm trying to use the csv module to read a UTF-8 CSV file, and I'm having trouble writing code that works on both Python 2 and 3 because of encoding.
Here is the original code in Python 2.7:
    with open(filename, 'rb') as csvfile:
        csv_reader = csv.reader(csvfile, quotechar='\"')
        langs = next(csv_reader)[1:]
        for row in csv_reader:
            pass
But when I run it with Python 3, it doesn't like the fact that I open the file without an "encoding". I tried this:
    with codecs.open(filename, 'r', encoding='utf-8') as csvfile:
        csv_reader = csv.reader(csvfile, quotechar='\"')
        langs = next(csv_reader)[1:]
        for row in csv_reader:
            pass
Now Python 2 can't decode the line in the "for" loop. So... how should I do it?
Update: While the code in my original answer works, I have since released a small package at https://pypi.python.org/pypi/csv342 that provides a Python 3-like interface for Python 2. So independent of your Python version you can simply do an
Original answer: Here's a solution that even with Python 2 actually decodes the text to Unicode strings and consequently works with encodings other than UTF-8.
The code below defines a function csv_rows() that returns the contents of a file as a sequence of lists. Example usage:

Here are the two variants for csv_rows(): one for Python 3+ and another for Python 2.6+. At runtime it automatically picks the proper variant. UTF8Recoder and UnicodeReader are verbatim copies of the examples in the Python 2.7 library documentation.

Indeed, in Python 2 the file should be opened in binary mode, but in Python 3 in text mode. Also, in Python 3 newline='' should be specified (which you forgot). You'll have to do the file opening in an if-block.
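
A minimal sketch of what opening the file in an if-block could look like. The helper names open_csv() and read_rows() are made up for illustration and are not from any answer here. It assumes the file is UTF-8 and, on Python 2, decodes each cell by hand after the csv module has split the row:

```python
import csv
import sys

PY3 = sys.version_info[0] >= 3


def open_csv(path):
    """Open path in the mode the csv module expects on this Python.

    Hypothetical helper, for illustration only.
    """
    if PY3:
        # Python 3: text mode, explicit encoding, and newline='' so the
        # csv module handles line endings inside quoted fields itself.
        return open(path, 'r', encoding='utf-8', newline='')
    # Python 2: binary mode; csv.reader then yields byte strings.
    return open(path, 'rb')


def read_rows(path):
    """Yield each CSV row as a list of Unicode strings (assumes UTF-8)."""
    with open_csv(path) as csvfile:
        for row in csv.reader(csvfile, quotechar='"'):
            if not PY3:
                # Python 2 only: decode the byte strings cell by cell.
                row = [cell.decode('utf-8') for cell in row]
            yield row
```

With this, the loop from the question becomes rows = read_rows(filename), then langs = next(rows)[1:], then for row in rows. Decoding after parsing only works for encodings such as UTF-8 in which the delimiter and quote bytes cannot occur inside a multi-byte character.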
Old question, I know, but I was looking for how to do this. Just in case someone comes across this and finds it useful.
This is how I solved mine, thanks to Lennart Regebro for the hint:
Then do what you need to do: