2.7 CSV module wants unicode, but doesn't want unicode

Posted 2019-04-05 04:01

Question:

csvfile_ = open(finishedFileName+num+".csv","w",newline='')
writ = csv.writer(csvfile_, dialect='excel')
firstline = unicode(str(firstline))
try:
    writ.writerow(firstline)
except TypeError:
    print firstline
    print type(firstline)
    raise

I get a TypeError: must be unicode, not str with this code. When I print the type of firstline, I see <type 'unicode'>. When I print firstline, I see ['project_number', 'project_location'] (the list is longer than that, but it continues in that style).

This program was working fine in Python 3.3. I ported it over with 3to2, switching from Unix to Windows as I did so.

How do I make this program write smoothly?

Note: According to the official documentation, this version of the csv module doesn't support Unicode input, yet the error tells me to give it Unicode input anyway.

Full exception

Traceback (most recent call last):
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 382, in <module>
    process(marketingLogExportFileName)
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 123, in process
    writing(csvfile,modified,firstline)
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 114, in writing
    writ.writerow(firstline)
TypeError: must be unicode, not str

If I take out the code to make firstline unicode, I instead get

Traceback (most recent call last):
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 382, in <module>
    process(marketingLogExportFileName)
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 123, in process
    writing(csvfile_,modified,firstline)
  File "C:\Users\urightswt\Downloads\LogModToConvert.py", line 114, in writing
    writ.writerow(firstline)
TypeError: must be unicode, not str

Answer 1:

Unfortunately, 3to2 used the io.open() call instead of the built-in Python 2 open() function. This opened the file in text mode, which, as on Python 3, expects Unicode input.

However, the Python 2 csv module does not support Unicode data; it certainly does not produce Unicode.
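To make the mismatch concrete, here is a minimal sketch (the function name is made up for illustration) showing that a file opened with io.open() in text mode refuses byte strings. On Python 2 a plain str is a byte string, which is what the csv module hands to the file, hence the "must be unicode, not str" error; on Python 3 the same call fails with a differently worded TypeError.

```python
import io
import os
import tempfile


def text_mode_rejects_bytes():
    """Return True if an io.open() text-mode file refuses a byte string."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with io.open(path, "w", newline="") as f:
            try:
                # b"..." is a plain str on Python 2 and bytes on Python 3.
                f.write(b"project_number,project_location\r\n")
            except TypeError:
                return True
            return False
    finally:
        os.remove(path)
```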

You'll either have to open the file in binary mode on Python 2:

mode = 'w'
if sys.version_info.major < 3:
    mode += 'b'
csvfile_ = open(finishedFileName + num + ".csv", mode, newline='')

or use the built-in open() call instead:

csvfile_ = open(finishedFileName + num + ".csv", 'wb')

where you have to use 'wb' as the mode anyway.

If you are trying to write out Unicode data, you'll have to encode that data before passing it to the csv.writer() object. The Examples section of the Python 2 csv module documentation includes code that makes encoding Unicode data before writing a little easier.
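As a rough sketch of the "encode before writing" step, something like the helper below could be applied to each row. The name encode_row and its py2 parameter are hypothetical (not part of the csv module), and UTF-8 is an assumed encoding; the py2 switch exists only so the behaviour can be exercised on either Python version.

```python
import sys

PY2 = sys.version_info[0] < 3


def encode_row(row, encoding="utf-8", py2=PY2):
    """Encode text cells to bytes for the Python 2 csv writer.

    On Python 3 the csv module accepts text directly, so the row is
    returned unchanged unless py2 is forced to True.
    """
    if not py2:
        return list(row)
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    return [
        cell.encode(encoding) if isinstance(cell, text_type) else cell
        for cell in row
    ]

# Usage with the names from the question (file opened in 'wb' on Python 2):
#     writ.writerow(encode_row(firstline))
```

Note that firstline has to stay a list of cells for this to work; wrapping the whole list in unicode(str(...)) turns it into a single string.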



Answer 2:

I had the same problem with open() and csv. A friend gave me the solution, which is to use open_output() instead of open(). open_output() defaults to "wb" rather than text mode.



Answer 3:

Martijn Pieters' solution using 'w' or 'wb' does not seem to work because of the newline argument. I personally get a ValueError:

ValueError: binary mode doesn't take a newline argument

I don't really understand this; I would expect io to ignore the argument rather than raise an exception. The only solution that works for me on both Python 2 and 3 is:

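import sys

# Python 2's csv module needs a binary file; Python 3's needs text mode with newline=''.
if sys.version_info.major < 3:
    csvfile = open(my_csv_file, 'rb')
else:
    csvfile = open(my_csv_file, 'r', newline='')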

This can become very heavy when you open a lot of files. Martijn's solution was cleaner in that regard, if only it worked!
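For anyone who wants to reproduce the ValueError quoted above, a small sketch (the function name and the file name are placeholders) is enough. The error is raised while the arguments are checked, so no file should be created.

```python
import io
import os
import tempfile


def binary_mode_with_newline_fails():
    """Return True if io.open() rejects newline= together with a binary mode."""
    path = os.path.join(tempfile.gettempdir(), "newline_check_example.csv")
    try:
        io.open(path, "wb", newline="").close()
    except ValueError:
        # "binary mode doesn't take a newline argument"
        return True
    return False
```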

EDIT: I think the cleanest working solution when developing a package that needs to read/write files often is to create a small utility function that can be called everywhere in the package:

import sys
import io

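def open_csv_rb(my_file):
    # Open a CSV file for reading in the mode the running Python's csv module expects.
    if sys.version_info[0] < 3:
        return io.open(my_file, 'rb')
    else:
        # newline='' lets the csv module handle line endings itself.
        return io.open(my_file, 'r', newline='', encoding='utf8')

def open_csv_wb(my_file):
    # Open a CSV file for writing in the mode the running Python's csv module expects.
    if sys.version_info[0] < 3:
        return io.open(my_file, 'wb')
    else:
        return io.open(my_file, 'w', newline='', encoding='utf8')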
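Here is a short usage sketch for these helpers. They are restated so the snippet runs on its own, with newline='' on both Python 3 branches as the csv documentation recommends; round_trip is a made-up name for illustration. It writes a few rows to a temporary file and reads them back.

```python
import csv
import io
import os
import sys
import tempfile


def open_csv_rb(my_file):
    if sys.version_info[0] < 3:
        return io.open(my_file, 'rb')
    return io.open(my_file, 'r', newline='', encoding='utf8')


def open_csv_wb(my_file):
    if sys.version_info[0] < 3:
        return io.open(my_file, 'wb')
    return io.open(my_file, 'w', newline='', encoding='utf8')


def round_trip(rows):
    """Write rows to a temporary CSV file and return them as read back."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open_csv_wb(path) as f:
            csv.writer(f, dialect='excel').writerows(rows)
        with open_csv_rb(path) as f:
            return list(csv.reader(f, dialect='excel'))
    finally:
        os.remove(path)
```

On Python 2 the cells still have to be byte strings (or ASCII-only text) for this to work; the helpers only take care of the file mode.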