Error reading csv file unicodeescape

2019-09-16 01:40发布

问题:

I have this program

import csv

with open("C:\Users\frederic\Desktop\WinPython-64bit-3.4.4.3Qt5\notebooks\scores.txt","r") as scoreFile:
    # write = w, read = r, append = a
    scoreFileReader = csv.reader(scoreFile)
    scoreList = []
    for row in scoreFileReader:
        if len (row) != 0:
            scoreList = scoreList + [row]

scoreFile.close()

print(scoreList)

Why do I get this Error ?

with open("C:\Users\frederic\Desktop\WinPython-64bit-3.4.4.3Qt5\notebooks\scores.txt","r") as scoreFile: ^ SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape.

回答1:

You need to use raw strings with Windows-style filenames:

with open(r"C:\Users\frederic\Desktop\WinPython-64bit-3.4.4.3Qt5\notebooks\scores.txt", 'r') as scoreFile:
         ^^

Otherwise, Python's string engine thinks that \U is the start of a Unicode escape sequence - which of course it isn't in this case.


Also, be careful also your scoreFile.close() is useless:

The with statement replace a try and catch. It also automatically close the file after the block. That mean you can delete the scoreFile.close() line.

Also, you can change the line if len(row) != 0 According to PEP8:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq: if seq:

No: if len(seq): if not len(seq):

One last thing, your for loop isn't good either, to read csv you better start copying an example from the doc first!

with open('eggs.csv', 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
       print ', '.join(row)


回答2:

you may try this:

import csv
import io

def guess_encoding(csv_file):
    """guess the encoding of the given file"""
    with io.open(csv_file, "rb") as f:
        data = f.read(5)
    if data.startswith(b"\xEF\xBB\xBF"):  # UTF-8 with a "BOM"
        return ["utf-8-sig"]
    elif data.startswith(b"\xFF\xFE") or data.startswith(b"\xFE\xFF"):
        return ["utf-16"]
    else:  # in Windows, guessing utf-8 doesn't work, so we have to try
        try:
            with io.open(csv_file, encoding="utf-8") as f:
                preview = f.read(222222)
                return ["utf-8"]
        except:
            return [locale.getdefaultlocale()[1], "utf-8"]

encodings = guess_encoding(r"C:\Users\frederic\Desktop\WinPython-64bit-3.4.4.3Qt5\notebooks\scores.txt")[0])

# then your code with 
with open(r"C:\Users\frederic\Desktop\WinPython-64bit-3.4.4.3Qt5\notebooks\scores.txt","r", encoding=encodings[0]) as scoreFile: