Read in .xlsx with csv module in python

2019-02-17 06:34发布

I'm trying to read in an excel file with .xlsx formatting with the csv module, but I'm not having any luck with it when using an excel file even with my dialect and encoding specified. Below, I show my different attempts and error results with the different encodings I tried. If anyone could point me into the correct coding, syntax or module I could use to read in a .xlsx file in Python, I'd appreciate it.

With the below code, I get the following error: _csv.Error: line contains NULL byte

#!/usr/bin/python

import sys, csv

with open('filelocation.xlsx', "r+", encoding="Latin1")  as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)

With the below code, I get the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcc in position 16: invalid continuation byte

#!/usr/bin/python

import sys, csv

with open('filelocation.xlsx', "r+", encoding="Latin1")  as inputFile:
    csvReader = csv.reader(inputFile, dialect='excel')
    for row in csvReader:
        print(row)

When I use utf-16 in the encoding, I get the following error: UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 570-571: illegal UTF-16 surrogate

1条回答
Viruses.
2楼-- · 2019-02-17 07:24

You cannot use Python's csv library for reading xlsx formatted files. You need to install and use a different library. For example, you could use xlrd as follows:

import xlrd

workbook = xlrd.open_workbook("filelocation.xlsx")
sheet = workbook.sheet_by_index(0)

for rowx in range(sheet.nrows):
    cols = sheet.row_values(rowx)
    print(cols)

This would display all of the rows in the file as lists of columns. The Python Excel website gives other possible examples.

查看更多
登录 后发表回答