I have a set of Excel files whose last rows share the same layout. The last row contains private information about a client (name, surname, phone). Each Excel file corresponds to one client. I need to build a single Excel file with the data for every client. I decided to automate this, so I looked at the openpyxl library. I wrote the following code, but it doesn't work correctly.
import openpyxl
import os
import glob
from openpyxl import load_workbook
from openpyxl import Workbook
import openpyxl.styles
from openpyxl.cell import get_column_letter

path_kit = 'prize_input/kit'

# creating single document
prize_info = Workbook()
prize_sheet = prize_info.active

file_array_reciever = []
for file in glob.glob(os.path.join(path_kit, '*.xlsx')):
    file_array_reciever.append(file)

row_num = 1
for f in file_array_reciever:
    f1 = load_workbook(filename=f)
    sheet = f1.active
    for col_num in range(3, sheet.max_column):
        prize_sheet.cell(row=row_num, column=col_num).value = \
            sheet.cell(row=sheet.max_row, column=col_num).value

prize_info.save("Ex.xlsx")
I get this error:
Traceback (most recent call last):
File "/Users/zkid18/PycharmProjects/untitled/excel_test.py", line 43, in <module>
f1 = load_workbook(filename=f)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 183, in load_workbook
wb.active = read_workbook_settings(archive.read(ARC_WORKBOOK)) or 0
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1229, in read
with self.open(name, "r", pwd) as fp:
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1252, in open
zinfo = self.getinfo(name)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1196, in getinfo
'There is no item named %r in the archive' % name)
KeyError: "There is no item named 'xl/workbook.xml' in the archive"
It looks like a problem with reading the file.
I don't understand why it looks for an item named 'xl/workbook.xml' in the archive.
Depending on which version you are using, this could be a bug in openpyxl. For example, in 1.6.1 a bug was introduced exhibiting this behavior. Reverting to 1.5.8 fixed it. There was a fix according to this openpyxl ticket; though the ticket doesn't say when the fix was delivered, it was committed in early 2013. I upgraded to 1.6.2 and the error went away.
You can use the xlrd library.
This script lets you transform Excel data into a list of dictionaries:
import xlrd

# Note: xlrd 2.0 and later read only .xls files; opening .xlsx needs xlrd < 2.0.
workbook = xlrd.open_workbook('your_file.xlsx', on_demand=True)
worksheet = workbook.sheet_by_index(0)

# The first row stores the column names
first_row = []
for col in range(worksheet.ncols):
    first_row.append(worksheet.cell_value(0, col))

# Transform the worksheet into a list of dictionaries, one per data row
data = []
for row in range(1, worksheet.nrows):
    elm = {}
    for col in range(worksheet.ncols):
        elm[first_row[col]] = worksheet.cell_value(row, col)
    data.append(elm)

print(data)
I guess one of your files was originally in .xls format. You can use
try:
    f1 = load_workbook(filename=f)
except Exception:
    print(f)
to find which file causes this error, then reopen it in Excel and save it as .xlsx.
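A variation on the same idea, as a minimal sketch: an .xlsx file is a ZIP archive, and the error complains about a missing 'xl/workbook.xml' member, so the standard zipfile module can flag suspicious files before openpyxl opens them. The helper name find_suspect_workbooks is hypothetical; the folder path is the one from the question.

```python
import glob
import os
import zipfile


def find_suspect_workbooks(paths):
    """Return (path, reason) pairs for files that do not look like valid .xlsx archives."""
    suspects = []
    for path in paths:
        if not zipfile.is_zipfile(path):
            # Old binary .xls files renamed to .xlsx end up here.
            suspects.append((path, "not a ZIP archive (possibly an old .xls file)"))
            continue
        with zipfile.ZipFile(path) as archive:
            if "xl/workbook.xml" not in archive.namelist():
                suspects.append((path, "no xl/workbook.xml in the archive"))
    return suspects


if __name__ == "__main__":
    # 'prize_input/kit' is the folder used in the question.
    candidates = glob.glob(os.path.join("prize_input/kit", "*.xlsx"))
    for path, reason in find_suspect_workbooks(candidates):
        print(path, "->", reason)
```

This only checks the container, not the workbook contents, so a file that passes may still fail to load for other reasons.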
I found this post searching for a solution to a similar issue,
("There is no item named '[Content_Types].xml' in the archive")
The error message made no sense in terms of my script or the file.
My script adds one sheet to an existing Excel document and updates five more.
While the script was running, I realized I had an error in my code, so I cancelled it mid-run.
After cancelling, the existing Excel file exhibited this error.
Maybe you corrupted your Excel file while working out bugs in your script?
To address this, I'm thinking of creating a temporary restore file in case of an error when using openpyxl.
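One way the restore-file idea could look, as a rough sketch using only the standard library: copy the workbook to a temporary backup before touching it, and copy the backup back if the block fails or is interrupted. The name restore_on_error and the file name in the usage comment are hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def restore_on_error(path):
    """Back up `path` to a temp file; put the backup back if the block raises."""
    fd, backup = tempfile.mkstemp(suffix=os.path.splitext(path)[1])
    os.close(fd)
    shutil.copy2(path, backup)
    try:
        yield backup
    except BaseException:
        # BaseException also covers KeyboardInterrupt (cancelling mid-run).
        shutil.copy2(backup, path)
        raise
    finally:
        os.remove(backup)


# Usage (hypothetical file name):
# with restore_on_error("report.xlsx"):
#     wb = load_workbook("report.xlsx")
#     ...  # modify sheets
#     wb.save("report.xlsx")
```

This cannot help if the process is killed outright, since no Python cleanup code runs in that case.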
I had the same issue. Make sure the file you're trying to read isn't already open in Excel.
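A related detail, offered as a sketch rather than a confirmed diagnosis: while a workbook is open, Excel typically keeps a small lock file next to it whose name starts with ~$ (for example ~$client.xlsx). A pattern such as '*.xlsx' matches that lock file too, and it is not a real workbook. The helper name list_workbooks is hypothetical.

```python
import glob
import os


def list_workbooks(folder):
    """Return .xlsx paths in `folder`, skipping Excel's '~$' lock files."""
    paths = glob.glob(os.path.join(folder, "*.xlsx"))
    return sorted(p for p in paths if not os.path.basename(p).startswith("~$"))
```

In the question's script this could stand in for the glob loop that fills file_array_reciever.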
If openpyxl still doesn't work, pandas can read the file instead.
$ pip install pandas xlrd
And this code works:
import pandas as pd
df = pd.read_excel(file_path)
Option 1:
I overcame this issue by adding read_only=True. Specifically, replace
f1 = load_workbook(filename=f)
with
f1 = load_workbook(filename=f, read_only=True)
Note: Depending on your code, read_only=True can make it very slow. If that is the case for you, you may want to try option 2.
Option 2: Open your problematic workbook in Excel, and then re-save it as a Strict Open XML Spreadsheet (*.xlsx).