Using Python, I need to be able to do the following operations to a workbook for excel 2007:
1.delete colums
I am looking into xlrd; however.
Can anyone please tell me how could do this?
Using Python, I need to be able to do the following operations to a workbook for excel 2007:
1.delete colums
I am looking into xlrd; however.
Can anyone please tell me how could do this?
If you're working in Python 3.x, you'll find a lot of trouble using the xlrd
/xlwt
/xlutils
family, as they're modules for Python 2.
You might consider openpyxl for working with Excel 2007 .xlsx files in Python 3.
If you just need to shift values over (disregarding formatting, etc) here's one way to do it. You can build on this:
from openpyxl import load_workbook
from openpyxl.cell import column_index_from_string as col_index
from openpyxl.cell import get_column_letter as col_letter
def del_col(s, col, cmax=None, rmax=None):
col_num = col_index(col) - 1
cols = s.columns
if isinstance(cmax, str):
cmax = col_index(cmax)
cmax = cmax or s.get_highest_column()
rmax = rmax or s.get_highest_row()
for c in range(col_num, cmax - 1):
# print("working on column %i" % c)
for r in range(0, rmax):
cols[c][r].value = cols[c+1][r].value
for r in range(0, rmax):
cols[c+1][r].value = ''
return s
if __name__ == '__main__':
wb = load_workbook('input_book.xlsx')
ws = wb.active
# or by name: ws = wb['SheetName']
col = 'D'
del_col(ws, col)
wb.save('output_book.xlsx')
This deletes a column from an xlsx worksheet using openpyxl 2.3.3. You can specify either a column number or letter:
import openpyxl.cell
def delete_column(ws, delete_column):
if isinstance(delete_column, str):
delete_column = openpyxl.cell.column_index_from_string(delete_column)
assert delete_column >= 1, "Column numbers must be 1 or greater"
for column in range(delete_column, ws.max_column + 1):
for row in range(1, ws.max_row + 1):
ws.cell(row=row, column=column).value = \
ws.cell(row=row, column=column+1).value
On its final iteration this copies None
values from column ws.max_column+1
to column ws.max_column
, wiping out the values that used to be there. While the values in the cells are correct, unfortunately ws.max_column
does not decrement.
In other answers there is talk of using Worksheet.garbage_collect()
to reset ws.max_column
, but I could only find the private method Worksheet._garbage_collect()
, so I did not use it.
UPDATE: In the end I found that deleting a lot of columns was inefficient. Hiding them is a superior solution. This preserves formatting and even more importantly, maintains the integrity of formulae that reference the hidden cells:
def hide_column(ws, column_id):
if isinstance(column_id, int):
assert column_id >= 1, "Column numbers must be 1 or greater"
column_id = openpyxl.cell.get_column_letter(column_id)
column_dimension = ws.column_dimensions[column_id]
column_dimension.hidden = True