Insert row into Excel spreadsheet using openpyxl i

2019-01-09 12:24发布

I'm looking for the best approach for inserting a row into a spreadsheet using openpyxl.

Effectively, I have a spreadsheet (Excel 2007) which has a header row, followed by (at most) a few thousand rows of data. I'm looking to insert the row as the first row of actual data, so after the header. My understanding is that the append function is suitable for adding content to the end of the file.

Reading the documentation for both openpyxl and xlrd (and xlwt), I can't find any clear cut ways of doing this, beyond looping through the content manually and inserting into a new sheet (after inserting the required row).

Given my so far limited experience with Python, I'm trying to understand if this is indeed the best option to take (the most pythonic!), and if so could someone provide an explicit example. Specifically can I read and write rows with openpyxl or do I have to access cells? Additionally can I (over)write the same file(name)?

10条回答
Root(大扎)
2楼-- · 2019-01-09 12:42

Edited Nick's solution, this version takes a starting row, the number of rows to insert, and a filename, and inserts the necessary number of blank rows.

#! python 3

import openpyxl, sys

my_start = int(sys.argv[1])
my_rows = int(sys.argv[2])
str_wb = str(sys.argv[3])

wb = openpyxl.load_workbook(str_wb)
old_sheet = wb.get_sheet_by_name('Sheet')
mcol = old_sheet.max_column
mrow = old_sheet.max_row
old_sheet.title = 'Sheet1.5'
wb.create_sheet(index=0, title='Sheet')

new_sheet = wb.get_sheet_by_name('Sheet')

for row_num in range(1, my_start):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row = row_num, column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

for row_num in range(my_start + my_rows, mrow + my_rows):
    for col_num in range(1, mcol + 1):
        new_sheet.cell(row = (row_num + my_rows), column = col_num).value = old_sheet.cell(row = row_num, column = col_num).value

wb.save(str_wb)
查看更多
Viruses.
3楼-- · 2019-01-09 12:43

I took Dallas solution and added support for merged cells:

    def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
        skip_list = []
        try:
            idx = row_idx - 1 if above else row_idx
            for (new, old) in zip(range(self.max_row+cnt,idx+cnt,-1),range(self.max_row,idx,-1)):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  print("Copying %s%d to %s%d."%(col,old,col,new))
                  source = self["%s%d"%(col,old)]
                  target = self["%s%d"%(col,new)]
                  if source.coordinate in skip_list:
                      continue

                  if source.coordinate in self.merged_cells:
                      # This is a merged cell
                      for _range in self.merged_cell_ranges:
                          merged_cells_list = [x for x in cells_from_range(_range)][0]
                          if source.coordinate in merged_cells_list:
                              skip_list = merged_cells_list
                              self.unmerge_cells(_range)
                              new_range = re.sub(str(old),str(new),_range)
                              self.merge_cells(new_range)
                              break

                  if source.data_type == Cell.TYPE_FORMULA:
                    target.value = re.sub(
                      "(\$?[A-Z]{1,3})%d"%(old),
                      lambda m: m.group(1) + str(new),
                      source.value
                    )
                  else:
                    target.value = source.value
                  target.number_format = source.number_format
                  target.font   = source.font.copy()
                  target.alignment = source.alignment.copy()
                  target.border = source.border.copy()
                  target.fill   = source.fill.copy()
            idx = idx + 1
            for row in range(idx,idx+cnt):
                for c_idx in range(1,self.max_column):
                  col = self.cell(row=1, column=c_idx).column #get_column_letter(c_idx)
                  #print("Clearing value in cell %s%d"%(col,row))
                  cell = self["%s%d"%(col,row)]
                  cell.value = None
                  source = self["%s%d"%(col,row-1)]
                  if copy_style:
                    cell.number_format = source.number_format
                    cell.font      = source.font.copy()
                    cell.alignment = source.alignment.copy()
                    cell.border    = source.border.copy()
                    cell.fill      = source.fill.copy()
                  if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                    #print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                    cell.value = re.sub(
                      "(\$?[A-Z]{1,3})%d"%(row - 1),
                      lambda m: m.group(1) + str(row),
                      source.value
                    )
查看更多
贪生不怕死
4楼-- · 2019-01-09 12:45

== Updated to a fully functional version, based on feedback here: groups.google.com/forum/#!topic/openpyxl-users/wHGecdQg3Iw. ==

As the others have pointed out, openpyxl does not provide this functionality, but I have extended the Worksheet class as follows to implement inserting rows. Hope this proves useful to others.

def insert_rows(self, row_idx, cnt, above=False, copy_style=True, fill_formulae=True):
    """Inserts new (empty) rows into worksheet at specified row index.

    :param row_idx: Row index specifying where to insert new rows.
    :param cnt: Number of rows to insert.
    :param above: Set True to insert rows above specified row index.
    :param copy_style: Set True if new rows should copy style of immediately above row.
    :param fill_formulae: Set True if new rows should take on formula from immediately above row, filled with references new to rows.

    Usage:

    * insert_rows(2, 10, above=True, copy_style=False)

    """
    CELL_RE  = re.compile("(?P<col>\$?[A-Z]+)(?P<row>\$?\d+)")

    row_idx = row_idx - 1 if above else row_idx

    def replace(m):
        row = m.group('row')
        prefix = "$" if row.find("$") != -1 else ""
        row = int(row.replace("$",""))
        row += cnt if row > row_idx else 0
        return m.group('col') + prefix + str(row)

    # First, we shift all cells down cnt rows...
    old_cells = set()
    old_fas   = set()
    new_cells = dict()
    new_fas   = dict()
    for c in self._cells.values():

        old_coor = c.coordinate

        # Shift all references to anything below row_idx
        if c.data_type == Cell.TYPE_FORMULA:
            c.value = CELL_RE.sub(
                replace,
                c.value
            )
            # Here, we need to properly update the formula references to reflect new row indices
            if old_coor in self.formula_attributes and 'ref' in self.formula_attributes[old_coor]:
                self.formula_attributes[old_coor]['ref'] = CELL_RE.sub(
                    replace,
                    self.formula_attributes[old_coor]['ref']
                )

        # Do the magic to set up our actual shift    
        if c.row > row_idx:
            old_coor = c.coordinate
            old_cells.add((c.row,c.col_idx))
            c.row += cnt
            new_cells[(c.row,c.col_idx)] = c
            if old_coor in self.formula_attributes:
                old_fas.add(old_coor)
                fa = self.formula_attributes[old_coor].copy()
                new_fas[c.coordinate] = fa

    for coor in old_cells:
        del self._cells[coor]
    self._cells.update(new_cells)

    for fa in old_fas:
        del self.formula_attributes[fa]
    self.formula_attributes.update(new_fas)

    # Next, we need to shift all the Row Dimensions below our new rows down by cnt...
    for row in range(len(self.row_dimensions)-1+cnt,row_idx+cnt,-1):
        new_rd = copy.copy(self.row_dimensions[row-cnt])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        del self.row_dimensions[row-cnt]

    # Now, create our new rows, with all the pretty cells
    row_idx += 1
    for row in range(row_idx,row_idx+cnt):
        # Create a Row Dimension for our new row
        new_rd = copy.copy(self.row_dimensions[row-1])
        new_rd.index = row
        self.row_dimensions[row] = new_rd
        for col in range(1,self.max_column):
            col = get_column_letter(col)
            cell = self.cell('%s%d'%(col,row))
            cell.value = None
            source = self.cell('%s%d'%(col,row-1))
            if copy_style:
                cell.number_format = source.number_format
                cell.font      = source.font.copy()
                cell.alignment = source.alignment.copy()
                cell.border    = source.border.copy()
                cell.fill      = source.fill.copy()
            if fill_formulae and source.data_type == Cell.TYPE_FORMULA:
                s_coor = source.coordinate
                if s_coor in self.formula_attributes and 'ref' not in self.formula_attributes[s_coor]:
                    fa = self.formula_attributes[s_coor].copy()
                    self.formula_attributes[cell.coordinate] = fa
                # print("Copying formula from cell %s%d to %s%d"%(col,row-1,col,row))
                cell.value = re.sub(
                    "(\$?[A-Z]{1,3}\$?)%d"%(row - 1),
                    lambda m: m.group(1) + str(row),
                    source.value
                )   
                cell.data_type = Cell.TYPE_FORMULA

    # Check for Merged Cell Ranges that need to be expanded to contain new cells
    for cr_idx, cr in enumerate(self.merged_cell_ranges):
        self.merged_cell_ranges[cr_idx] = CELL_RE.sub(
            replace,
            cr
        )

Worksheet.insert_rows = insert_rows
查看更多
Viruses.
5楼-- · 2019-01-09 12:46

This worked for me:

    openpyxl.worksheet.worksheet.Worksheet.insert_rows(wbs,idx=row,amount=2)

Insert 2 rows before row==idx

See: http://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html

查看更多
我只想做你的唯一
6楼-- · 2019-01-09 12:47

As of openpyxl 1.5 you can now use .insert_rows(idx, row_qty)

from openpyxl import load_workbook
wb = load_workbook('excel_template.xlsx')
ws = wb.active
ws.insert_rows(14, 10)

It will not pick up the formatting of the idx row as it would if you did this manually in Excel. you will have apply the correct formatting i.e. cell color afterwards.

查看更多
forever°为你锁心
7楼-- · 2019-01-09 12:48

Unfortunately there isn't really a better way to do in that read in the file, and use a library like xlwt to write out a new excel file (with your new row inserted at the top). Excel doesn't work like a database that you can read and and append to. You unfortunately just have to read in the information and manipulate in memory and write out to what is essentially a new file.

查看更多
登录 后发表回答