I currently have this code. It works perfectly.
It loops through excel files in a folder, removes the first 2 rows, then saves them as individual excel files, and it also saves the files in the loop as an appended file.
Currently the appended file overwrites the existing file each time I run the code.
I need to append the new data to the bottom of the already existing excel sheet ('master_data.xlsx)
dfList = []
path = 'C:\\Test\\TestRawFile'
newpath = 'C:\\Path\\To\\New\\Folder'
for fn in os.listdir(path):
# Absolute file path
file = os.path.join(path, fn)
if os.path.isfile(file):
# Import the excel file and call it xlsx_file
xlsx_file = pd.ExcelFile(file)
# View the excel files sheet names
xlsx_file.sheet_names
# Load the xlsx files Data sheet as a dataframe
df = xlsx_file.parse('Sheet1',header= None)
df_NoHeader = df[2:]
data = df_NoHeader
# Save individual dataframe
data.to_excel(os.path.join(newpath, fn))
dfList.append(data)
appended_data = pd.concat(dfList)
appended_data.to_excel(os.path.join(newpath, 'master_data.xlsx'))
I thought this would be a simple task, but I guess not. I think I need to bring in the master_data.xlsx file as a dataframe, then match the index up with the new appended data, and save it back out. Or maybe there is an easier way. Any Help is appreciated.
If you aren't strictly looking for an excel file, then get the output as csv file and just copy the csv to a new excel file
df.to_csv('filepath', mode='a', index = False, header=None)
mode = 'a'
a means append
This is a roundabout way but works neat!
A helper function for appending DataFrame to existing Excel file:
Usage examples...
Old answer: it allows you to write a several DataFrames to a new Excel file.
You can use
openpyxl
engine in conjunction withstartrow
parameter:c:/temp/test.xlsx:
PS you may also want to specify
header=None
if you don't want to duplicate column names...UPDATE: you may also want to check this solution