I have to read multiple xlsx files with random names into a single data frame. The structure of each file is the same. I only need to import specific columns.
I tried this:
library(xlsx)
dat <- read.xlsx("FILE.xlsx", sheetIndex = 1,
                 sheetName = NULL, startRow = 5,
                 endRow = NULL, as.data.frame = TRUE,
                 header = TRUE)
But this reads only one file at a time, and I couldn't specify the particular columns I need. I also tried:
site=list.files(pattern='[.]xls')
but the loop I wrote after that isn't working. How can I do this? Thanks in advance.
I am more familiar with a for loop, which can be a bit more cumbersome.
filelist <- list.files(pattern = "\\.xlsx")
# list all the xlsx files from the directory
After reading the files, convert the result back to a data.frame.
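A minimal sketch of such a for loop. It assumes the readxl package rather than the xlsx package used in the question, a header on row 5 as in the question, and two hypothetical column names (`id` and `value`) standing in for the columns you actually need:

```r
library(readxl)

filelist <- list.files(pattern = "\\.xlsx$")
wanted <- c("id", "value")  # hypothetical column names, replace with your own

pieces <- vector("list", length(filelist))
for (i in seq_along(filelist)) {
  # skip = 4 makes row 5 the header row, like startRow = 5 in the question
  d <- as.data.frame(read_excel(filelist[i], sheet = 1, skip = 4))
  pieces[[i]] <- d[, wanted, drop = FALSE]  # keep only the wanted columns
}

# stack the pieces; fall back to an empty data.frame if no files were found
dat <- if (length(pieces) > 0) do.call(rbind, pieces) else data.frame()
```

Subsetting with `wanted` fails loudly if a file lacks one of the columns, which is probably what you want when every file is supposed to share the same structure.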
I would read each sheet into a list:
Get file names:
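For instance, assuming the workbooks sit in the current working directory (the name `file.list` is only illustrative):

```r
# "$" anchors the match at the end of the name, so "data.xlsx.bak" is skipped
file.list <- list.files(pattern = "\\.xlsx$", full.names = TRUE)

# drop the "~$..." lock files Excel leaves behind while a workbook is open
file.list <- file.list[!grepl("^~\\$", basename(file.list))]
```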
Read files:
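A sketch of this step, assuming the readxl package, the header on row 5 as in the question, and hypothetical column names `id` and `value`. Each data frame also gets a `file` column recording where its rows came from:

```r
library(readxl)

file.list <- list.files(pattern = "\\.xlsx$")
wanted <- c("id", "value")  # hypothetical column names, replace with your own

df.list <- lapply(file.list, function(f) {
  d <- as.data.frame(read_excel(f, sheet = 1, skip = 4))  # row 5 is the header
  d <- d[, wanted, drop = FALSE]                          # keep wanted columns
  d$file <- rep(f, nrow(d))                               # remember the source
  d
})
names(df.list) <- file.list
```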
You can then access the items in your list with:
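For example, with a toy list standing in for the files you read (the names and values are made up for illustration):

```r
# toy stand-in for a list of data frames read from two files
df.list <- list(
  "a.xlsx" = data.frame(id = 1:2, value = c(10, 20)),
  "b.xlsx" = data.frame(id = 3:4, value = c(30, 40))
)

first <- df.list[[1]]         # by position
b     <- df.list[["b.xlsx"]]  # by name
```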
Or do the same task to them with:
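A small illustration using `lapply` and `sapply` on a toy list (made-up data, with a hypothetical `double` column added to every data frame):

```r
# toy stand-in for a list of data frames read from two files
df.list <- list(
  "a.xlsx" = data.frame(id = 1:2, value = c(10, 20)),
  "b.xlsx" = data.frame(id = 3:4, value = c(30, 40))
)

# summarise every element: number of rows per file
row.counts <- sapply(df.list, nrow)

# transform every element: add a derived column to each data frame
df.list <- lapply(df.list, function(d) {
  d$double <- d$value * 2
  d
})
```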
Turn them into a data frame (where your file column now becomes useful):
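One way to do this in base R, shown on toy data that already carries a `file` column as if it had been added when each file was read:

```r
# toy stand-in: each data frame records the file it came from
df.list <- list(
  data.frame(file = "a.xlsx", id = 1:2, value = c(10, 20)),
  data.frame(file = "b.xlsx", id = 3:4, value = c(30, 40))
)

# stack all data frames into one
combined <- do.call(rbind, df.list)

# the file column lets you trace or filter rows by source
from.b <- combined[combined$file == "b.xlsx", ]
```

If the list is named by file instead, `dplyr::bind_rows(df.list, .id = "file")` stacks the data frames and creates the `file` column in one call.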
Here is a variation on Wyldsoul's answer that uses a for loop across multiple sheets (1 to j) in the same Excel file and binds them with dplyr:
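A sketch of what that could look like, assuming the readxl package for reading. The helper name `read_sheets` is hypothetical, and all sheets are assumed to share the same columns:

```r
library(readxl)
library(dplyr)

# Hypothetical helper: read sheets 1..j of one workbook and stack them.
read_sheets <- function(path, j) {
  sheets <- vector("list", j)
  for (i in seq_len(j)) {
    sheets[[i]] <- read_excel(path, sheet = i)  # sheet selected by position
  }
  # .id adds a "sheet" column holding each row's sheet index
  bind_rows(sheets, .id = "sheet")
}
```

You would call it as `dat <- read_sheets("FILE.xlsx", j = 3)`, where 3 is just an example sheet count.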