Read all worksheets in an Excel workbook into an R

2020-01-23 10:18发布

I understand that XLConnect can be used to read an Excel worksheet into R. For example, this would read the first worksheet in a workbook called test.xls into R.

library(XLConnect)
readWorksheetFromFile('test.xls', sheet = 1)

I have an Excel Workbook with multiple worksheets.

How can all worksheets in a workbook be imported into a list in R where each element of the list is a data.frame for a given sheet, and where the name of each element corresponds to the name of the worksheet in Excel?

10条回答
爷、活的狠高调
2楼-- · 2020-01-23 10:56

From official readxl (tidyverse) documentation (changing first line):

path <- "data/datasets.xlsx"

path %>% 
  excel_sheets() %>% 
  set_names() %>% 
  map(read_excel, path = path)

Details at: http://readxl.tidyverse.org/articles/articles/readxl-workflows.html#iterate-over-multiple-worksheets-in-a-workbook

查看更多
我只想做你的唯一
3楼-- · 2020-01-23 10:56

excel.link will do the job.

I actually found it easier to use compared to XLConnect (not that either package is that difficult to use). Learning curve for both was about 5 minutes.

As an aside, you can easily find all R packages that mention the word "Excel" by browsing to http://cran.r-project.org/web/packages/available_packages_by_name.html

查看更多
ら.Afraid
4楼-- · 2020-01-23 11:00

I tried the above and had issues with the amount of data that my 20MB Excel I needed to convert consisted of; therefore the above did not work for me.

After more research I stumbled upon openxlsx and this one finally did the trick (and fast) Importing a big xlsx file into R?

https://cran.r-project.org/web/packages/openxlsx/openxlsx.pdf

查看更多
萌系小妹纸
5楼-- · 2020-01-23 11:01

Updated answer using readxl (22nd June 2015)

Since posting this question the readxl package has been released. It supports both xls and xlsx format. Importantly, in contrast to other excel import packages, it works on Windows, Mac, and Linux without requiring installation of additional software.

So a function for importing all sheets in an Excel workbook would be:

library(readxl)    
read_excel_allsheets <- function(filename, tibble = FALSE) {
    # I prefer straight data.frames
    # but if you like tidyverse tibbles (the default with read_excel)
    # then just pass tibble = TRUE
    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X))
    if(!tibble) x <- lapply(x, as.data.frame)
    names(x) <- sheets
    x
}

This could be called with:

mysheets <- read_excel_allsheets("foo.xls")

Old Answer

Building on the answer provided by @mnel, here is a simple function that takes an Excel file as an argument and returns each sheet as a data.frame in a named list.

library(XLConnect)

importWorksheets <- function(filename) {
    # filename: name of Excel file
    workbook <- loadWorkbook(filename)
    sheet_names <- getSheets(workbook)
    names(sheet_names) <- sheet_names
    sheet_list <- lapply(sheet_names, function(.sheet){
        readWorksheet(object=workbook, .sheet)})
}

Thus, it could be called with:

importWorksheets('test.xls')
查看更多
Summer. ? 凉城
6楼-- · 2020-01-23 11:01

Adding to Paul's answer. The sheets can also be concatenated using something like this:

data = path %>% 
excel_sheets() %>% 
set_names() %>% 
map_df(~ read_excel(path = path, sheet = .x), .id = "Sheet")

Libraries needed:

if(!require(pacman))install.packages("pacman")
pacman::p_load("tidyverse","readxl","purrr")
查看更多
可以哭但决不认输i
7楼-- · 2020-01-23 11:06

Note that most of XLConnect's functions are already vectorized. This means that you can read in all worksheets with one function call without having to do explicit vectorization:

require(XLConnect)
wb <- loadWorkbook(system.file("demoFiles/mtcars.xlsx", package = "XLConnect"))
lst = readWorksheet(wb, sheet = getSheets(wb))

With XLConnect 0.2-0 lst will already be a named list.

查看更多
登录 后发表回答