-->

Elegant way to check for missing packages and inst

2020-01-22 13:01发布

问题:

I seem to be sharing a lot of code with coauthors these days. Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.

Is there an elegant way to call installed.packages(), compare that to the ones I am loading and install if missing?

回答1:

Yes. If you have your list of packages, compare it to the output from installed.packages()[,"Package"] and install the missing packages. Something like this:

list.of.packages <- c("ggplot2", "Rcpp")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)

Otherwise:

If you put your code in a package and make them dependencies, then they will automatically be installed when you install your package.



回答2:

Dason K. and I have the pacman package that can do this nicely. The function p_load in the package does this. The first line is just to ensure that pacman is installed.

if (!require("pacman")) install.packages("pacman")
pacman::p_load(package1, package2, package_n)


回答3:

You can just use the return value of require:

if(!require(somepackage)){
    install.packages("somepackage")
    library(somepackage)
}

I use library after the install because it will throw an exception if the install wasn't successful or the package can't be loaded for some other reason. You make this more robust and reuseable:

dynamic_require <- function(package){
  if(eval(parse(text=paste("require(",package,")")))) return True

  install.packages(package)
  return eval(parse(text=paste("require(",package,")")))
}

The downside to this method is that you have to pass the package name in quotes, which you don't do for the real require.



回答4:

This solution will take a character vector of package names and attempt to load them, or install them if loading fails. It relies on the return behaviour of require to do this because...

require returns (invisibly) a logical indicating whether the required package is available

Therefore we can simply see if we were able to load the required package and if not, install it with dependencies. So given a character vector of packages you wish to load...

foo <- function(x){
  for( i in x ){
    #  require returns TRUE invisibly if it was able to load package
    if( ! require( i , character.only = TRUE ) ){
      #  If package was not able to be loaded then re-install
      install.packages( i , dependencies = TRUE )
      #  Load package after installing
      require( i , character.only = TRUE )
    }
  }
}

#  Then try/install packages...
foo( c("ggplot2" , "reshape2" , "data.table" ) )


回答5:

if (!require('ggplot2')) install.packages('ggplot2'); library('ggplot2')

"ggplot2" is the package. It checks to see if the package is installed, if it is not it installs it. It then loads the package regardless of which branch it took.



回答6:

A lot of the answers above (and on duplicates of this question) rely on installed.packages which is bad form. From the documentation:

This can be slow when thousands of packages are installed, so do not use this to find out if a named package is installed (use system.file or find.package) nor to find out if a package is usable (call require and check the return value) nor to find details of a small number of packages (use packageDescription). It needs to read several files per installed package, which will be slow on Windows and on some network-mounted file systems.

So, a better approach is to attempt to load the package using require and and install if loading fails (require will return FALSE if it isn't found). I prefer this implementation:

using<-function(...) {
    libs<-unlist(list(...))
    req<-unlist(lapply(libs,require,character.only=TRUE))
    need<-libs[req==FALSE]
    if(length(need)>0){ 
        install.packages(need)
        lapply(need,require,character.only=TRUE)
    }
}

which can be used like this:

using("RCurl","ggplot2","jsonlite","magrittr")

This way it loads all the packages, then goes back and installs all the missing packages (which if you want, is a handy place to insert a prompt to ask if the user wants to install packages). Instead of calling install.packages separately for each package it passes the whole vector of uninstalled packages just once.

Here's the same function but with a windows dialog that asks if the user wants to install the missing packages

using<-function(...) {
    libs<-unlist(list(...))
    req<-unlist(lapply(libs,require,character.only=TRUE))
    need<-libs[req==FALSE]
    n<-length(need)
    if(n>0){
        libsmsg<-if(n>2) paste(paste(need[1:(n-1)],collapse=", "),",",sep="") else need[1]
        print(libsmsg)
        if(n>1){
            libsmsg<-paste(libsmsg," and ", need[n],sep="")
        }
        libsmsg<-paste("The following packages could not be found: ",libsmsg,"\n\r\n\rInstall missing packages?",collapse="")
        if(winDialog(type = c("yesno"), libsmsg)=="YES"){       
            install.packages(need)
            lapply(need,require,character.only=TRUE)
        }
    }
}


回答7:

Although the answer of Shane is really good, for one of my project I needed to remove the ouput messages, warnings and install packages automagically. I have finally managed to get this script:

InstalledPackage <- function(package) 
{
    available <- suppressMessages(suppressWarnings(sapply(package, require, quietly = TRUE, character.only = TRUE, warn.conflicts = FALSE)))
    missing <- package[!available]
    if (length(missing) > 0) return(FALSE)
    return(TRUE)
}

CRANChoosen <- function()
{
    return(getOption("repos")["CRAN"] != "@CRAN@")
}

UsePackage <- function(package, defaultCRANmirror = "http://cran.at.r-project.org") 
{
    if(!InstalledPackage(package))
    {
        if(!CRANChoosen())
        {       
            chooseCRANmirror()
            if(!CRANChoosen())
            {
                options(repos = c(CRAN = defaultCRANmirror))
            }
        }

        suppressMessages(suppressWarnings(install.packages(package)))
        if(!InstalledPackage(package)) return(FALSE)
    }
    return(TRUE)
}

Use:

libraries <- c("ReadImages", "ggplot2")
for(library in libraries) 
{ 
    if(!UsePackage(library))
    {
        stop("Error!", library)
    }
}


回答8:

# List of packages for session
.packages = c("ggplot2", "plyr", "rms")

# Install CRAN packages (if not already installed)
.inst <- .packages %in% installed.packages()
if(length(.packages[!.inst]) > 0) install.packages(.packages[!.inst])

# Load packages into session 
lapply(.packages, require, character.only=TRUE)


回答9:

This is the purpose of the rbundler package: to provide a way to control the packages that are installed for a specific project. Right now the package works with the devtools functionality to install packages to your project's directory. The functionality is similar to Ruby's bundler.

If your project is a package (recommended) then all you have to do is load rbundler and bundle the packages. The bundle function will look at your package's DESCRIPTION file to determine which packages to bundle.

library(rbundler)
bundle('.', repos="http://cran.us.r-project.org")

Now the packages will be installed in the .Rbundle directory.

If your project isn't a package, then you can fake it by creating a DESCRIPTION file in your project's root directory with a Depends field that lists the packages that you want installed (with optional version information):

Depends: ggplot2 (>= 0.9.2), arm, glmnet

Here's the github repo for the project if you're interested in contributing: rbundler.



回答10:

Sure.

You need to compare 'installed packages' with 'desired packages'. That's very close to what I do with CRANberries as I need to compare 'stored known packages' with 'currently known packages' to determine new and/or updated packages.

So do something like

AP <- available.packages(contrib.url(repos[i,"url"]))   # available t repos[i]

to get all known packages, simular call for currently installed packages and compare that to a given set of target packages.



回答11:

The following simple function works like a charm:

  usePackage<-function(p){
      # load a package if installed, else load after installation.
      # Args:
      #   p: package name in quotes

      if (!is.element(p, installed.packages()[,1])){
        print(paste('Package:',p,'Not found, Installing Now...'))
        install.packages(p, dep = TRUE)}
      print(paste('Loading Package :',p))
      require(p, character.only = TRUE)  
    }

(not mine, found this on the web some time back and had been using it since then. not sure of the original source)



回答12:

I use following function to install package if require("<package>") exits with package not found error. It will query both - CRAN and Bioconductor repositories for missing package.

Adapted from the original work by Joshua Wiley, http://r.789695.n4.nabble.com/Install-package-automatically-if-not-there-td2267532.html

install.packages.auto <- function(x) { 
  x <- as.character(substitute(x)) 
  if(isTRUE(x %in% .packages(all.available=TRUE))) { 
    eval(parse(text = sprintf("require(\"%s\")", x)))
  } else { 
    #update.packages(ask= FALSE) #update installed packages.
    eval(parse(text = sprintf("install.packages(\"%s\", dependencies = TRUE)", x)))
  }
  if(isTRUE(x %in% .packages(all.available=TRUE))) { 
    eval(parse(text = sprintf("require(\"%s\")", x)))
  } else {
    source("http://bioconductor.org/biocLite.R")
    #biocLite(character(), ask=FALSE) #update installed packages.
    eval(parse(text = sprintf("biocLite(\"%s\")", x)))
    eval(parse(text = sprintf("require(\"%s\")", x)))
  }
}

Example:

install.packages.auto(qvalue) # from bioconductor
install.packages.auto(rNMF) # from CRAN

PS: update.packages(ask = FALSE) & biocLite(character(), ask=FALSE) will update all installed packages on the system. This can take a long time and consider it as a full R upgrade which may not be warranted all the time!



回答13:

You can simply use the setdiff function to get the packages that aren't installed and then install them. In the sample below, we check if the ggplot2 and Rcpp packages are installed before installing them.

unavailable <- setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages()))
install.packages(unavailable)

In one line, the above can be written as:

install.packages(setdiff(c("ggplot2", "Rcpp"), rownames(installed.packages())))


回答14:

Use packrat so that the shared libraries are exactly the same and not changing other's environment.

In terms of elegance and best practice I think you're fundamentally going about it the wrong way. The package packrat was designed for these issues. It is developed by RStudio by Hadley Wickham. Instead of them having to install dependencies and possibly mess up someone's environment system, packrat uses its own directory and installs all the dependencies for your programs in there and doesn't touch someone's environment.

Packrat is a dependency management system for R.

R package dependencies can be frustrating. Have you ever had to use trial-and-error to figure out what R packages you need to install to make someone else’s code work–and then been left with those packages globally installed forever, because now you’re not sure whether you need them? Have you ever updated a package to get code in one of your projects to work, only to find that the updated package makes code in another project stop working?

We built packrat to solve these problems. Use packrat to make your R projects more:

  • Isolated: Installing a new or updated package for one project won’t break your other projects, and vice versa. That’s because packrat gives each project its own private package library.
  • Portable: Easily transport your projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
  • Reproducible: Packrat records the exact package versions you depend on, and ensures those exact versions are the ones that get installed wherever you go.

https://rstudio.github.io/packrat/



回答15:

I have implemented the function to install and load required R packages silently. Hope might help. Here is the code:

# Function to Install and Load R Packages
Install_And_Load <- function(Required_Packages)
{
    Remaining_Packages <- Required_Packages[!(Required_Packages %in% installed.packages()[,"Package"])];

    if(length(Remaining_Packages)) 
    {
        install.packages(Remaining_Packages);
    }
    for(package_name in Required_Packages)
    {
        library(package_name,character.only=TRUE,quietly=TRUE);
    }
}

# Specify the list of required packages to be installed and load    
Required_Packages=c("ggplot2", "Rcpp");

# Call the Function
Install_And_Load(Required_Packages);


回答16:

The coming version of RStudio (1.2), already available as a preview, will include a feature to detect missing packages in library() and require() calls, and prompt the user to install them:

Detect missing R packages

Many R scripts open with calls to library() and require() to load the packages they need in order to execute. If you open an R script that references packages that you don’t have installed, RStudio will now offer to install all the needed packages in a single click. No more typing install.packages() repeatedly until the errors go away!
https://blog.rstudio.com/2018/11/19/rstudio-1-2-preview-the-little-things/

This seems to address the original concern of OP particularly well:

Many of them are novice/intermediate R users and don't realize that they have to install packages they don't already have.



回答17:

Regarding your main objective " to install libraries they don't already have. " and regardless of using " instllaed.packages() ". The following function mask the original function of require. It tries to load and check the named package "x" , if it's not installed, install it directly including dependencies; and lastly load it normaly. you rename the function name from 'require' to 'library' to maintain integrity . The only limitation is packages names should be quoted.

require <- function(x) { 
  if (!base::require(x, character.only = TRUE)) {
  install.packages(x, dep = TRUE) ; 
  base::require(x, character.only = TRUE)
  } 
}

So you can load and installed package the old fashion way of R. require ("ggplot2") require ("Rcpp")



回答18:

Quite basic one.

pkgs = c("pacman","data.table")
if(length(new.pkgs <- setdiff(pkgs, rownames(installed.packages())))) install.packages(new.pkgs)


回答19:

Thought I'd contribute the one I use:

testin <- function(package){if (!package %in% installed.packages())    
install.packages(package)}
testin("packagename")


回答20:

Using lapply family and anonymous function approach you may:

  1. Try to attach all listed packages.
  2. Install missing only (using || lazy evaluation).
  3. Attempt to attach again those were missing in step 1 and installed in step 2.
  4. Print each package final load status (TRUE / FALSE).

    req <- substitute(require(x, character.only = TRUE))
    lbs <- c("plyr", "psych", "tm")
    sapply(lbs, function(x) eval(req) || {install.packages(x); eval(req)})
    
    plyr psych    tm 
    TRUE  TRUE  TRUE 
    


回答21:

I use the following which will check if package is installed and if dependencies are updated, then loads the package.

p<-c('ggplot2','Rcpp')
install_package<-function(pack)
{if(!(pack %in% row.names(installed.packages())))
{
  update.packages(ask=F)
  install.packages(pack,dependencies=T)
}
 require(pack,character.only=TRUE)
}
for(pack in p) {install_package(pack)}

completeFun <- function(data, desiredCols) {
  completeVec <- complete.cases(data[, desiredCols])
  return(data[completeVec, ])
}


回答22:

Here's my code for it:

packages <- c("dplyr", "gridBase", "gridExtra")
package_loader <- function(x){
    for (i in 1:length(x)){
        if (!identical((x[i], installed.packages()[x[i],1])){
            install.packages(x[i], dep = TRUE)
        } else {
            require(x[i], character.only = TRUE)
        }
    }
}
package_loader(packages)


回答23:

 48 lapply_install_and_load <- function (package1, ...)
 49 {
 50     #
 51     # convert arguments to vector
 52     #
 53     packages <- c(package1, ...)
 54     #
 55     # check if loaded and installed
 56     #
 57     loaded        <- packages %in% (.packages())
 58     names(loaded) <- packages
 59     #
 60     installed        <- packages %in% rownames(installed.packages())
 61     names(installed) <- packages
 62     #
 63     # start loop to determine if each package is installed
 64     #
 65     load_it <- function (p, loaded, installed)
 66     {
 67         if (loaded[p])
 68         {
 69             print(paste(p, "loaded"))
 70         }
 71         else
 72         {
 73             print(paste(p, "not loaded"))
 74             if (installed[p])
 75             {
 76                 print(paste(p, "installed"))
 77                 do.call("library", list(p))
 78             }
 79             else
 80             {
 81                 print(paste(p, "not installed"))
 82                 install.packages(p)
 83                 do.call("library", list(p))
 84             }
 85         }
 86     }
 87     #
 88     lapply(packages, load_it, loaded, installed)
 89 }


回答24:

library <- function(x){
  x = toString(substitute(x))
if(!require(x,character.only=TRUE)){
  install.packages(x)
  base::library(x,character.only=TRUE)
}}

This works with unquoted package names and is fairly elegant (cf. GeoObserver's answer)



回答25:

source("https://bioconductor.org/biocLite.R")
if (!require("ggsci")) biocLite("ggsci")


回答26:

In my case, I wanted a one liner that I could run from the commandline (actually via a Makefile). Here is an example installing "VGAM" and "feather" if they are not already installed:

R -e 'for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")'

From within R it would just be:

for (p in c("VGAM", "feather")) if (!require(p, character.only=TRUE)) install.packages(p, repos="http://cran.us.r-project.org")

There is nothing here beyond the previous solutions except that:

  • I keep it to a single line
  • I hard code the repos parameter (to avoid any popups asking about the mirror to use)
  • I don't bother to define a function to be used elsewhere

Also note the important character.only=TRUE (without it, the require would try to load the package p).



回答27:

  packages_installed <- function(pkg_list){
        pkgs <- unlist(pkg_list)
        req <- unlist(lapply(pkgs, require, character.only = TRUE))
        not_installed <- pkgs[req == FALSE]
        lapply(not_installed, install.packages, 
               repos = "http://cran.r-project.org")# add lib.loc if needed
        lapply(pkgs, library, character.only = TRUE)
}


标签: r packages r-faq