Why (or when) is Rscript (or littler) better than R CMD BATCH?

Posted 2019-01-09 00:18

Question:

I am automating some webscraping with R in cron and sometimes I use R CMD BATCH and sometimes I use Rscript.

To decide which one to use, I mainly consider whether I want the .Rout file or not.

But reading the answers to some questions here in SO (like this or this) it seems that Rscript is preferred to R CMD BATCH.

So my questions are:

  • Besides the fact that the syntax is a little different and R CMD BATCH saves an .Rout file while Rscript does not, what are the main differences between the two of them?

  • When should I prefer one over the other? More specifically, in the cron job mentioned above, is one of them preferred?

  • I have not used littler yet; how is it different from both Rscript and R CMD BATCH?

Answer 1:

R CMD BATCH is all we had years ago. It makes I/O very hard and leaves files behind.

Things got better, first with littler and then also with Rscript. Both can be used for 'shebang' lines such as

 #!/usr/bin/r

 #!/usr/bin/Rscript

and both can be used with packages like getopt and optparse, allowing you to write proper R scripts that can act as commands. I have dozens of them, starting with simple ones like this, which I can call as install.r pkga pkgb pkgc and which will install all three (and their dependencies) for me from the command line without hogging the R prompt:

#!/usr/bin/env r
#
# a simple example to install one or more packages

# littler supplies the command-line arguments in the vector argv
if (is.null(argv) || length(argv) < 1) {
  cat("Usage: install.r pkg1 [pkg2 pkg3 ...]\n")
  q()
}

## adjust as necessary, see help('download.packages')
repos <- "http://cran.rstudio.com"

## this makes sense on Debian where no packages touch /usr/local
lib.loc <- "/usr/local/lib/R/site-library"

## install the requested packages into lib.loc
install.packages(argv, lib = lib.loc, repos = repos)
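
For comparison, here is a rough sketch of how the same script might look under Rscript. Rscript does not define argv, so the arguments are fetched with commandArgs(trailingOnly = TRUE) instead. The function name install_from_cli and the file name install.R are made up for this illustration; the repository and library path are simply copied from the littler script above.

```r
#!/usr/bin/env Rscript
#
# Sketch of an Rscript counterpart to the littler script above.
# install_from_cli() is a hypothetical helper name, not part of any package.

install_from_cli <- function(pkgs,
                             lib = "/usr/local/lib/R/site-library",
                             repos = "http://cran.rstudio.com") {
  # No package names given: print a usage line and do nothing else
  if (length(pkgs) < 1) {
    cat("Usage: install.R pkg1 [pkg2 pkg3 ...]\n")
    return(invisible(FALSE))
  }
  install.packages(pkgs, lib = lib, repos = repos)
  invisible(TRUE)
}

# trailingOnly = TRUE keeps only the arguments that follow the script name
install_from_cli(commandArgs(trailingOnly = TRUE))
```

Saved as an executable install.R, this would be called the same way as the littler version, e.g. install.R pkga pkgb pkgc.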

And just like Karl, I have cronjobs calling similar R scripts.
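
To illustrate the optparse remark above, a minimal sketch of the same installer with named options could look like the following. The option names (--repos, --lib) and the helper parse_cli are assumptions chosen for this example; make_option, OptionParser, parse_args and print_help are functions from the optparse package, which has to be installed first.

```r
#!/usr/bin/env Rscript
#
# Sketch only: option names and defaults are illustrative assumptions.
suppressPackageStartupMessages(library(optparse))

option_list <- list(
  make_option(c("-r", "--repos"), type = "character",
              default = "http://cran.rstudio.com",
              help = "repository to install from [default %default]"),
  make_option(c("-l", "--lib"), type = "character",
              default = "/usr/local/lib/R/site-library",
              help = "library directory to install into [default %default]")
)
parser <- OptionParser(usage = "%prog [options] pkg1 [pkg2 ...]",
                       option_list = option_list)

# Hypothetical helper: returns list(options = ..., args = ...), where
# args holds the positional arguments (here, the package names)
parse_cli <- function(args = commandArgs(trailingOnly = TRUE)) {
  parse_args(parser, args = args, positional_arguments = TRUE)
}

cli <- parse_cli()
if (length(cli$args) < 1) {
  print_help(parser)   # nothing to install: show the generated help text
} else {
  install.packages(cli$args, lib = cli$options$lib, repos = cli$options$repos)
}
```

optparse also adds a -h/--help flag on its own, so the script documents itself much like an ordinary command-line tool.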

Edit on 2015-11-04: As of last week, littler is now also on CRAN.



Answer 2:

From what I understand:

R CMD BATCH:

  • echoes the input statements
  • cannot write to stdout (output goes to the .Rout file)

Rscript:

  • does NOT echo the input
  • writes output to stdout
  • can be used as a one-liner (i.e. with no input file)

littler:

  • does everything Rscript does
  • can read commands from stdin (useful for pipelines)
  • has a faster startup time
  • loads the methods package

In practice I use Rscript to run scripts, from the command line or in cron jobs.
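
For a cron job, the practical side of the differences listed above is mostly where the output ends up. The small script below is a sketch to try this out; the file name stream_demo.R, the function report and the paths in the comments are placeholders invented for the example.

```r
#!/usr/bin/env Rscript
#
# stream_demo.R (hypothetical name): shows which stream each call writes to.

report <- function(n) {
  cat("scraped", n, "pages\n")                  # written to stdout
  message("finished at ", format(Sys.time()))   # written to stderr
  invisible(n)
}

report(3L)

# Ways to run it (paths are placeholders):
#   Rscript stream_demo.R
#       prints the text to the terminal; the commands are not echoed
#   Rscript stream_demo.R > out.log 2> err.log
#       stdout and stderr can be redirected separately
#   R CMD BATCH stream_demo.R
#       writes the echoed commands and all output to stream_demo.Rout
#   Possible crontab line that keeps a log, much like an .Rout file would:
#       0 6 * * * Rscript /path/to/stream_demo.R >> /path/to/demo.log 2>&1
```

With Rscript, anything not redirected in the crontab line is left for cron to handle (typically mailed to the job's owner), so redirecting to a log file is a simple way to keep a record without relying on .Rout files.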