I am automating some web scraping with R in cron, and sometimes I use R CMD BATCH and sometimes I use Rscript.
To decide which one to use, I mainly consider whether I want the .Rout file or not.
But reading the answers to some questions here on SO (like this or this), it seems that Rscript is preferred to R CMD BATCH.
So my questions are:
- Besides the fact that the syntax is a little different and R CMD BATCH saves an .Rout file while Rscript does not, what are the main differences between the two of them?
- When should I prefer one over the other? More specifically, in the cron job mentioned above, is one of them preferred?
- I have not used littler yet; how is it different from both Rscript and R CMD BATCH?
R CMD BATCH is all we had years ago. It makes I/O very hard and leaves files behind.
Things got better, first with littler and then also with Rscript. Both can be used for 'shebang' lines such as
    #!/usr/bin/r
    #!/usr/bin/Rscript
and both can be used with packages like getopt and optparse, allowing you to write proper R scripts that can act as commands. I have dozens of them, starting with simple ones like this, which I can call as install.r pkga pkgb pkgc and which will install all three (and their dependencies) for me from the command line without hogging the R prompt:
    #!/usr/bin/env r
    #
    # a simple example to install one or more packages

    # littler supplies the command-line arguments in 'argv'
    if (is.null(argv) || length(argv) < 1) {
        cat("Usage: install.r pkg1 [pkg2 pkg3 ...]\n")
        q()
    }

    ## adjust as necessary, see help('download.packages')
    repos <- "http://cran.rstudio.com"

    ## this makes sense on Debian where no packages touch /usr/local
    lib.loc <- "/usr/local/lib/R/site-library"

    install.packages(argv, lib = lib.loc, repos = repos)
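For comparison, a rough Rscript version of the same idea might look like the sketch below. It uses base R only: littler hands you the arguments in argv, whereas under Rscript they come from commandArgs(trailingOnly = TRUE). The main() wrapper and the install.R file name are illustrative choices, not part of the script above.

```r
#!/usr/bin/env Rscript
#
# Sketch: Rscript counterpart of the littler install script (illustrative only)

## same settings as the littler version; adjust as necessary
repos   <- "http://cran.rstudio.com"
lib.loc <- "/usr/local/lib/R/site-library"

## main() is a hypothetical helper: it returns 1 after printing the usage
## message and 0 after calling install.packages()
main <- function(args) {
    if (length(args) < 1) {
        cat("Usage: install.R pkg1 [pkg2 pkg3 ...]\n")
        return(invisible(1L))
    }
    install.packages(args, lib = lib.loc, repos = repos)
    invisible(0L)
}

## Rscript has no 'argv'; trailingOnly = TRUE keeps only the user's arguments
status <- main(commandArgs(trailingOnly = TRUE))
```

Wrapping the logic in a function keeps the script usable both as a command and from an interactive session via source().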
And just like Karl, I have cronjobs calling similar R scripts.
Edit on 2015-11-04: As of last week, littler is now also on CRAN.
From what I understand:
R CMD BATCH:
- echoes the input statements
- does not write to stdout (output goes to the .Rout file)
Rscript:
- does NOT echo
- writes output to stdout
- can be used as a one-liner (i.e. with no input file)
littler:
- does all that Rscript does
- can read commands from stdin (useful in pipelines)
- has a faster startup time
- loads the methods package
In practice I use Rscript to run scripts, both from the command line and in cron jobs.
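To make the points above a bit more concrete, here is a small sketch of a cron-friendly script. The file name scrape_job.R, the log path, and the log_line() helper are all made up for illustration. The comments show how each front end would typically be invoked, as far as I understand them.

```r
# scrape_job.R (hypothetical file name, for illustration only)
#
# Possible invocations from a shell or a crontab line:
#   R CMD BATCH scrape_job.R                      # code and output end up in scrape_job.Rout
#   Rscript scrape_job.R >> scrape.log 2>&1       # you decide where stdout/stderr go
#   Rscript -e 'cat(format(Sys.time()), "\n")'    # one-liner, no input file
#   cat scrape_job.R | r                          # littler reading commands from stdin
#
# Hypothetical crontab entry (paths are placeholders):
#   0 6 * * * Rscript /path/to/scrape_job.R >> /path/to/scrape.log 2>&1

# Build one timestamped log line; kept as a function so it is easy to test
log_line <- function(msg, when = Sys.time()) {
    sprintf("[%s] %s", format(when, "%Y-%m-%d %H:%M:%S"), msg)
}

cat(log_line("scrape started"), "\n", sep = "")   # written to stdout
message(log_line("diagnostic message"))           # written to stderr
```

With Rscript the two streams can be redirected separately, which is handy in cron; with R CMD BATCH both are collected in the .Rout file.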