I would like to read online data into R using download.file(), as shown below.
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
download.file(URL, destfile = "./data/data.csv", method="curl")
Someone suggested that I add the line setInternet2(TRUE), but it still doesn't work.
The error I get is:
Warning messages:
1: running command 'curl "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv" -o "./data/data.csv"' had status 127
2: In download.file(URL, destfile = "./data/data.csv", method = "curl", :
download had nonzero exit status
Appreciate your help.
It might be easiest to try the RCurl package. Install the package and try the following:
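A minimal sketch of this approach, assuming RCurl is already installed (for example via install.packages("RCurl")). The helper name read_remote_csv() is hypothetical and only used for illustration; getURL() is the RCurl function that does the fetching.

```r
# Assumes RCurl has been installed, e.g. with install.packages("RCurl")
library(RCurl)

# read_remote_csv() is a hypothetical helper name, not part of RCurl
read_remote_csv <- function(url) {
  # getURL() fetches the response body as a single character string
  txt <- getURL(url)
  # Parse the downloaded text as CSV
  read.csv(text = txt, stringsAsFactors = FALSE)
}
```

With the URL from the question, out <- read_remote_csv(URL) followed by head(out[1:6]) should show the first few columns if the download succeeds.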
If getURL() gives you an SSL error when using RCurl, set these options before calling getURL(). This sets the CurlSSL settings globally.
The extended code:
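A sketch of what setting the SSL options globally can look like, under the assumption that your RCurl installation ships a CA bundle at CurlSSL/cacert.pem. The helper name fetch_csv() is hypothetical.

```r
library(RCurl)

# Assumption: RCurl ships a CA certificate bundle under CurlSSL/cacert.pem
ca_bundle <- system.file("CurlSSL", "cacert.pem", package = "RCurl")

# Set it globally so that later getURL() calls verify against this bundle
options(RCurlOptions = list(cainfo = ca_bundle))

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"

# fetch_csv() is a hypothetical helper: download the text, then parse it as CSV
fetch_csv <- function(url = URL) {
  read.csv(text = getURL(url), stringsAsFactors = FALSE)
}
```

Calling out <- fetch_csv() then performs the download with the options above in effect. If system.file() returns an empty string, the bundle is not where this sketch assumes and cainfo needs to point somewhere else.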
Worked for me on Windows 7 64-bit using R 3.1.0!
I had exactly the same problem as UseR (original question), and I'm also using Windows 7. I tried all the proposed solutions and they didn't work.
I resolved the problem as follows:
Using RStudio instead of the R console.
Updating R (from 3.1.0 to 3.1.1) so that the RCurl library runs properly on it. (I'm now using R 3.1.1 32-bit, although my system is 64-bit.)
Typing the URL as https (secure connection) and with forward slashes (/) instead of backslashes (\\).
Setting method = "auto".
It works for me now. You should see the message:
Here's an update as of Nov 2014. I find that setting method='curl' did the trick for me (while method='auto' does not). For example:
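A minimal sketch, assuming the curl executable is installed and on the PATH. The wrapper name fetch_with_curl() is hypothetical; the URL and destination path are the ones from the question.

```r
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"

# fetch_with_curl() is a hypothetical wrapper around download.file()
fetch_with_curl <- function(url, destfile) {
  # Create the target directory if needed, otherwise the file cannot be written
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  # method = "curl" shells out to the curl executable
  status <- download.file(url, destfile = destfile, method = "curl")
  # download.file() returns 0 (invisibly) on success
  status == 0
}
```

For the file in the question this would be called as fetch_with_curl(URL, "./data/data.csv").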
Exit status 127 means "command not found". In your case, the curl command was not found, which means curl is either not installed or not on your PATH.
You need to install or reinstall curl. That's all. Get the latest version for your OS from http://curl.haxx.se/download.html
Close RStudio before installation.
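To check from within R whether the curl executable can be found, something like the following should work. It only inspects the PATH and does not download anything.

```r
# Sys.which() returns the full path of an executable, or "" if it is not found
curl_path <- Sys.which("curl")

if (nzchar(curl_path)) {
  message("curl found at: ", curl_path)
} else {
  message("curl is not on the PATH, so method = \"curl\" cannot run it")
}
```

If the second message appears after installing curl, restarting R (or RStudio) so that it picks up the updated PATH is worth trying.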
I'm offering the curl package as an alternative that I found to be reliable when extracting large files from an online database. In a recent project, I had to download 120 files from an online database and found that it halved the transfer times and was much more reliable than download.file.
In this case, rough timing on your URL showed no consistent difference in transfer times. In my application, using curl_download in a script to select and download 120 files from a website decreased my transfer times from 2000 seconds per file to 1000 seconds and improved reliability from 50% to 2 failures in 120 files. The script is posted in my answer to a question I asked earlier.
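A minimal sketch of the curl_download() call, assuming the curl package is installed (install.packages("curl")). This is not the script mentioned above; the helper name fetch_file() is hypothetical.

```r
# Assumes the curl package has been installed, e.g. install.packages("curl")
library(curl)

URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"

# fetch_file() is a hypothetical helper around curl::curl_download()
fetch_file <- function(url, destfile) {
  # Create the target directory if it does not exist yet
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  # curl_download() uses libcurl directly, so no curl executable is needed
  curl_download(url, destfile, quiet = TRUE)
  # Report whether the file is now on disk
  file.exists(destfile)
}
```

For the question's file this would be fetch_file(URL, "./data/data.csv"), after which read.csv("./data/data.csv") reads the data.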