I'm trying to automate the login of the UK's data archive service. That website is obviously trustworthy. Unfortunately, both RCurl and httr break at SSL verification, while my web browser doesn't give any sort of warning. I can work around the issue by using ssl.verifypeer = FALSE in RCurl, but I'd like to understand what's going on.
# breaks
library(httr)
GET( "https://www.esds.ac.uk/secure/UKDSRegister_start.asp" )
# breaks
library(RCurl)
cert <- system.file("CurlSSL/cacert.pem", package = "RCurl")
getURL("https://www.esds.ac.uk/secure/UKDSRegister_start.asp",cainfo = cert)
# works
library(RCurl)
getURL(
"https://www.esds.ac.uk/secure/UKDSRegister_start.asp" ,
.opts = list(ssl.verifypeer = FALSE)
) # note: use list(ssl.verifypeer = FALSE,followlocation=TRUE) to see content
TL;DR
Get the TERENA SSL CA PEM file from TERENA's repository of trusted certificates and use this file as your cainfo parameter.

EDIT: You might need to add two lines to the beginning of that file. The code works for me using the following TERENA.pem file:

Why?
The GET method of httr uses RCurl::curlPerform internally, as does RCurl::getURL, so the observed behavior is not surprising. The curl command-line tool with the "verbose" switch -v gives the following additional hints:

The link in the above error message contains, at enumeration item 3, instructions on obtaining the server's certificate:
To me, this reads as if the certificate is not trusted. A quick search for "terena ssl root certificate" found this website of the University of Helsinki which reads:
This site also contains a link to the certificate repository.
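For reference, the TL;DR fix might look roughly like the sketch below. It assumes the TERENA PEM file has already been downloaded and saved locally as TERENA.pem; the helper names ca_opts and fetch_verified are made up for illustration and are not part of RCurl. The point is that verification stays switched on, and only the set of trusted CAs changes.

```r
# Hypothetical helper: build curl options that keep peer verification ON
# but trust the CA bundle stored at `pem_path`.
ca_opts <- function(pem_path) {
  if (!file.exists(pem_path)) {
    stop("CA bundle not found: ", pem_path)
  }
  list(
    cainfo = normalizePath(pem_path),  # CA file used to verify the server
    ssl.verifypeer = TRUE,             # do not disable verification
    followlocation = TRUE              # follow redirects to see the content
  )
}

# Hypothetical wrapper around RCurl::getURL using those options.
fetch_verified <- function(url, pem_path, verbose = FALSE) {
  opts <- ca_opts(pem_path)
  opts$verbose <- verbose  # TRUE prints connection details, similar to `curl -v`
  RCurl::getURL(url, .opts = opts)
}

# Usage (not run; assumes TERENA.pem sits in the working directory):
# fetch_verified(
#   "https://www.esds.ac.uk/secure/UKDSRegister_start.asp",
#   "TERENA.pem"
# )
```

If the call still fails with verbose = TRUE, the printed handshake details should show which certificate in the chain is being rejected.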