I'm trying to use download.file to get some webpages, including embedded images etc. I think the wget equivalent is the -p -k options, but I can't see how to do this...
If I do:
download.file("http://guardian.co.uk","test.html")
That obviously works, but I get this error:
Warning messages:
1: running command 'wget -p -k "http://guardian.co.uk" -O "test.html"' had status 1
2: In download.file("http://guardian.co.uk", "test.html", method = "wget", :
download had nonzero exit status
When I run:
download.file("http://guardian.co.uk", "test.html", method = "wget", extra = "-p -k") # no recursion (-r), but get page requisites (-p) and convert links for local viewing (-k)
I've checked Sys.which("wget") and the path is set (and I'm not trying to access https, which I think can cause issues).
Once this works, I actually want to put it into a loop that downloads a set of URLs (and their embedded content) to create a single HTML output...
Easy solution: just use system to call wget directly. I think the issue is that passing an output file ('test.html') means the -O option is specified, so you can't also invoke -p -k, whereas calling wget directly means it saves the files separately.
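For the loop over several URLs, something along these lines might work. It is only a sketch: wget_page_cmd and fetch_pages are made-up helper names, the "pages" directory is an assumption, and -P (wget's directory-prefix option) is added here just so the downloads land in one folder. It also assumes a Unix-like shell with wget on the PATH.

```r
# Build the wget command for one URL (hypothetical helper, not part of base R).
# -p fetches page requisites, -k converts links for local viewing,
# -P sets the directory that wget saves into (an assumption, not in the question).
wget_page_cmd <- function(url, dest_dir = "pages") {
  paste("wget -p -k -P", shQuote(dest_dir), shQuote(url))
}

# Call wget once per URL and return the exit status for each.
# With dry_run = TRUE the commands are only printed and nothing is downloaded.
fetch_pages <- function(urls, dest_dir = "pages", dry_run = FALSE) {
  vapply(urls, function(u) {
    cmd <- wget_page_cmd(u, dest_dir)
    if (dry_run) {
      message(cmd)
      return(0L)
    }
    system(cmd)  # no -O here, so wget saves each file separately
  }, integer(1))
}

# Print the command first; set dry_run = FALSE to actually download.
statuses <- fetch_pages("http://guardian.co.uk", dry_run = TRUE)
```

A non-zero entry in statuses means wget reported a problem for that URL. Stitching the saved pages into a single HTML output would still be a separate step.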