download.file in R including pre-requisites

Posted 2019-08-31 03:17

Question:

I'm trying to use download.file to fetch some web pages together with their embedded images and other page requisites. With wget I think this corresponds to the -p -k options, but I can't see how to do it from R.

If I do:

download.file("http://guardian.co.uk","test.html")

That works as expected. But when I run:

download.file("http://guardian.co.uk","test.html", method = "wget", extra = "-p -k") # no recursion (-r), but get page requisites (-p) and convert links for local viewing (-k)

I get these warnings:

Warning messages:
1: running command 'wget -p -k "http://guardian.co.uk" -O "test.html"' had status 1 
2: In download.file("http://guardian.co.uk", "test.html", method = "wget",  :
  download had nonzero exit status

I've checked Sys.which("wget") and the path is set (and I'm not trying to access https, which I think can cause issues).

Once this works, I want to put it into a loop that downloads a set of URLs (and their embedded content) to create a single HTML output.

Answer 1:

Easy solution: just use system to call wget directly:

system("wget http://guardian.co.uk -p -k")

I think the issue is that passing an output file ('test.html') means the -O option gets specified, so you can't also use -p -k. Calling wget directly avoids -O, so it saves the page and its requisites as separate files.
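
For the loop over several URLs mentioned in the question, something along these lines might work. It is only a sketch: wget_args and fetch_pages are hypothetical helper names, the "pages" output directory is an assumption, and the quoting via shQuote assumes a Unix-like shell. It uses system2 rather than system so the arguments can be passed as a vector, and wget's -P option to choose the directory the files are saved under.

```r
# Build the wget arguments for one page (hypothetical helper).
# -p fetches page requisites, -k converts links for local viewing,
# -P sets the directory prefix where files are saved.
wget_args <- function(url, dest_dir) {
  c("-p", "-k", "-P", shQuote(dest_dir), shQuote(url))
}

# Fetch each URL with its requisites; returns wget's exit status per URL
# (hypothetical helper; assumes wget is on the PATH).
fetch_pages <- function(urls, dest_dir = "pages") {
  if (!nzchar(Sys.which("wget"))) stop("wget not found on PATH")
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  vapply(urls, function(u) {
    system2("wget", wget_args(u, dest_dir))
  }, integer(1))
}

# Example call (needs network access, so it is left commented out):
# status <- fetch_pages(c("http://guardian.co.uk"))
```

A non-zero entry in the returned vector means wget reported a problem for that URL. Combining the saved pages into a single HTML output would still be a separate step.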



Tags: r wget