New member here. Trying to download a large number of files from a website in R (but I am open to other suggestions as well, such as wget).
From this post, I understand I must create a vector with the desired URLs. My first problem is writing this vector, since I have 27 states and 34 agencies within each state, and I must download one file for each agency in every state. Whereas the state codes are always two characters, the agency codes are 2 to 7 characters long. The URLs look like this:
http://website.gov/xx_yyyyyyy.zip

where xx is the state code and yyyyyyy is the agency code, 2 to 7 characters long. I am lost as to how to build a loop that generates these URLs.
I assume I can then download this list of URLs with a loop like the following:
for (i in seq_along(urls)) {
  download.file(urls[i], destinations[i], mode = "wb")
}
Does that make sense?
(Disclaimer: an earlier, incomplete version of this post was uploaded by mistake. Sorry!)
This should do the job:
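(A sketch; the states and agencies vectors below are placeholders for your real 27 state codes and 34 agency codes.)

# Placeholder code vectors -- substitute your actual codes
states   <- c("al", "ak", "az")        # 27 two-character state codes
agencies <- c("ag", "dmv", "revenue")  # 34 agency codes, 2 to 7 characters

# Build one URL and one destination file name per state/agency pair
combos <- expand.grid(state = states, agency = agencies,
                      stringsAsFactors = FALSE)
urls         <- sprintf("http://website.gov/%s_%s.zip",
                        combos$state, combos$agency)
destinations <- sprintf("%s_%s.zip", combos$state, combos$agency)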
Then loop through the urls vector to pull the zip files; it will be faster if you use an apply function. The code below will instead download them in batches and take advantage of the speedier simultaneous downloading capability of download.file(), if the libcurl method is available on your installation of R:
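(A sketch of the batched approach, reusing the urls and destinations vectors built above; the batch size of 10 is arbitrary. With method = "libcurl", download.file() accepts vectors of URLs and destinations and fetches each batch simultaneously.)

# Split the file indices into batches of 10
batches <- split(seq_along(urls), ceiling(seq_along(urls) / 10))
for (idx in batches) {
  # One call per batch; libcurl downloads the batch's files in parallel
  download.file(urls[idx], destinations[idx],
                method = "libcurl", mode = "wb")
}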
option is available on your installation of R:If all your agency codes are the same within each state code you could use the below to create your vector of urls to loop through. (You will also need a vector of destinations the same size).
You can then try looping through the files with something like this:
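(Along the lines of the loop in your question, indexing both vectors by i.)

for (i in seq_along(urls)) {
  download.file(urls[i], destinations[i], mode = "wb")
}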
Apply functions are another way of looping through items: an apply function applies a function to every element of a list or vector.
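For instance, mapply() walks the urls and destinations vectors in parallel, calling download.file() once per pair:

mapply(download.file, url = urls, destfile = destinations,
       MoreArgs = list(mode = "wb"))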