Why does wget ignore the query string in the URL?

Posted 2019-01-25 02:47

Question:

I want to use wget to download the following 18 html files:

http://www.ted.com/talks/quick-list?sort=date&order=desc&page=18  
http://www.ted.com/talks/quick-list?sort=date&order=desc&page=17  
...  
http://www.ted.com/talks/quick-list?sort=date&order=desc&page=1

No matter what comes after page=, it always downloads the first page of the listing. Do I have to escape some characters in the URLs? If so, how?

Answer 1:

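& is a special character in most shells: left unquoted, it ends the command and runs it in the background, so wget only receives the URL up to the first & (here, everything through ?sort=date) and page= never reaches the server. Quote the URL so that the whole string is passed to wget as a single argument: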

wget "http://www.ted.com/talks/quick-list?sort=date&order=desc&page=18"
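
To fetch all 18 pages with the same quoting, a loop along these lines might work. This is only a sketch: the helper names `page_url` and `fetch_all`, the `FETCH` override variable, and the `page-N.html` output names are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/sh
# Build the URL for one page number. The double quotes keep "&" literal,
# so the shell does not treat it as a background operator.
page_url() {
    printf '%s\n' "http://www.ted.com/talks/quick-list?sort=date&order=desc&page=$1"
}

# Download pages 18 down to 1, one output file per page.
# FETCH is a hypothetical override: set FETCH=echo to print the
# commands instead of running wget.
fetch_all() {
    page=18
    while [ "$page" -ge 1 ]; do
        ${FETCH:-wget} -O "page-${page}.html" "$(page_url "$page")"
        page=$((page - 1))
    done
}
```

Calling `fetch_all` starts the downloads. Running it once with `FETCH=echo` set shows the exact wget commands first, which makes it easy to confirm that each URL still ends in its page= parameter.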


Answer 2:

  1. Store your list of URLs in a file, one URL per line (printf writes each quoted argument on its own line):

    printf '%s\n' "http://www.ted.com/talks/quick-list?sort=date&order=desc&page=18" "http://www.ted.com/talks/quick-list?sort=date&order=desc&page=17" ... > wget_filelist.txt

  2. Call wget with -i so that it reads the URLs from that file:

    wget -i wget_filelist.txt
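
Typing 18 URLs by hand is error-prone, so the list file could also be generated. The following is a minimal sketch, assuming the page numbers run from 18 down to 1 as in the question; the `list_file` variable is introduced here only for illustration.

```shell
#!/bin/sh
# Write one URL per line, pages 18 down to 1, into the list file.
list_file="wget_filelist.txt"

: > "$list_file"    # create the file, or empty it if it already exists

page=18
while [ "$page" -ge 1 ]; do
    # Double quotes keep "&" literal while still expanding ${page}.
    printf '%s\n' "http://www.ted.com/talks/quick-list?sort=date&order=desc&page=${page}" >> "$list_file"
    page=$((page - 1))
done
```

After this runs, `wget -i wget_filelist.txt` works as in step 2. No quoting is needed inside the file itself, because wget reads it directly and the shell never parses its contents.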