wget download with multiple simultaneous connections

Posted 2019-01-12 14:01

I'm using wget to download website content, but wget downloads the files one by one.

How can I make wget download using 4 simultaneous connections?

Tags: download wget
14 answers
冷血范
#2 · 2019-01-12 14:30

Use xargs to run wget on multiple files in parallel:

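#!/bin/bash

# Download a single URL; extra wget options can be added here.
mywget()
{
    wget "$1"
}

# Export the function so the bash started by xargs can see it.
export -f mywget

# Run up to 8 wget processes at a time, one URL (one input line) per process.
# The URL is passed as "$1" instead of being pasted into the command string,
# so quotes or other shell characters in a URL are not re-parsed by the inner bash.
xargs -P 8 -I {} bash -c 'mywget "$1"' _ {} < list_urls.txt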
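To check the parallelism without touching the network, the same xargs pattern can be rehearsed with echo standing in for wget. This is only a minimal sketch: the helper name fetch_parallel and its argument order are made up for illustration, and it assumes an xargs that supports -P (GNU and BSD xargs do).

```shell
# fetch_parallel LIST JOBS [COMMAND...]
# Hypothetical helper: runs COMMAND (default: wget) once per URL in LIST,
# keeping at most JOBS processes running at the same time.
fetch_parallel() (
    list=$1
    jobs=$2
    shift 2
    # Fall back to plain wget when no command was given.
    if [ "$#" -eq 0 ]; then
        set -- wget
    fi
    # -n 1 hands one URL to each invocation; -P caps the concurrent processes.
    xargs -P "$jobs" -n 1 "$@" < "$list"
)
```

Running fetch_parallel list_urls.txt 4 echo prints each URL once (the order may vary); dropping the trailing echo downloads for real with up to four wget processes.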

aria2 options: the right way to handle files smaller than 20 MB

aria2c -k 2M -x 10 -s 10 [url]

-k 2M splits the file into 2 MB chunks.

-k (--min-split-size) defaults to 20M. If you don't set this option and the file is smaller than 20 MB, aria2 uses a single connection no matter what values you give -x or -s.
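The effect of -k can be worked through with a little arithmetic. The sketch below is a rough estimate, not aria2's actual scheduling: it assumes sizes in whole MiB, assumes -x is at least as large as -s (as in the command above), and follows the manual's description that aria2 does not split a range smaller than twice the minimum split size. The function name estimate_connections is hypothetical.

```shell
# estimate_connections SIZE_MIB MIN_SPLIT_MIB SPLIT
# Rough estimate: the file yields SIZE_MIB / MIN_SPLIT_MIB pieces (integer
# division), never fewer than 1 and never more than SPLIT (the -s value).
estimate_connections() (
    size=$1
    min_split=$2
    split=$3
    n=$(( size / min_split ))
    # A file smaller than the minimum split size still uses one connection.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    # -s is the upper bound on connections for a single download.
    if [ "$n" -gt "$split" ]; then
        n=$split
    fi
    echo "$n"
)
```

For a 15 MiB file, estimate_connections 15 2 10 prints 7, while estimate_connections 15 20 10 (the default 20M) prints 1, which matches the single-connection behavior described above.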

淡お忘
#3 · 2019-01-12 14:31

Wget does not support multiple socket connections to speed up file downloads.

I think we can do a bit better than gmarian's answer.

The correct way is to use aria2.

aria2c -x 16 -s 16 [url]
#          |    |
#          |    |
#          |    |
#          ---------> the number of connections here
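For the four connections asked about in the question, the same flags can be wrapped in a small helper. This is a sketch under assumptions: the name aria2_fetch and the ARIA2C override variable are hypothetical and exist only so the command can be rehearsed with echo. The clamp reflects the aria2 manual, which lists 1 to 16 as the accepted range for -x.

```shell
# aria2_fetch URL [CONNECTIONS]
# Hypothetical wrapper: downloads URL with the same value for -x and -s,
# clamped to the 1..16 range that -x accepts. Defaults to 4 connections.
aria2_fetch() (
    url=$1
    conns=${2:-4}
    if [ "$conns" -gt 16 ]; then
        conns=16
    fi
    if [ "$conns" -lt 1 ]; then
        conns=1
    fi
    # ARIA2C is an illustrative override; set it to "echo" for a dry run.
    ${ARIA2C:-aria2c} -x "$conns" -s "$conns" "$url"
)
```

With ARIA2C=echo set, aria2_fetch [url] 32 prints the arguments with the connection count reduced to 16 instead of starting a download.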
够拽才男人
#4 · 2019-01-12 14:31

Since GNU parallel has not been mentioned yet, let me offer another way:

cat url.list | parallel -j 8 wget -O {#}.html {}
相关推荐>>
#5 · 2019-01-12 14:33

A new (but not yet released) tool is Mget. It already has many of the options known from Wget and comes with a library that lets you easily embed (recursive) downloading into your own application.

To answer your question:

mget --num-threads=4 [url]

UPDATE

Mget is now developed as Wget2 with many bugs fixed and more features (e.g. HTTP/2 support).

--num-threads is now --max-threads.

Evening l夕情丶
#6 · 2019-01-12 14:34

As other posters have mentioned, I'd suggest you have a look at aria2. From the Ubuntu man page for version 1.16.1:

aria2 is a utility for downloading files. The supported protocols are HTTP(S), FTP, BitTorrent, and Metalink. aria2 can download a file from multiple sources/protocols and tries to utilize your maximum download bandwidth. It supports downloading a file from HTTP(S)/FTP and BitTorrent at the same time, while the data downloaded from HTTP(S)/FTP is uploaded to the BitTorrent swarm. Using Metalink's chunk checksums, aria2 automatically validates chunks of data while downloading a file like BitTorrent.

You can use the -x flag to specify the maximum number of connections per server (default: 1):

aria2c -x 16 [url] 

If the same file is available from multiple locations, you can choose to download from all of them. Use the -j flag to specify the maximum number of parallel downloads for every static URI (default: 5).

aria2c -j 5 [url] [url2]

Have a look at http://aria2.sourceforge.net/ for more information. For usage information, the man page is very descriptive and has a section at the bottom with usage examples. An online version can be found at http://aria2.sourceforge.net/manual/en/html/README.html.

Melony?
#7 · 2019-01-12 14:35

Another program that can do this is axel.

axel -n <NUMBER_OF_CONNECTIONS> URL

Ubuntu man page.
