How to get remote file size from a shell script?

Posted 2019-01-30 17:39

Is there a way to get the size of a remote file like

http://api.twitter.com/1/statuses/public_timeline.json

in shell script?

11 answers
Summer. ? 凉城
#2 · 2019-01-30 18:27

You can download the file and get its size. But we can do better.

Use curl to get only the response header using the -I option.

In the response header look for Content-Length: which will be followed by the size of the file in bytes.

$ URL="http://api.twitter.com/1/statuses/public_timeline.json"
$ curl -sI "$URL" | grep -i Content-Length
Content-Length: 134

To get just the size, use a filter to extract the numeric part from the output above, and strip the trailing carriage return that HTTP header lines carry:

$ curl -sI "$URL" | grep -i Content-Length | awk '{print $2}' | tr -d '\r'
134
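
The filtering step can be wrapped in a small function. This is only a sketch: the name content_length is hypothetical, and it assumes the headers arrive on standard input exactly as curl -sI prints them. It matches the header name case-insensitively and removes the carriage returns before parsing.

```shell
# content_length: read raw HTTP response headers on stdin and print the value
# of the first Content-Length header. The function name is hypothetical.
content_length() {
  tr -d '\r' |    # HTTP header lines end in \r\n; drop the \r
    awk -F': *' '!seen && tolower($1) == "content-length" { print $2; seen = 1 }'
}

# Intended usage (needs network access):
#   curl -sI "$URL" | content_length
```

If the server sends no Content-Length header, the function prints nothing, so a script should check for an empty result.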
姐就是有狂的资本
#3 · 2019-01-30 18:31

This shows detailed progress information about the download as it runs. Note that it downloads the whole file rather than only the headers. Specify a URL as in the example below.

$ curl -O -w 'We downloaded %{size_download} bytes\n' https://cmake.org/files/v3.8/cmake-3.8.2.tar.gz

output

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 7328k  100 7328k    0     0   244k      0  0:00:29  0:00:29 --:--:--  365k
We downloaded 7504706 bytes

To automate this, add the command to your script file.
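
In a script you usually want the number in a variable rather than on the terminal. A minimal sketch, assuming you are willing to transfer the whole file and throw the body away; the function name downloaded_size is hypothetical:

```shell
# downloaded_size: fetch a URL, discard the body, and print the number of
# bytes transferred. Unlike `curl -I`, this transfers the whole file.
# The function name is hypothetical.
downloaded_size() {
  curl -s -o /dev/null -w '%{size_download}\n' "$1"
}

# Intended usage (needs network access):
#   size=$(downloaded_size "https://cmake.org/files/v3.8/cmake-3.8.2.tar.gz")
```

This can serve as a fallback when a server does not report Content-Length, at the cost of the full transfer.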

狗以群分
#4 · 2019-01-30 18:36

The preceding answers won't work when there are redirections. For example, to get the size of the Debian ISO DVD, one must use the --location option; otherwise, the reported size may be that of the 302 Moved Temporarily response body, not that of the real file.
Suppose you have the following URL:

$ url=http://cdimage.debian.org/debian-cd/8.1.0/amd64/iso-dvd/debian-8.1.0-amd64-DVD-1.iso

With curl, you could obtain:

$ curl --head --location "${url}"
HTTP/1.0 302 Moved Temporarily
...
Content-Type: text/html; charset=iso-8859-1
...

HTTP/1.0 200 OK
...
Content-Length: 3994091520
...
Content-Type: application/x-iso9660-image
...

That's why I prefer using HEAD, which is an alias for the lwp-request command from the libwww-perl package (on Debian). Another advantage is that it strips the extra \r characters, which eases subsequent string processing.

So to retrieve the size of the debian iso DVD, one could do for example:

$ size=$(HEAD "${url}")
$ size=${size##*Content-Length: }
$ size=${size%%[[:space:]]*}

Please note that:

  • this method launches only one process
  • the ${var##pattern} and ${var%%pattern} expansions it uses are specified by POSIX, so it should work in bash and in other POSIX-compatible shells

For shells that lack these expansions, you may have to resort to sed, awk, grep, etc.
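
As a sketch of the awk route, the function below reads the headers of a whole redirect chain (as printed by curl -sIL, where -L is the short form of --location) and prints the Content-Length of the last response only. The name final_content_length is hypothetical, and it assumes every response in the chain starts with an HTTP/ status line.

```shell
# final_content_length: read the headers of a redirect chain on stdin and
# print the Content-Length of the last response. The function name is
# hypothetical.
final_content_length() {
  tr -d '\r' |
    awk -F': *' '
      /^HTTP\//                       { len = "" }   # new response: forget the earlier value
      tolower($1) == "content-length" { len = $2 }
      END                             { if (len != "") print len }
    '
}

# Intended usage (needs network access):
#   size=$(curl -sIL "${url}" | final_content_length)
```

Resetting the value on each status line keeps the size of a 302 response body from being reported when the final response carries no Content-Length of its own.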

甜甜的少女心
#5 · 2019-01-30 18:40

I think the easiest way to do this would be to:

  1. run cURL in silent mode (-s),

  2. request only the headers with -I (so as to avoid downloading the whole file),

  3. then do a case-insensitive grep (-i) for the Content-Length header,

  4. and print the second field using awk '{print $2}'.

  5. The output is the size in bytes.

Examples:

curl -sI http://api.twitter.com/1/statuses/public_timeline.json | grep -i content-length | awk '{print $2}'

//output: 52

or

curl -sI https://code.jquery.com/jquery-3.1.1.min.js | grep -i content-length | awk '{print $2}'

//output: 86709

or

curl -sI http://download.thinkbroadband.com/1GB.zip | grep -i content-length | awk '{print $2}'

//output: 1073741824

Show as Kilobytes/Megabytes

If you would like to show the size in Kilobytes then change the awk to:

awk '{print $2/1024}'

or Megabytes

awk '{print $2/1024/1024}'
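
Rather than hard-coding the divisor, the unit can be chosen from the size itself. A minimal sketch with a hypothetical function name, human_size, which takes a byte count as its argument and uses binary units (1 KiB = 1024 bytes):

```shell
# human_size: convert a byte count to a human-readable string using binary
# units (1 KiB = 1024 bytes). The function name is hypothetical.
human_size() {
  LC_ALL=C awk -v bytes="$1" 'BEGIN {
    split("B KiB MiB GiB TiB", unit, " ")
    i = 1
    # Divide by 1024 until the value fits the current unit.
    while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
    printf "%.1f %s\n", bytes, unit[i]
  }'
}

human_size 86709        # prints "84.7 KiB"
human_size 1073741824   # prints "1.0 GiB"
```

The argument would typically be the byte count printed by one of the curl pipelines above.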
该账号已被封号
#6 · 2019-01-30 18:40

I have a shell function, based on codaddict's answer, which gives a remote file's size in a human-readable format thusly:

remote_file_size () {
  # Print the size of the remote file at the given URL in human-readable form.
  # grep -i is used because servers may send the header name in lowercase.
  # tr strips spaces, tabs, newlines and carriage returns from the value.
  printf "%q" "$*"           |
    xargs curl -sI           |
    grep -i Content-Length   |
    awk '{print $2}'         |
    tr -d '\040\011\012\015' |
    gnumfmt --to=iec-i --suffix=B
  # The `g' prefix on `numfmt' is only for systems that lack the GNU coreutils
  # by default, i.e., non-Linux systems. If you're on Linux, remove the letter
  # `g'; if you're on BSD or Mac, install the GNU coreutils.
}