Curl to grab remote filename after following locat

Posted 2019-01-30 02:04

Question:

When downloading a file using curl, how would I follow a link location and use that for the output filename (without knowing the remote filename in advance)?

For example, clicking the link below downloads a file named "pythoncomplete.vim". However, with curl's -O and -L options, the output filename is simply the last part of the original URL, a clumsy "download_script.php?src_id=10872".

curl -O -L "http://www.vim.org/scripts/download_script.php?src_id=10872"

In order to download the file with the correct filename you would have to know the name of the file in advance:

curl -o pythoncomplete.vim -L "http://www.vim.org/scripts/download_script.php?src_id=10872"

Is there a way to download the file without knowing the name in advance? If not, is there another way to quickly pull down a redirected file from the command line?

Answer 1:

If you have a recent version of curl (7.21.2 or later), see @jmanning2k's answer.

If you have an older version of curl (like 7.19.7, which came with Snow Leopard), make two requests: a HEAD to get the filename from the response header, then a GET:

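url="http://www.vim.org/scripts/download_script.php?src_id=10872"
# HEAD request: take the filename from the Content-Disposition header (strip the trailing CR)
filename=$(curl -sI "$url" | tr -d '\r' | grep -o -E 'filename=.*$' | sed -e 's/filename=//')
# GET request: follow redirects and save under that name
curl -o "$filename" -L "$url"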


Answer 2:

The remote side sends the filename using the Content-Disposition header.

curl 7.21.2 or newer does this automatically if you specify --remote-header-name / -J.

curl -O -J -L "$url"


Answer 3:

I wanted to comment on jmanning2k's answer, but as a new user I can't. I tried to edit his post instead, which is allowed, but the edit was rejected because it should have been a comment. Sigh.

Anyway, please treat this as a comment on his answer.

This only seems to work if the header looks like filename=pythoncomplete.vim, as in the example. Some sites send a header that looks like filename*=UTF-8''filename.zip, and that form isn't recognized by curl 7.28.0.
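
For headers in the extended filename*= form, one possible workaround is to parse the header yourself. The following is only a sketch, assuming bash and the charset'language'value layout described above; extended_filename is a hypothetical helper name, not part of curl. It reads raw response headers on stdin and percent-decodes the value.

```shell
# Hypothetical helper (bash): read raw response headers on stdin and print
# the filename carried in an extended "filename*=" parameter, if any.
extended_filename() {
    local value
    # Drop CRs, then capture what follows filename*=charset'language'
    value="$(tr -d '\r' | sed -n "s/.*filename\*=[^']*'[^']*'\([^;]*\).*/\1/p" | head -n 1)"
    # No extended parameter found: signal failure to the caller
    [ -n "$value" ] || return 1
    # Percent-decode: rewrite %XX as \xXX and let printf %b expand it
    printf '%b\n' "${value//%/\\x}"
}
```

It could be combined with a HEAD request, for example filename="$(curl -sI "$url" | extended_filename)", falling back to the plain filename= handling when it returns a non-zero status.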



Answer 4:

If you can use wget instead of curl:

wget --content-disposition "$url"


Answer 5:

I wanted a solution that worked on both older and newer Macs, and the legacy code David provided for Snow Leopard did not behave well under Mavericks. Here's a function I created based on David's code:

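function getUriFilename() {
    local header filename
    # Fetch only the response headers and strip carriage returns
    header="$(curl -sI "$1" | tr -d '\r')"

    # Prefer the filename from the Content-Disposition header
    filename="$(echo "$header" | grep -o -E 'filename=.*$')"
    if [[ -n "$filename" ]]; then
        echo "${filename#filename=}"
        return
    fi

    # Otherwise fall back to the last path segment of the Location header
    filename="$(echo "$header" | grep -o -E 'Location:.*$')"
    if [[ -n "$filename" ]]; then
        basename "${filename#Location\:}"
        return
    fi

    return 1
}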

With this defined, you can run:

url="http://www.vim.org/scripts/download_script.php?src_id=10872"
filename="$(getUriFilename "$url")"
curl -L "$url" -o "$filename"


Answer 6:

Please note that certain misconfigured web servers send the name using "Filename" as the key, whereas RFC 2183 specifies that it should be "filename". curl only handles the latter case.
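
If you run into such a server, a case-insensitive match on the header is one way around it. This is a minimal sketch rather than anything curl provides: disposition_filename is a hypothetical helper that reads raw response headers on stdin and prints the plain filename= value, whatever the capitalization of the key, with surrounding double quotes removed. It prints nothing if no such parameter is present.

```shell
# Hypothetical helper: read raw response headers on stdin and print the
# plain filename= value, matching the parameter name case-insensitively.
disposition_filename() {
    tr -d '\r' |
        grep -i '^content-disposition:' |
        sed -n 's/.*[Ff][Ii][Ll][Ee][Nn][Aa][Mm][Ee]="\{0,1\}\([^";]*\).*/\1/p' |
        head -n 1
}
```

A possible use is filename="$(curl -sI "$url" | disposition_filename)", followed by curl -o "$filename" -L "$url" once you have checked that the result is not empty.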



Answer 7:

Here is an example that uses the answer above to pull the latest version of an artifact from an Apache Archiva repository. curl returns the Location line, and the filename is the last part of that line. The CR at the end of the filename needs to be removed.

url="http://archiva:8080/restServices/archivaServices/searchService/artifact?g=com.imgur.backup&a=snapshot-s3-util&v=LATEST"
filename=$(curl -sI -u user:password "$url" | grep Location | awk -F/ '{print $NF}' | tr -d '\r')
curl --silent -o "$filename" -L -u user:password "$url"
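
The Location approach can be made a little more defensive. The sketch below makes some assumptions: location_filename is a hypothetical helper, and its input is the header output of a whole redirect chain (for example from curl -sIL). It takes the last Location header, drops any query string or fragment, and prints the final path segment. If the final URL ends in a slash, it prints an empty line.

```shell
# Hypothetical helper: read the headers of a redirect chain on stdin and
# print the last path segment of the final Location header.
location_filename() {
    tr -d '\r' |
        grep -i '^location:' |
        tail -n 1 |
        sed -e 's/^[^:]*:[[:space:]]*//' -e 's/[?#].*$//' -e 's|.*/||'
}
```

For instance, filename="$(curl -sIL -u user:password "$url" | location_filename)" would pick the name from the last redirect rather than the first.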