How to pull all alternative tags of a docker image

Posted 2019-07-23 20:11

Question:

I administer a GitLab instance with a build pipeline. All components are encapsulated in Docker images from the official GitLab maintainer.

Whenever I update, usually once a week, I need to check whether gitlab/gitlab-runner-helper still works with the current latest version of GitLab. This can only be checked by executing a pipeline. If it does not work, the log tells me exactly which image it needs, and I proceed to pull it.

The image in question is also tagged with a latest tag, which I cannot use because of the hard dependency on the fixed, version-specific tag.

$docker image ls
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
gitlab/gitlab-runner-helper   x86_64-8af42251     1ee5a99eba5f        20 hours ago        43.7MB
gitlab/gitlab-runner-helper   x86_64-latest       1ee5a99eba5f        20 hours ago        43.7MB

To automate my update process, I'd like to know how I can pull the latest image together with all of its alternative tags.

The man page of docker pull says there is an --all-tags option that pulls every tagged image from the repository, but it cannot be combined with a tag.

Answer 1:

As far as I know, there is no really efficient or built-in way to do this. Instead, you need to query your registry via REST, first for the tag list for that repository:

GET http://<registry>/v2/<repository>/tags/list

Then, for each tag, a manifest:

GET http://<registry>/v2/<repository>/manifests/<tag>

Each manifest will have a hash associated with it, which you should be able to get from the HTTP headers of the response. You may even be able to make a HEAD request for it and avoid the rest of the manifest payload, but I haven't tried this recently.
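
The header lookup described above could be sketched roughly as follows. This is a minimal sketch, not a tested recipe: it assumes the registry reports the manifest digest in a Docker-Content-Digest response header and that a bearer token has already been obtained. The function names extract_digest and manifest_digest are hypothetical helpers, not part of any Docker tooling.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of the Docker-Content-Digest header (header names are case-insensitive).
extract_digest() {
  tr -d '\r' | awk '{
    i = index($0, ":")
    if (i && tolower(substr($0, 1, i - 1)) == "docker-content-digest") {
      v = substr($0, i + 1)
      sub(/^[ \t]+/, "", v)   # drop the whitespace after the colon
      print v
      exit
    }
  }'
}

# Hypothetical helper: send a HEAD request for one tag and print its digest.
# Arguments: registry base URL, repository, tag, bearer token.
# The Accept header asks for current manifest formats (an assumption about
# what the registry serves for this repository).
manifest_digest() {
  curl -fsSI \
    -H "Authorization: Bearer $4" \
    -H "Accept: application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json" \
    "$1/v2/$2/manifests/$3" | extract_digest
}

# Example call (needs network access and a valid token in $TOKEN):
#   manifest_digest https://registry.hub.docker.com gitlab/gitlab-runner-helper x86_64-latest "$TOKEN"
```

Because only the headers are transferred, this avoids downloading the manifest body for every tag.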

Now you have a list of tags and manifest hashes, and you just need to find all the tags with hashes that match the latest tag.
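
The matching step itself needs no network access. As a minimal sketch, assume the tags and digests have been collected as one "<tag> <digest>" pair per line; that input format and the function name tags_matching are assumptions made for illustration.

```shell
# Hypothetical helper: print every tag that shares its digest with the
# reference tag, excluding the reference tag itself.
# stdin: one "<tag> <digest>" pair per line; $1: reference tag.
# Exits with status 1 if the reference tag is not in the input.
tags_matching() {
  awk -v ref="$1" '
    NF >= 2 { digest[$1] = $2; order[++n] = $1 }
    END {
      if (!(ref in digest)) exit 1
      for (i = 1; i <= n; i++)
        if (order[i] != ref && digest[order[i]] == digest[ref])
          print order[i]
    }'
}

# Example call, with digests.txt being a file in the format described above:
#   tags_matching x86_64-latest < digests.txt
```

Each tag printed this way is an alternative name for the image that the reference tag currently points to, so it can be passed straight to docker pull.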

This is a little tedious, but it's actually not that bad to script out with curl and jq, especially if you don't need to worry about security.


Script:

#!/bin/sh

REPO=gitlab/gitlab-runner-helper
REGISTRY=https://registry.hub.docker.com

# Request an anonymous pull token for the repository.
TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:$REPO:pull" | jq -r '.token')
# List all tags and keep only the x86_64 ones.
TAGS=$(curl -s "$REGISTRY/v2/$REPO/tags/list" -H "Authorization: Bearer $TOKEN" | jq -r '.tags[]' | grep x86_64)

# Print the layer digests of a tag's manifest.
# Note: the fsLayers field exists only in schema 1 manifests.
layer_sums() {
  curl -s "$REGISTRY/v2/$REPO/manifests/$1" -H "Authorization: Bearer $TOKEN" | jq -r '.fsLayers[].blobSum'
}

# Make sure the cache file exists so that grep does not fail on the first run.
touch tags.list

# The reference digests do not change during a run, so fetch them only once.
latestSHA=$(layer_sums x86_64-latest)

for tag in $TAGS
do
  # is $tag an old entry?
  if grep -Fxq -- "$tag" tags.list
  then
    # already processed
    continue
  fi
  echo "new tag found: $tag"
  newSHA=$(layer_sums "$tag")
  # Require a non-empty reference so two failed lookups do not count as a match.
  if [ -n "$latestSHA" ] && [ "$newSHA" = "$latestSHA" ]
  then
    echo "$tag is new latest version"
    docker pull "$REPO:$tag"
    echo "$tag" >> tags.list
  fi
done

The script above uses a file named tags.list, placed next to it. This file records the tags that have already been processed, so that the script does not issue 500+ HTTP requests on every run. A tag from TAGS that is not yet in the file is not necessarily the latest: sometimes tags appear that only later become the latest version. Such tags are probed but not written to the file. This might become an issue in the future if those versions are skipped as latest.

Note: The script above only focuses on a specific subset of tags (x86_64).