For research purposes I'm trying to crawl the public Docker registry (https://registry.hub.docker.com/) and find out 1) how many layers an average image has and 2) the sizes of these layers, to get an idea of the distribution.
I have studied the API, the public libraries, and the details on GitHub, but I can't find any method to:
- retrieve all the public repositories/images (even if those are thousands I still need a starting list to iterate through)
- find all the layers of an image
- find the size of a layer (not of an image, but of the individual layer).
Can anyone help me find a way to retrieve this information?
Thank you!
EDIT: Can anyone verify that searching for '*' in the Docker registry returns all the repositories, and not just those that mention '*' somewhere? https://registry.hub.docker.com/search?q=*
You can find the layers of the images in the folder /var/lib/docker/aufs/layers, provided you configured aufs as the storage driver (the default option).
Example:
Now, to view the layers of the containers that were created with the image "Ubuntu", go to the /var/lib/docker/aufs/layers directory and cat the file whose name starts with the container ID (here it is 0ca502fa6aae*).
This shows the same result as running the docker history command.
To view the full layer IDs, run the history command with the --no-trunc option.
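If you want to do the same lookup from a script instead of with cat, here is a minimal Python sketch. It assumes the aufs layers file simply lists one parent layer ID per line, and that Docker may keep a companion "-init" entry next to a container's file. The function name read_parent_layers and the layers_dir parameter are my own, not part of Docker.

```python
from pathlib import Path

# Assumed default location for the aufs storage driver; adjust if your Docker root differs.
AUFS_LAYERS_DIR = Path("/var/lib/docker/aufs/layers")


def read_parent_layers(id_prefix, layers_dir=AUFS_LAYERS_DIR):
    """Return the layer IDs listed in the aufs layers file matching id_prefix.

    id_prefix may be a truncated ID such as "0ca502fa6aae".
    """
    layers_dir = Path(layers_dir)
    # Match by prefix, skipping the companion "-init" entry (assumption, see above).
    matches = sorted(
        p for p in layers_dir.iterdir()
        if p.is_file()
        and p.name.startswith(id_prefix)
        and not p.name.endswith("-init")
    )
    if not matches:
        raise FileNotFoundError(f"no layers file starting with {id_prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"prefix {id_prefix!r} is ambiguous: {len(matches)} matches")
    # One layer ID per line; ignore blank lines.
    return [
        line.strip()
        for line in matches[0].read_text().splitlines()
        if line.strip()
    ]
```

Calling read_parent_layers("0ca502fa6aae") on the Docker host (usually as root) should then give you the list of layer IDs, and its length is the layer count for that container.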
https://hub.docker.com/search?q=* shows all the images in the entire Docker Hub. It's not possible to get this via the search command, as it doesn't accept wildcards.
As of v1.10 you can find all the layers in an image by pulling it and using these commands:
3) The size can be found in
/var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/size
although LAYERID != the diff_ids found with the previous command. For this you need to look at /var/lib/docker/image/aufs/layerdb/sha256/{LAYERID}/diff
and compare with the previous command output to properly match the correct diff_id and size.
One more tool: https://github.com/CenturyLinkLabs/dockerfile-from-image
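The diff_id-to-size matching through the layerdb directory described above can also be scripted. This is only a sketch: it assumes each {LAYERID} directory holds a diff file with a single diff_id and a size file with a byte count as plain text. The names sizes_by_diff_id and layer_sizes are made up for illustration.

```python
from pathlib import Path

# Path taken from the answer above; it assumes the aufs storage driver.
LAYERDB_DIR = Path("/var/lib/docker/image/aufs/layerdb/sha256")


def sizes_by_diff_id(layerdb_dir=LAYERDB_DIR):
    """Map each diff_id found under layerdb_dir to the size recorded next to it."""
    table = {}
    for entry in Path(layerdb_dir).iterdir():
        diff_file = entry / "diff"
        size_file = entry / "size"
        # Skip anything that does not look like a layer entry.
        if not (diff_file.is_file() and size_file.is_file()):
            continue
        diff_id = diff_file.read_text().strip()
        table[diff_id] = int(size_file.read_text().strip())
    return table


def layer_sizes(diff_ids, layerdb_dir=LAYERDB_DIR):
    """Return (diff_id, size) pairs in the given order; size is None if not found."""
    table = sizes_by_diff_id(layerdb_dir)
    return [(diff_id, table.get(diff_id)) for diff_id in diff_ids]
```

Pass layer_sizes the list of diff_ids of the image you pulled and you get the per-layer sizes in the same order. Reading /var/lib/docker normally requires root.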
For a GUI, use ImageLayers.io.