For research purposes I'm trying to crawl the public Docker registry ( https://registry.hub.docker.com/ ) and find out 1) how many layers an average image has and 2) the sizes of these layers to get an idea of the distribution.
I have studied the API and public libraries, as well as the details on GitHub, but I can't find any method to:
- retrieve all the public repositories/images (even if those are thousands I still need a starting list to iterate through)
- find all the layers of an image
- find the size of a layer (not of an image, but of the individual layer).
Can anyone help me find a way to retrieve this information?
Thank you!
EDIT: Is anyone able to verify that searching for '*' in the Docker registry returns all the repositories, and not just anything that mentions '*' anywhere? https://registry.hub.docker.com/search?q=*
Here is a good article: "Show Layers of Docker Image".
You can first find the image ID:
Then find its layers and their sizes:
Note: I'm using Docker version 1.13.1
In my opinion, docker history <image> is sufficient. This returns the size of each layer.
What surprised me is that just changing the owner created a huge blob.
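For collecting these per-layer sizes in a script rather than reading them off the terminal, here is a minimal Python sketch. It assumes a Docker release recent enough to support `--format` on `docker history`; the function names are made up for illustration and are not part of any Docker library.

```python
import subprocess
from typing import List


def parse_history_sizes(output: str) -> List[int]:
    """Turn one-size-per-line text into a list of byte counts."""
    return [int(line) for line in output.splitlines() if line.strip()]


def history_layer_sizes(image: str) -> List[int]:
    """Return per-layer sizes in bytes for a local image, newest layer first."""
    # --human=false prints raw byte counts instead of values like "5.6MB".
    # --format "{{.Size}}" keeps only the size column (assumes the installed
    # Docker supports --format on `docker history`).
    result = subprocess.run(
        ["docker", "history", "--human=false", "--format", "{{.Size}}", image],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_history_sizes(result.stdout)


# Example (needs a running Docker daemon and a locally available image):
#   sizes = history_layer_sizes("ubuntu:latest")
```

Entries with a size of 0 are typically metadata-only steps such as CMD or ENV, so you may want to filter them out before computing a distribution.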
You can check out dive, which is written in Go.
It's an awesome tool.
You can adjust the source code so that it exports all the info it shows into a json file.
They have a very good answer here: https://stackoverflow.com/a/32455275/165865
Just run the image below:
I've solved this problem by using the search function on Docker's website, where '*' is a valid search that returns 200k repositories, and then I crawled each individual page. HTML parsing allows me to extract all the image names on each page.
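Once you have the image names, the remaining two points (layers per image and size per layer) can be read from an image manifest. The sketch below is not what I ran for the crawl; it assumes you have already fetched a manifest in the Image Manifest V2, Schema 2 format (the JSON served by `GET /v2/<name>/manifests/<reference>` in the Registry HTTP API v2) and parsed it into a dict. Fetching and authentication are left out, and `layer_sizes` and `summarize` are hypothetical helper names.

```python
from statistics import mean, median
from typing import Dict, List, Tuple


def layer_sizes(manifest: Dict) -> List[Tuple[str, int]]:
    """Return (digest, size_in_bytes) for every layer in a schema 2 manifest."""
    # Assumption: `manifest` is a single-image manifest with a "layers" list.
    # Multi-architecture images return a manifest list instead, which has
    # no "layers" key and yields an empty result here.
    return [(layer["digest"], layer["size"]) for layer in manifest.get("layers", [])]


def summarize(sizes: List[int]) -> Dict[str, float]:
    """Basic statistics over a list of layer sizes in bytes."""
    if not sizes:
        return {"count": 0, "mean": 0.0, "median": 0.0}
    return {"count": len(sizes), "mean": mean(sizes), "median": median(sizes)}
```

Note that the sizes in a manifest are those of the compressed layer blobs as stored in the registry, so they will differ from the uncompressed sizes that docker history shows for the same image.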
This will inspect the docker image and print the layers:
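A minimal Python sketch of the same idea, assuming the `docker image inspect` subcommand is available (Docker 1.13 or later). The function names are illustrative only. The JSON that Docker prints is an array with one object per image, and the layer digests sit under `RootFS.Layers`.

```python
import json
import subprocess
from typing import List


def layers_from_inspect(inspect_json: str) -> List[str]:
    """Extract the layer digests from `docker image inspect` JSON output."""
    data = json.loads(inspect_json)
    # The output is a JSON array; this sketch takes the first (only) image.
    return data[0]["RootFS"]["Layers"]


def print_layers(image: str) -> None:
    """Inspect a local image and print one layer digest per line."""
    result = subprocess.run(
        ["docker", "image", "inspect", image],
        capture_output=True,
        text=True,
        check=True,
    )
    for digest in layers_from_inspect(result.stdout):
        print(digest)


# Example (needs a running Docker daemon and a locally available image):
#   print_layers("ubuntu:latest")
```

This gives you the number of layers and their digests, but not their sizes; for sizes you still need docker history or the registry manifest.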