In my Dockerfile I use curl or ADD to download the latest version of an archive, like this:
FROM debian:jessie
...
RUN apt-get install -y curl
...
RUN curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz
...
ADD http://example.com/latest/archive2.tar.gz archive2.tar.gz
...
The RUN statement that uses curl, or the ADD statement, creates its own image layer. That layer will be used as a cache for future executions of docker build.
Question: How can I disable caching for these instructions?
It would be great to get something like cache invalidation working there, e.g. by using HTTP ETags or by querying the Last-Modified header field. That would make it possible to do a quick check based on the HTTP headers to decide whether a cached layer can be used or not.
I know that some dirty tricks could help, e.g. executing a download shell script in the RUN statement instead. Its filename would be changed by our build system before docker build is triggered, and I could do the HTTP checks inside that script. But then I would need to store either the last used ETag or the last modified date in a file somewhere. I am wondering whether there is a cleaner, native Docker feature that I could use here.
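To make the header check concrete, here is a minimal shell sketch of the "store the last ETag in a file" idea, meant to run in the build system before docker build. The function names and the state file are hypothetical, and it assumes the server actually sends an ETag header; it is only one way the check could look, not a native Docker feature.

```shell
#!/bin/sh
# Sketch only: function names and the state file path are hypothetical.

# Print the ETag value found in HTTP response headers read on stdin
# (for example the output of `curl -sIL URL`). With redirects, the last
# ETag seen wins.
extract_etag() {
    tr -d '\r' | awk 'tolower($1) == "etag:" { etag = $2 } END { if (etag != "") print etag }'
}

# Return 0 when the given ETag differs from the one stored in the state
# file (and record the new value); return 1 when it is unchanged.
# An empty ETag is conservatively reported as a change.
etag_changed() {
    new_etag=$1
    state_file=$2
    old_etag=""
    if [ -z "$new_etag" ]; then
        return 0
    fi
    if [ -f "$state_file" ]; then
        old_etag=$(cat "$state_file")
    fi
    if [ "$new_etag" = "$old_etag" ]; then
        return 1
    fi
    printf '%s\n' "$new_etag" > "$state_file"
    return 0
}

# Possible usage (not executed here, it needs network access and Docker):
#   etag=$(curl -sIL http://example.com/latest/archive.tar.gz | extract_etag)
#   if etag_changed "$etag" .last_etag; then
#       docker build --no-cache .
#   else
#       docker build .
#   fi
```

The drawback is the one already mentioned: the state file has to live somewhere that survives between builds.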
Passing an argument from the build file didn't work for me for some reason. I solved it by appending the command I don't want cached to the last CMD instruction.
For example, my CMD now runs foo.py, which I don't want to be cached, and then bar.sh. Not clean, but it works.
Adding && exit 0 after a command changes the instruction text, which invalidates the cache from that step onward on the next build. Example:
RUN apt-get install -y unzip && exit 0
docker build --no-cache would invalidate the cache for all the commands.
The Dockerfile ADD command used to always invalidate the cache, but this has been improved in recent Docker versions: if the added file has changed, the cache should be invalidated for the ADD command.
Issue 1326 mentions other tips.
A build-time argument can be specified to forcibly break the cache from that step onwards. For example, declare an ARG instruction in your Dockerfile just before the step you don't want cached, and then give this argument a fresh value on every new build. The best choice, of course, is a timestamp.
Make sure the value is a string without any spaces, otherwise the Docker client will mistakenly treat it as multiple arguments.
See a detailed discussion on Issue 22832.
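As a rough sketch of the build-argument approach, the wrapper below generates a space-free timestamp and passes it to docker build. The argument name CACHE_DATE, the function names, and the Dockerfile lines in the comments are assumptions made for illustration. Referencing the variable inside the RUN line is a conservative choice, so the step depends on the value regardless of which builder is in use.

```shell
#!/bin/sh
# Dockerfile side, assumed for this sketch (CACHE_DATE is a hypothetical name):
#   ARG CACHE_DATE=not_a_date
#   RUN echo "cache bust: $CACHE_DATE" \
#    && curl -sL http://example.com/latest/archive.tar.gz --output archive.tar.gz

# Print a UTC timestamp that contains no spaces, so it stays one argument.
cache_bust_value() {
    date -u +%Y-%m-%dT%H:%M:%SZ
}

# Run docker build with a fresh CACHE_DATE value; any extra arguments
# (such as the build context) are passed through unchanged.
build_fresh() {
    docker build --build-arg "CACHE_DATE=$(cache_bust_value)" "$@"
}

# Possible usage (not executed here, it needs Docker):
#   build_fresh .
```

Steps before the ARG line can still come from the cache; only the steps after it are rebuilt when the value changes.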