I'm running the following kind of pipeline:
digestA: hugefileB hugefileC
	cat $^ > $@
	rm $^

hugefileB:
	touch $@

hugefileC:
	touch $@
The targets hugefileB and hugefileC are very big and take a long time to compute (and need the power of Make). But once digestA has been created, there is no need to keep its dependencies, so the recipe deletes them to free up disk space.
Now, if I invoke 'make' again, hugefileB and hugefileC will be rebuilt, even though digestA is already up to date.
Is there any way to tell 'make' not to rebuild the dependencies?
NOTE: I don't want to build the two dependencies inside the rules for 'digestA'.
Use the "intermediate files" feature of GNU Make:
Intermediate files are remade using their rules just like all other files. But intermediate files are treated differently in two ways.
The first difference is what happens if the intermediate file does not exist. If an ordinary file b does not exist, and make considers a target that depends on b, it invariably creates b and then updates the target from b. But if b is an intermediate file, then make can leave well enough alone. It won't bother updating b, or the ultimate target, unless some prerequisite of b is newer than that target or there is some other reason to update that target.
The second difference is that if make does create b in order to update something else, it deletes b later on after it is no longer needed. Therefore, an intermediate file which did not exist before make also does not exist after make. make reports the deletion to you by printing a rm -f command showing which file it is deleting.
Ordinarily, a file cannot be intermediate if it is mentioned in the makefile as a target or prerequisite. However, you can explicitly mark a file as intermediate by listing it as a prerequisite of the special target .INTERMEDIATE. This takes effect even if the file is mentioned explicitly in some other way.
You can prevent automatic deletion of an intermediate file by marking it as a secondary file. To do this, list it as a prerequisite of the special target .SECONDARY. When a file is secondary, make will not create the file merely because it does not already exist, but make does not automatically delete the file. Marking a file as secondary also marks it as intermediate.
So, adding the following line to the Makefile should be enough:
.INTERMEDIATE : hugefileB hugefileC
Invoking make for the first time:
$ make
touch hugefileB
touch hugefileC
cat hugefileB hugefileC > digestA
rm hugefileB hugefileC
And the next time:
$ make
make: `digestA' is up to date.
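The quoted passage also mentions .SECONDARY. As a minimal sketch, assuming you would rather keep the explicit deletion in the digestA recipe than rely on make's automatic cleanup, the same Makefile could mark the huge files as secondary instead. The touch commands are the placeholders from the question, not real build steps.

```makefile
digestA: hugefileB hugefileC
	cat $^ > $@
	rm -f $^

hugefileB:
	touch $@

hugefileC:
	touch $@

# Secondary files are not rebuilt merely because they are missing,
# and make never deletes them on its own; the rm above does that.
.SECONDARY: hugefileB hugefileC
```

With this variant a second make run should still report digestA as up to date, and the huge files are removed only where the recipe says so.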
If you mark hugefileB and hugefileC as intermediate files, you will get the behavior you want:
digestA: hugefileB hugefileC
	cat $^ > $@

hugefileB:
	touch $@

hugefileC:
	touch $@

.INTERMEDIATE: hugefileB hugefileC
For example:
$ gmake
touch hugefileB
touch hugefileC
cat hugefileB hugefileC > digestA
rm hugefileB hugefileC
$ gmake
gmake: `digestA' is up to date.
$ rm -f digestA
$ gmake
touch hugefileB
touch hugefileC
cat hugefileB hugefileC > digestA
rm hugefileB hugefileC
Note that you no longer need the explicit rm $^ command: gmake automatically deletes intermediate files at the end of the build.
I would recommend creating pseudo-cache files that are produced by the hugefileB and hugefileC targets. Then have digestA depend on those cache files, because you know they will not change again until you manually invoke the expensive targets.
The correct way is not to delete the files, as that removes the information that make uses to determine whether to rebuild them. Recreating them as empty files does not help, because make will then assume that the empty files are fully built.
If there is a way to merge digests, then you could create one digest from each of the huge files and keep it, while each huge file is automatically removed because it is an intermediate file.
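A minimal sketch of that layout, assuming digests can be merged by simple concatenation. The per-file targets digestB and digestC are hypothetical names, and cat and touch stand in for the real digest and build commands.

```makefile
# The final digest is merged from the per-file digests, which are kept.
digestA: digestB digestC
	cat $^ > $@

# One small digest per huge file.
digestB: hugefileB
	cat $< > $@

digestC: hugefileC
	cat $< > $@

hugefileB hugefileC:
	touch $@

# The huge files are deleted by make once the digests are built.
.INTERMEDIATE: hugefileB hugefileC
```

Because digestB and digestC remain on disk, make keeps the timestamps it needs to decide whether digestA is current, and the missing huge files do not trigger a rebuild on their own.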