Processing multiple files generated from a single input

Published 2019-08-20 08:40

Question:

I have a data file that is processed by a script to produce multiple output files. Each of these output files is then processed further. Which files are created depends on the contents of the input file, so I can't list them explicitly. I can't quite figure out how to refer to the various files that are generated in a makefile.

Currently, I have something like this:

final.out: *.out2
  merge_files final.out $(sort $^)

%.out2: %.out1
  convert_files $?

%.out1: data.in
  extract_data data.in

This fails with No rule to make target '*.out2', needed by 'final.out'. I assume this is because the .out2 files don't exist yet, so the wildcard isn't expanded the way I would like. I have tried the wildcard function, but that fails because the list of prerequisites ends up empty.
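The behavior can be reproduced in plain shell: an unmatched glob is passed through literally, which is exactly why make goes looking for a target literally named `*.out2` (make's own `wildcard` function instead returns an empty string). A minimal sketch; the file names are made up:

```shell
#!/bin/sh
# Minimal demo: globs only expand once matching files exist.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
echo *.out2            # no matches yet: prints the literal pattern '*.out2'
touch a.out2 b.out2
echo *.out2            # now prints: a.out2 b.out2
```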

Any pointers would be much appreciated.

Answer 1:

EDIT: fixed the list of prerequisites in second pass.

You apparently cannot compute the list of intermediate files before running the extract_data command. In that case, one solution is to run make twice: a first pass to generate the *.out1 files and a second pass to finish the job. You can use an empty dummy file to mark whether the extract_data command needs to be run again:

ifeq ($(FIRST_PASS_DONE),)
final.out: .dummy
    $(MAKE) FIRST_PASS_DONE=yes

.dummy: data.in
    extract_data $<
else
OUT1 := $(wildcard *.out1)
OUT2 := $(patsubst %.out1,%.out2,$(OUT1))

final.out: $(OUT2)
    merge_files $@ $(sort $^)

%.out2: %.out1
    convert_files $?
endif
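The control flow of the two passes can be sketched in plain shell, with stand-in commands rather than the real tools: the first pass creates the *.out1 files, and only then can a wildcard pick them up:

```shell
#!/bin/sh
# Two-pass sketch with stand-ins: touch plays the role of extract_data.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
: > data.in
# pass 1: "extract_data" creates files whose names depend on the input
touch part1.out1 part2.out1        # stand-in for: extract_data data.in
# pass 2: the wildcard now expands to the freshly created files
set -- *.out1
echo "second pass sees $# files: $*"
# prints: second pass sees 2 files: part1.out1 part2.out1
```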


Answer 2:

Unfortunately your question is missing some details I would immediately ask about if a developer presented this makefile for review:

  • does extract_data provide the list of files?
  • does convert_files convert one file or multiple? The example seems to imply that it converts multiple.
    • then I have to question the decision to break extract, convert, and merge into separate rules, as you will not benefit from a parallel build anyway

The following is the approach I would choose. I'm going to use a tar file as an example of an input file that results in multiple output files:

  • generate a makefile fragment for the sorted list of files
    • use the tar option v to print files while they are extracted
    • convert each line into a makefile variable assignment
  • include the fragment to define $(DATA_FILES)
    • if the fragment needs to be regenerated, make will restart after it has generated it
  • use static pattern rule for the conversion
  • use the converted file list as dependency for the final target

.PHONY: all
all: final.out

# extract files and create a sorted list of files in $(DATA_FILES)
Makefile.data_files: data.tar
    set -o pipefail; tar xvf $< | sort | sed 's/^/DATA_FILES += /' >$@

DATA_FILES :=
include Makefile.data_files

CONVERTED_FILES := $(DATA_FILES:%.out1=%.out2)

$(CONVERTED_FILES): %.out2: %.out1
    convert_files $< >$@

final.out: $(CONVERTED_FILES)
    merge_files final.out $^

UPDATE: if extract_data doesn't provide the list of files, you could modify my example like this. Of course, this relies on there being no other files matching *.out1 in your directory.

# extract files and create a sorted list of files in $(DATA_FILES)
Makefile.data_files: data.in
    set -o pipefail;   \
    extract_data $< && \
    (ls *.out1 | sort | sed 's/^/DATA_FILES += /') >$@
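The pipeline in that recipe can be exercised on its own. Here printf stands in for the `ls *.out1` listing (the file names are made up):

```shell
#!/bin/sh
# Sketch of the recipe's pipeline: turn a file listing into make assignments.
printf '%s\n' b.out1 a.out1 | sort | sed 's/^/DATA_FILES += /'
# prints:
# DATA_FILES += a.out1
# DATA_FILES += b.out1
```

Including the generated fragment then appends each file to $(DATA_FILES), one += line per file, in sorted order.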