The source code tree (R) for my dissertation research software reflects a traditional research workflow: "collect data -> prepare data -> analyze data -> collect results -> publish results". I use make to establish and maintain the workflow (most of the project's sub-directories contain Makefile files).
However, I frequently need to execute individual parts of my workflow via particular Makefile targets in the project's sub-directories (not via the top-level Makefile). This creates the problem of setting up Makefile rules to maintain dependencies between targets from different parts of the workflow, in other words, between targets in Makefile files located in different sub-directories.
The following represents the setup for my dissertation project:
+-- diss-floss (Project's root)
    |-- import (data collection)
    |-- cache (R data objects, representing different data sources, in sub-directories)
    |-+ prepare (data cleaning, transformation, merging and sampling)
        |-- R modules, including 'transform.R'
    |-+ analysis (data analyses, including exploratory data analysis (EDA))
        |-- R modules, including 'eda.R'
    |-+ results (results of the analyses, in sub-directories)
        |-+ eda (*.svg, *.pdf, ...)
        |-- ...
    |-- present (auto-generated presentation for defense)
Snippets of targets from some of my Makefile files:
"~/diss-floss/Makefile" (almost full):
# Major variable definitions
PROJECT="diss-floss"
HOME_DIR="~/diss-floss"
REPORT={$(PROJECT)-slides}
COLLECTION_DIR=import
PREPARATION_DIR=prepare
ANALYSIS_DIR=analysis
RESULTS_DIR=results
PRESENTATION_DIR=present
RSCRIPT=Rscript

# Targets and rules
all: rprofile collection preparation analysis results presentation

rprofile:
	R CMD BATCH ./.Rprofile

collection:
	cd $(COLLECTION_DIR) && $(MAKE)

preparation: collection
	cd $(PREPARATION_DIR) && $(MAKE)

analysis: preparation
	cd $(ANALYSIS_DIR) && $(MAKE)

results: analysis
	cd $(RESULTS_DIR) && $(MAKE)

presentation: results
	cd $(PRESENTATION_DIR) && $(MAKE)

## Phony targets and rules (for commands that do not produce files)
#.html
.PHONY: demo clean

# run demo presentation slides
demo: presentation
# knitr(Markdown) => HTML page
# HTML5 presentation via RStudio/RPubs or Slidify
# OR
# Shiny app

# remove intermediate files
clean:
	rm -f tmp*.bz2 *.Rdata
"~/diss-floss/import/Makefile":
importFLOSSmole: getFLOSSmoleDataXML.R
	@$(RSCRIPT) $(R_OPTS) $<
...
"~/diss-floss/prepare/Makefile":
transform: transform.R
	$(RSCRIPT) $(R_OPTS) $<
...
"~/diss-floss/analysis/Makefile":
eda: eda.R
	@$(RSCRIPT) $(R_OPTS) $<
Currently, I am concerned about creating the following dependency:
Data collected by making a target from the Makefile in import always needs to be transformed by making the corresponding target from the Makefile in prepare before being analyzed via, for example, eda.R. If I manually run make in import and then, forgetting about the transformation, run make eda in analysis, the analysis runs against data that has not been transformed. Therefore, my question is:
How could I use features of the make utility (in the simplest way possible) to establish and maintain rules for dependencies between targets from Makefile files in different directories?
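To make the intended dependency concrete, here is a minimal sketch of the kind of rule I imagine for analysis/Makefile. It is not part of my current setup and may well not be the best approach. The stamp files (import.done, transform.done) and their location in ../cache are hypothetical, and the sketch assumes that the recipes in import and prepare each finish by touching their own stamp file, so that a manual make in import is also recorded.

```makefile
# Hypothetical sketch for analysis/Makefile (recipe lines start with a tab).
# Assumption: import/Makefile touches $(IMPORT_STAMP) and prepare/Makefile
# touches $(TRANSFORM_STAMP) at the end of their own recipes.
CACHE_DIR       = ../cache
IMPORT_STAMP    = $(CACHE_DIR)/import.done
TRANSFORM_STAMP = $(CACHE_DIR)/transform.done
RSCRIPT         = Rscript

.PHONY: eda

# eda runs only after the transformation is at least as new as the import
eda: eda.R $(TRANSFORM_STAMP)
	@$(RSCRIPT) $(R_OPTS) $<

# Re-run the transformation whenever the import stamp is newer
$(TRANSFORM_STAMP): $(IMPORT_STAMP)
	$(MAKE) -C ../prepare transform

# Run the import if it has never been recorded
$(IMPORT_STAMP):
	$(MAKE) -C ../import importFLOSSmole
```

With something like this, running make eda in analysis after a fresh make in import would first trigger transform in prepare, because the import stamp would be newer than the transform stamp.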