What would be a proper way to extract function calls and the lines on which they occur in one or more R scripts? Is there a base parsing function or a package that allows me to do this, or should I build a solution with regular expressions?
For example:
function_calls("project1/exploratory_analysis.R")
should output a dataframe like:
##   function line               filename
## 1   tapply   35 exploratory_analysis.R
## 2    qplot   80 exploratory_analysis.R
My ultimate goal is to build a reverse index of function calls and loaded packages as used in one or more R scripts, for educational and reference purposes (e.g. as a repository of usage examples). For example:
--------------------------------------------------------
| function | source_file | line | package |
|:--------:|:----------------------:|:-----:|:--------:|
| tapply | exploratory_analysis.R | 35 | base |
| qplot | exploratory_analysis.R | 80 | ggplot2 |
| cor | regression.R | 15 | stats |
| cor | regression.R | 27 | stats |
| tapply | regression.R | 12 | base |
| fromJSON | load_dataset.R | 5 | jsonlite |
| %>% | transformation.R | 10 | magrittr |
--------------------------------------------------------
One could use regular expressions to extract function calls, but I was wondering whether there is a parsing, static code analysis or reflection facility that supports this task.
I suppose combining parse(), substitute(), getParseData(), deparse(), srcfile() (or maybe functions in the mvbutils or codetools packages) would make it possible, but I'm not familiar enough with their usage to figure it out on my own.
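To make the idea concrete, here is a rough sketch of the kind of approach I have in mind, built only on parse() and getParseData(). It assumes that the parser's token types SYMBOL_FUNCTION_CALL (names in call position) and SPECIAL (infix operators such as %>%) are enough to identify calls. The name function_calls() is just my hypothetical helper from the example above, and the sketch does not attempt to fill in the package column.

```r
# Hypothetical helper: list the function calls found in one R script.
# Assumption: the file parses without errors and contains at least one expression.
function_calls <- function(file) {
  # keep.source = TRUE makes the parser retain the data getParseData() needs
  exprs <- parse(file, keep.source = TRUE)
  pd <- utils::getParseData(exprs)

  # SYMBOL_FUNCTION_CALL: a name used in call position, e.g. tapply in tapply(...)
  # SPECIAL: user-defined infix operators such as %>% or %in%
  calls <- pd[pd$token %in% c("SYMBOL_FUNCTION_CALL", "SPECIAL"), ]

  data.frame(
    `function` = calls$text,
    line = calls$line1,
    filename = basename(file),
    check.names = FALSE,      # keep the reserved word "function" as a column name
    stringsAsFactors = FALSE
  )
}
```

For several scripts, the per-file results could presumably be stacked with something like do.call(rbind, lapply(files, function_calls)), where files is a character vector of paths. Functions that are passed around as values rather than called (e.g. the mean in sapply(x, mean)) would not be picked up by this token-based approach.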