I would like to create an automated knitr report that will produce histograms for each numeric field within my dataframe. My goal is to do this without having to specify the actual fields (this dataset contains over 70 and I would also like to reuse the script).
I've tried a few different approaches:
- Saving the plot to an object, `p`, and then calling `p` after the loop
  - This only plots the final plot
- Creating an array of plots, `PLOTS <- NULL`, and appending the plots within the loop with `PLOTS <- append(PLOTS, p)`
  - Accessing these plots outside the loop did not work at all
- Even tried saving each to a `.png` file, but I would rather not have to deal with the overhead of saving and then re-accessing each file
I'm afraid the intricacies of the plot devices are escaping me.
Question
How can I make the following chunk output each plot within the loop to the report? Currently, the best I can achieve is output of the final plot produced by saving it to an object and calling that object outside of the loop.
R Markdown chunk using `knitr` in RStudio:
```{r plotNumeric, echo=TRUE, fig.height=3}
suppressPackageStartupMessages(library(ggplot2))
FIELDS <- names(df)[sapply(df, class)=="numeric"]
for (field in FIELDS){
qplot(df[,field], main=field)
}
```
From this point, I hope to customize the plots further.
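Wrap the `qplot` call in `print`.
`knitr` will do that for you if the `qplot` call is outside a loop, but (at least in the version I have installed) it does not detect this inside a loop, which is consistent with the behaviour of the R command line.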
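For illustration, here is a minimal sketch of the question's loop with the explicit `print()` added. The example data frame is made up, and I have swapped `qplot` for `ggplot()` with the `.data` pronoun, which assumes a reasonably recent ggplot2; the same `print()` wrapping applies to `qplot` as well.

```r
library(ggplot2)

# Made-up example data; `df` stands in for the asker's data frame.
df <- data.frame(
  height = c(150, 160, 165, 170, 180),
  weight = c(50, 60, 65, 72, 90),
  group  = c("a", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
)

# is.numeric() also picks up integer columns, which class(x) == "numeric" would miss.
FIELDS <- names(df)[sapply(df, is.numeric)]

for (field in FIELDS) {
  p <- ggplot(df, aes(x = .data[[field]])) +
    geom_histogram(bins = 10) +
    ggtitle(field)
  print(p)  # explicit print() is what draws the plot inside a loop
}
```

Placed inside a knitr chunk, each `print(p)` call should produce one figure in the report, in loop order.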
I am using child Rmd files in Markdown; this also works in Sweave.
In the Rmd, use the following snippet:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
out = c(out, knit_child('da-numeric.Rmd'))
}
```
`da-numeric.Rmd` looks like this:
Variable `r num_var_names[i]`
------------------------------------
Missing : `r sum(is.na(data[[num_var_names[i]]]))`
Minimum value : `r min(na.omit(data[[num_var_names[i]]]))`
Percentile 1 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2]`
Percentile 99 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]`
Maximum value : `r max(na.omit(data[[num_var_names[i]]]))`
```{r results='asis', comment="" }
warn_extreme_values=3
d1 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[1]
d99 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[101] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]
if(d1){cat('Warning : Suspect extreme values in left tail')}
if(d99){cat('Warning : Suspect extreme values in right tail')}
```
``` {r eval=TRUE, fig.width=6, fig.height=2}
library(ggplot2)
v <- num_var_names[i]
hp <- ggplot(na.omit(data), aes_string(x=v)) + geom_histogram( colour="grey", fill="grey", binwidth=diff(range(na.omit(data[[v]]))/100))
hp + theme(axis.title.x = element_blank(),axis.text.x = element_text(size=10)) + theme(axis.title.y = element_blank(),axis.text.y = element_text(size=10))
```
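The summary lines and the warning chunk above recompute the same quantiles several times. As a rough sketch, the same figures could be computed once per variable with a small helper. The name `summarise_numeric` and its return fields are hypothetical and not part of the original answer.

```r
# Hypothetical helper: summarise one numeric vector in a single pass.
summarise_numeric <- function(x, warn_extreme_values = 3) {
  n_missing <- sum(is.na(x))
  x <- x[!is.na(x)]
  # Minimum, 1st percentile, 99th percentile and maximum.
  q <- quantile(x, probs = c(0, 0.01, 0.99, 1), names = FALSE)
  list(
    missing = n_missing,
    min = q[1],
    p1 = q[2],
    p99 = q[3],
    max = q[4],
    # Same checks as the warning chunk: compare each tail to its neighbour.
    left_tail_suspect = q[2] > warn_extreme_values * q[1],
    right_tail_suspect = q[4] > warn_extreme_values * q[3]
  )
}

# Example: 99 ones, one large outlier, one missing value.
s <- summarise_numeric(c(rep(1, 99), 1000, NA))
```

In the child document, the inline expressions could then read from a single `s <- summarise_numeric(data[[num_var_names[i]]])` computed in a setup chunk.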
Check out my dataMineR package on GitHub: https://github.com/hugokoopmans/dataMineR
To add a quick note: I somehow googled the same question and landed on this page. Now, in 2018, just use `print()` in the loop.
for (i in 1:n){
...
f <- ggplot(.......)
print(f)
}
As an addition to Hugo's excellent answer, I believe that in 2016 you need to include a print command as well:
```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
  out = c(out, knit_child('da-numeric.Rmd'))
}
```
`r paste(out, collapse = '\n')`