I have a data frame that looks like this:
#df
ID DRUG FED AUC0t Tmax Cmax
1 1 0 100 5 20
2 1 1 200 6 25
3 0 1 NA 2 30
4 0 0 150 6 65
And so on. I want to summarize some statistics on AUC, Tmax and Cmax by drug (DRUG) and fed status (FED). I use dplyr. For example, for the AUC:
CI90lo <- function(x) quantile(x, probs=0.05, na.rm=TRUE)
CI90hi <- function(x) quantile(x, probs=0.95, na.rm=TRUE)
summary <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE),
            low  = CI90lo(AUC0t),
            high = CI90hi(AUC0t),
            min  = min(AUC0t, na.rm = TRUE),
            max  = max(AUC0t, na.rm = TRUE),
            sd   = sd(AUC0t, na.rm = TRUE))
However, the output is not grouped by DRUG and FED. It gives only one line containing the statistics for all the data, not broken down by DRUG and FED.
Any idea why? And how can I make it do the right thing?
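I believe you've loaded plyr after dplyr, which is why you are getting an overall summary instead of a grouped summary.
With plyr loaded last, summarize() resolves to plyr::summarize, which ignores the grouping and returns a single row.
Detach plyr (or restart R and load only dplyr), run the same code again, and you get the grouped summary.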
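A minimal sketch that reproduces this, assuming both plyr and dplyr are installed. The data frame is just the four sample rows from the question, and only the mean is computed to keep it short:

```r
library(dplyr)
library(plyr)  # attached after dplyr, so plyr's summarize() masks dplyr's

# Sample rows from the question
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150),
  Tmax  = c(5, 6, 2, 6),
  Cmax  = c(20, 25, 30, 65)
)

# summarize() is plyr::summarize here: the grouping is ignored
masked <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE))

# Take plyr off the search path; summarize() is dplyr's again
detach("package:plyr")

grouped <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE))

nrow(masked)   # rows in the overall summary
nrow(grouped)  # rows in the grouped summary
```

If you really need both packages, loading plyr first and dplyr second avoids the masking in this direction.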
Try sqldf. It is an easy-to-learn way to group data if you already know SQL, and it is helpful for all kinds of grouped summaries.
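A rough sketch of what that could look like, assuming the sqldf package is installed and using the four sample rows from the question. The column aliases (auc_mean, auc_min, auc_max) are made up for illustration. Quantiles are left out because plain SQLite has no built-in percentile aggregate:

```r
library(sqldf)

# Sample rows from the question
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150),
  Tmax  = c(5, 6, 2, 6),
  Cmax  = c(20, 25, 30, 65)
)

# NA becomes NULL in SQLite, and SQL aggregates skip NULLs,
# which plays the role of na.rm = TRUE
auc_sql <- sqldf("
  SELECT DRUG, FED,
         AVG(AUC0t) AS auc_mean,
         MIN(AUC0t) AS auc_min,
         MAX(AUC0t) AS auc_max
  FROM df
  GROUP BY DRUG, FED
")

auc_sql
```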
Or you could consider using data.table.
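Something along these lines should work, assuming the data.table package is installed. It is only a sketch on the four sample rows from the question, with the 5% and 95% quantiles standing in for the question's CI90lo and CI90hi helpers:

```r
library(data.table)

# Sample rows from the question
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150),
  Tmax  = c(5, 6, 2, 6),
  Cmax  = c(20, 25, 30, 65)
)

dt <- as.data.table(df)

# One row per DRUG/FED combination; unname() drops the "5%"/"95%" labels
auc_dt <- dt[, .(
  mean = mean(AUC0t, na.rm = TRUE),
  low  = unname(quantile(AUC0t, probs = 0.05, na.rm = TRUE)),
  high = unname(quantile(AUC0t, probs = 0.95, na.rm = TRUE)),
  sd   = sd(AUC0t, na.rm = TRUE)
), by = .(DRUG, FED)]

auc_dt
```

Since data.table does its grouping inside `[`, it is not affected by whether plyr or dplyr is loaded last.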
A variant of aosmith's answer that might help some folks out: tell R explicitly to call dplyr's functions by prefixing them with dplyr::. It's a good trick when one package interferes with another.
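For example, a sketch using the question's helpers on the four sample rows (min and max are omitted here only for brevity):

```r
library(dplyr)

# Sample rows from the question
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150),
  Tmax  = c(5, 6, 2, 6),
  Cmax  = c(20, 25, 30, 65)
)

CI90lo <- function(x) quantile(x, probs = 0.05, na.rm = TRUE)
CI90hi <- function(x) quantile(x, probs = 0.95, na.rm = TRUE)

# The dplyr:: prefix picks dplyr's versions even if plyr is attached later
auc_summary <- df %>%
  dplyr::group_by(DRUG, FED) %>%
  dplyr::summarize(mean = mean(AUC0t, na.rm = TRUE),
                   low  = CI90lo(AUC0t),
                   high = CI90hi(AUC0t),
                   sd   = sd(AUC0t, na.rm = TRUE))

auc_summary
```

The same prefix works for the other names plyr and dplyr share, such as mutate, arrange and rename.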