I am trying to use the summarize function within dplyr to calculate summary statistics using a two argument function that passes a table and field name from a connected database. Unfortunately as soon as I wrap the summarize function with another function the results aren't correct. The end table is a dataframe that does not iterate through each row. I'll show the input/output below:
Summary Statistics Function library(dplyr)
data<-iris
data<- group_by(.data = data,Species)
SummaryStatistics <- function(table, field){
table %>%
summarise(count = n(),
min = min(table[[field]], na.rm = T),
mean = mean(table[[field]], na.rm = T, trim=0.05),
median = median(table[[field]], na.rm = T))
}
SummaryStatistics(data, "Sepal.Length")
Output Table--Incorrect, it's just repeating the same calculation
Species count min mean median
1 setosa 50 4.3 5.820588 5.8
2 versicolor 50 4.3 5.820588 5.8
3 virginica 50 4.3 5.820588 5.8
Correct Table/Desired Outcome--This is what the table should look like. When I run the summarize function outsize of the wrapper function, this is what it produces.
Species count min mean median
1 setosa 50 4.3 5.002174 5.0
2 versicolor 50 4.9 5.934783 5.9
3 virginica 50 4.9 6.593478 6.5
I hope this is easy to understand. I just can't grasp as to why the summary statistics work perfectly outside of the wrapper function, but as soon as I pass arguments to it, it will calculate the same thing for each row. Any help would be greatly appreciated.
Thanks, Kev