I am trying to compute the means of data frames stored in a list using an apply-family function such as lapply or sapply:
df_list <- list(cars, mtcars)
sapply(df_list, mean)
The above code doesn't seem to work. However, when I changed it to:
df_list <- c(cars, mtcars)
sapply(df_list, mean)
The output had the means of all the variables of both data frames.
Is there a way to compute the means using the first approach?
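A quick sketch of what each call actually sees may help here. It assumes a current R version, where mean() on a whole data frame returns NA with a warning, while c() flattens the two data frames into a single list of 13 columns. Using colMeans is one base R way to keep the list() approach; it assumes every column is numeric, which holds for cars and mtcars but not for data frames in general.

```r
df_list <- list(cars, mtcars)

# Each element is a whole data frame, so mean() receives a data frame, not a column.
whole <- suppressWarnings(sapply(df_list, mean))
whole           # NA NA

# c() flattens both data frames into a single list of columns.
flat <- c(cars, mtcars)
length(flat)    # 13

# Column means per data frame, keeping the list() structure.
# Assumes all columns are numeric.
per_df <- lapply(df_list, colMeans)
per_df[[1]]     # speed 15.40, dist 42.98
```

For data frames with non-numeric columns, the answers in this thread filter with is.numeric first.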
Use the purrr library to achieve this; it is much simpler:
library(purrr)
map(df_list, ~map_if(., is.numeric, mean))
If you want a data frame to be returned, then:
map_df(df_list, ~map_if(., is.numeric, mean))
This answer comes from the question "why does map_if() not work within a list"; credit should go to @Axeman.
In base R, you can use rapply to calculate the means of the variables contained in a list of data.frames.
# data
df_list <- list(cars, mtcars)
The simplest approach is to run rapply with two arguments: the list of data.frames and the function. The function, function(x) if(is.numeric(x)) mean(x), checks whether the variable is numeric and, if so, returns its mean; otherwise it returns NULL.
# returns a vector of means
rapply(df_list, function(x) if(is.numeric(x)) mean(x))
This output destroys the relationship between the variables and their data.frames. If desired, we can instead return the values in a structure that mirrors the original object: a list of length 2 whose inner lists have lengths 2 and 11.
rapply(df_list, function(x) if(is.numeric(x)) mean(x), how="list")
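To check the shape described above, a small sketch can inspect the nested result directly. The variable name nested is introduced here only for illustration.

```r
df_list <- list(cars, mtcars)

# Same call as above, stored so the structure can be inspected.
nested <- rapply(df_list, function(x) if (is.numeric(x)) mean(x), how = "list")

length(nested)        # 2: one inner list per data frame
lengths(nested)       # 2 11: one entry per variable
names(nested[[1]])    # "speed" "dist"
```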
The resulting structure is probably more complicated than desired. For my taste,
lapply(rapply(df_list, function(x) if(is.numeric(x)) mean(x), how="list"), unlist)
[[1]]
speed  dist
15.40 42.98

[[2]]
       mpg        cyl       disp         hp       drat         wt       qsec
 20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750
        vs         am       gear       carb
  0.437500   0.406250   3.687500   2.812500
results in a nice balance: a list of length 2 in which each element is a named vector of means.
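If the link between each result and its data frame matters, one option is to name the list elements up front. This is a minimal sketch, not part of the approach above: the element names cars and mtcars and the helper num_mean are chosen here for illustration.

```r
# Hypothetical variation: a named list instead of list(cars, mtcars).
named_list <- list(cars = cars, mtcars = mtcars)

# Helper (illustrative name): mean for numeric variables, NULL otherwise.
num_mean <- function(x) if (is.numeric(x)) mean(x)

# Flat vector: unlist() prefixes each variable with its data frame name.
flat_means <- rapply(named_list, num_mean)
names(flat_means)[1:2]   # "cars.speed" "cars.dist"

# List of named vectors, now addressable by data frame name.
means_by_df <- lapply(rapply(named_list, num_mean, how = "list"), unlist)
means_by_df$cars         # speed 15.40, dist 42.98
```

With names in place, even the flat vector keeps track of which data frame each mean came from.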