Using mapply with mean function on a matrix

Posted 2019-06-27 05:54

Question:

I wish to calculate the mean of adjacent values in each column (or row) of a matrix (e.g. mean of [1,1] and [2,1], [2,1] and [3,1], [3,1] and [4,1]) and to apply this across all columns.

I have tried to use the mapply function (to avoid using a for loop) to calculate the mean of the first 2 values in each column, and I plan to apply this to the whole matrix row by row. However, mapply seems to work when I sum the values but not when I use the mean function.

See example below:

x <- matrix(c(NA,rnorm(28),NA), nrow=6, ncol=5)
print(x)
       [,1]       [,2]       [,3]       [,4]       [,5]
[1,]          NA -0.6557176  1.7741320  0.3667700 -0.5548408
[2,]  0.14001643  0.2521062 -0.1295084 -0.4272368  0.7598425
[3,]  0.32123196  0.5736409  0.8618268  2.1535191  0.4686728
[4,]  0.06573949 -1.2101965 -0.4308219 -0.2624877 -0.3751350
[5,] -0.66247996  1.2743463  1.6044236  1.2004990 -0.3283678
[6,]  1.05005260  1.2264607  3.2347421 -0.8113528         NA

mapply(sum, x[1,], x[2,])
[1]          NA -0.40361136  1.64462358 -0.06046682  0.20500169
# gives the sum of the input of rows 1 and 2 for each column, as expected

mapply(mean, x[1,], x[2,])
[1]         NA -0.6557176  1.7741320  0.3667700 -0.5548408
# gives the actual values across row 1

When using the mean function, the output appears to be the values of the first row. I suspect the problem is in indexing the correct input values.

Answer 1:

You can use:

library(zoo)
# Rolling mean over a window of 2, applied to each column (MARGIN = 2)
apply(x, 2, function(col) rollapply(col, 2, mean))


Answer 2:

I think this will do what you want:

(head(x, -1L) + tail(x, -1L)) / 2

Produces (using your data with set.seed(1)):

           [,1]       [,2]        [,3]      [,4]       [,5]
[2,]         NA -0.1665197 -0.11569867 0.8825287 -0.6847630
[3,] -0.2214052  0.6128769 -1.41797023 0.7075613  0.2818485
[4,] -0.3259926  0.6570530 -0.54488448 0.7564393 -0.1059621
[5,]  0.3798261  0.1351965  0.53999865 0.8505568 -0.8132739
[6,]  0.9623943  0.6031964 -0.03056194 0.4283506         NA

tail(x, -1L) gives a matrix with every row but the first, so its first row is the 2nd row of the original, its 2nd row is the 3rd, and so on. head(x, -1L) gives the original matrix minus the last row. Adding the two is equivalent to adding the 2nd row to the 1st, the 3rd to the 2nd, etc. Finally, dividing by two gives the average of each adjacent pair.
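
As a small check of this shift-and-add idea, here is a sketch on a tiny matrix whose values are easy to verify by hand. The matrix m and the names col_means and row_means are made up for illustration and are not part of the original question. The second expression is one possible way to handle the row direction mentioned in the question, by dropping the last and first columns instead of rows.

```r
# Toy 4 x 2 matrix (hypothetical data), filled column by column
m <- matrix(c(1, 3, 5, 7,
              10, 20, 30, 40), nrow = 4, ncol = 2)

# Means of vertically adjacent values: rows 1-2, 2-3, 3-4 within each column
col_means <- (head(m, -1L) + tail(m, -1L)) / 2
print(col_means)

# Means of horizontally adjacent values: drop the last / first column instead.
# drop = FALSE keeps the result a matrix even when only one column remains.
row_means <- (m[, -ncol(m), drop = FALSE] + m[, -1L, drop = FALSE]) / 2
print(row_means)
```

The result has one row (or column) fewer than the input, because n values have only n - 1 adjacent pairs.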

Your approach fails because mean only averages its first argument, unlike sum, which sums all its arguments:

> args(mean)
function (x, ...) 
NULL
> args(sum)
function (..., na.rm = FALSE) 
NULL    

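sum sums all the arguments in ..., but mean only takes the mean of x, so the second row you pass to mean with mapply is not averaged. It ends up being matched to the trim argument of the default method (see ?mean), and the mean of the single remaining value, trimmed or not, is just that value, which is why row 1 comes back unchanged.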
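
The difference can be seen directly with scalar calls. The following is a minimal sketch; the anonymous wrapper function and the vectors r1 and r2 are illustrative stand-ins for x[1, ] and x[2, ], not code from the original answer.

```r
# sum() adds every argument passed through ...
sum(1, 3)          # 4

# mean() averages only its first argument; the 3 is not part of the data
mean(1, 3)         # 1

# Combining the values into one vector first gives the intended mean
mean(c(1, 3))      # 2

# Hypothetical stand-ins for two rows of the matrix
r1 <- c(1, 2, 3)
r2 <- c(3, 6, 9)

# Wrap mean() so that each pair is averaged as a single vector
pair_means <- mapply(function(a, b) mean(c(a, b)), r1, r2)
print(pair_means)  # 2 4 6

# For this pairwise case the vectorised form gives the same result
print((r1 + r2) / 2)
```

If you want to keep the mapply approach, wrapping mean like this is one way to do it, although for a plain pairwise average the vectorised arithmetic is shorter and avoids the per-element function calls.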