I have the following matrix of depth and temperature data (855 rows, 2 col) and would like to take the mean of every 3 rows within each column. For example:
[1,] -6.7 18.91
[2,] -5.4 18.91
[3,] -4.0 18.59
[4,] -6.7 20.37
[5,] -6.7 20.05
[6,] -2.7 20.21
[7,] -4.0 21.03
[8,] -5.4 20.70
[9,] -4.0 20.87
[10,] -2.7 21.37
[11,] -2.7 21.37
[12,] -2.7 21.37
mean(data[1:3,1])
mean(data[4:6,1])
for the entire matrix. How can I accomplish this without manually writing the code for the mean of every 3 rows? Any ideas or suggestions are greatly appreciated.
Try to use tapply and apply:
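The code for this answer is not shown, so here is a minimal sketch of what a tapply/apply approach might look like. It assumes the matrix is called data (as in the question) and uses only the 12 sample rows; the grouping vector grp is a name introduced here for illustration.

```r
# Sample rows from the question: depth in column 1, temperature in column 2
data <- matrix(c(-6.7, -5.4, -4.0, -6.7, -6.7, -2.7,
                 -4.0, -5.4, -4.0, -2.7, -2.7, -2.7,
                 18.91, 18.91, 18.59, 20.37, 20.05, 20.21,
                 21.03, 20.70, 20.87, 21.37, 21.37, 21.37),
               ncol = 2)

# Group index: rows 1-3 -> 1, rows 4-6 -> 2, and so on
grp <- (seq_len(nrow(data)) - 1) %/% 3 + 1

# For each column, take the mean within each group of three rows
block_means <- apply(data, 2, function(col) tapply(col, grp, mean))
block_means
```

The result has one row per block of three and one column per original column. If the row count is not a multiple of 3, the last group is simply averaged over fewer rows.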
I really like the 'rollapply' function for this, because its syntax closely matches what you're trying to do. However, I thought I would contribute, for posterity, how you would approach this problem with the 'plyr' package.
Note: You could do this all in one statement, but I've broken it up to make it easier to understand.
Step 1: Set up your data to have a sorting variable.
I've just added a column 'group' that assigns a group number to every three rows. The two matrix columns are now 'X1' and 'X2' by default.
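Step 1 could be sketched roughly as follows. This assumes the matrix is called data and has no column names (which is why data.frame() names the columns 'X1' and 'X2'); the name df and the 12 sample rows are used here only for illustration.

```r
# Sample rows from the question, stored as an unnamed two-column matrix
data <- matrix(c(-6.7, -5.4, -4.0, -6.7, -6.7, -2.7,
                 -4.0, -5.4, -4.0, -2.7, -2.7, -2.7,
                 18.91, 18.91, 18.59, 20.37, 20.05, 20.21,
                 21.03, 20.70, 20.87, 21.37, 21.37, 21.37),
               ncol = 2)

# Converting an unnamed matrix gives default column names X1 and X2
df <- data.frame(data)

# Assign the same group number to each consecutive block of three rows
df$group <- (seq_len(nrow(df)) - 1) %/% 3 + 1
head(df)
```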
Step 2: Run the 'colMeans' function for each group.
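A sketch of Step 2, assuming the plyr package is installed and that the data frame was built as described in Step 1 (rebuilt here from the 12 sample rows so the example stands alone). The names df and group_means are illustrative, and the original answer may have written the call differently.

```r
library(plyr)

# Rebuild the Step 1 data frame: columns X1, X2 plus a group index
data <- matrix(c(-6.7, -5.4, -4.0, -6.7, -6.7, -2.7,
                 -4.0, -5.4, -4.0, -2.7, -2.7, -2.7,
                 18.91, 18.91, 18.59, 20.37, 20.05, 20.21,
                 21.03, 20.70, 20.87, 21.37, 21.37, 21.37),
               ncol = 2)
df <- data.frame(data)
df$group <- (seq_len(nrow(df)) - 1) %/% 3 + 1

# Split by group, take column means of X1 and X2, and recombine
group_means <- ddply(df, .(group), function(d) colMeans(d[, c("X1", "X2")]))
group_means
```

The result should be a data frame with one row per group and the columns 'group', 'X1' and 'X2'.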
For this specific question, I think the 'plyr' package is sub-optimal, but it's worth noting the method for future reference. The 'apply' family and 'rollapply' functions work best with continuity and consistency in the data. In applications where you want more flexibility, the 'plyr' family functions are useful to have in your toolbox.
Use the rollapply function from the zoo package. See ?rollapply for more details. Example:
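The original example is not shown, so the following is a sketch of how such a call might look for the first column only. It assumes the zoo package is installed and that the matrix is called data, built here from the 12 sample rows. Setting by = 3 makes the window advance three rows at a time, so the blocks do not overlap.

```r
library(zoo)

# Sample rows from the question: depth in column 1, temperature in column 2
data <- matrix(c(-6.7, -5.4, -4.0, -6.7, -6.7, -2.7,
                 -4.0, -5.4, -4.0, -2.7, -2.7, -2.7,
                 18.91, 18.91, 18.59, 20.37, 20.05, 20.21,
                 21.03, 20.70, 20.87, 21.37, 21.37, 21.37),
               ncol = 2)

# Mean over windows of 3 rows, moving 3 rows each step (non-overlapping blocks)
depth_means <- rollapply(data[, 1], width = 3, FUN = mean, by = 3)
depth_means
```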
If you want the mean for all columns in your matrix, then just add by.column=TRUE in the rollapply call:
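A sketch of the whole-matrix version, under the same assumptions (zoo installed, matrix called data, 12 sample rows). As far as I know, by.column already defaults to TRUE in rollapply, so writing it out mainly makes the intent explicit.

```r
library(zoo)

# Sample rows from the question: depth in column 1, temperature in column 2
data <- matrix(c(-6.7, -5.4, -4.0, -6.7, -6.7, -2.7,
                 -4.0, -5.4, -4.0, -2.7, -2.7, -2.7,
                 18.91, 18.91, 18.59, 20.37, 20.05, 20.21,
                 21.03, 20.70, 20.87, 21.37, 21.37, 21.37),
               ncol = 2)

# Apply the 3-row block mean to every column separately
all_means <- rollapply(data, width = 3, FUN = mean, by = 3, by.column = TRUE)
all_means
```

For the full 855-row matrix this should give 285 rows, one per block of three.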