-->

Average of numbers in two consecutive sequences u

Posted 2019-09-05 19:26

Question:

I have a 13867 x 2 array stored in a variable called "data". I want to do the following in Matlab:

  • Average (row 1 -> row 21) ; i.e. take the average of first 21 elements
  • Average (row 22 -> row 43) ; i.e. take the average of next 22 elements
  • Average (row 44 -> row 64); i.e. take the average of the next 21 elements
  • Average (row 65 -> row 86); i.e. take the average of the next 22 elements
  • Repeat the process until the end of the matrix, so that the last average covers the final 21 elements, from row 13847 to row 13867.

I want the averages for column 1 and also for column 2. I have managed to do this in Excel, but it is cumbersome (I had to create an index for the rows first). I think we should end up with 645 averages.
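
The expected count of 645 can be checked with a little arithmetic. The sketch below is only a sanity check; the variable names (nRows, fullPairs, leftover, nAverages) are illustrative and not from the original post.

```matlab
nRows = 13867;           % number of rows in data
m = 21;  n = 22;         % alternating block lengths

fullPairs = floor(nRows/(m+n));          % complete 21+22 row pairs
leftover  = nRows - fullPairs*(m+n);     % rows remaining after the last full pair

% Each full pair gives two averages; a leftover of up to m rows adds one
% more block, and a leftover longer than m rows adds two.
nAverages = 2*fullPairs + (leftover > 0) + (leftover > m);
```

Here fullPairs is 322 and leftover is 21, so the final block is a complete 21-row block (rows 13847 to 13867) and nAverages is 645.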

Answer 1:

The key is to insert NaN rows so that the shorter blocks (21 rows) have the same size as the longer blocks (22 rows). This is easy with the insertrows function from the MATLAB File Exchange:

n = 21;
m = 22;

dataPad = insertrows(data, nan(1,size(data,2)), n:(n+m):size(data,1));

After that, row 22 will be [NaN, NaN], row 66 will be [NaN, NaN], and so on. Now the mean is easy to calculate. Simply reshape this matrix so that all values which should be averaged together are in the same column. Finally, use the nanmean function (a mean that simply ignores NaN values) to get the result.

It is not 100% clear to me whether the result should be 645x2 or 645x1, i.e. whether the two columns should also be averaged together or kept separate. Here are the corresponding reshape calls for both cases:

1. Averaging across the two columns as well (this gives a 1x645 row vector):

dataPadRearr = reshape(dataPad.',m*size(data,2),[]);
result = nanmean(dataPadRearr,1);

2. Keeping the two columns separate (this gives a 645x2 result):

dataPadRearr = reshape(dataPad,m,[],size(data,2));
result = squeeze(nanmean(dataPadRearr,1));

Note that here, you'll need a final squeeze, as the result of nanmean would be of dimension 1x645x2, which is not very practical. squeeze just removes this singleton dimension.
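
If installing insertrows is not an option, the same padding can be built by hand. The following is a minimal sketch on a small toy matrix (block lengths 2 and 3 instead of 21 and 22), assuming, as in the question, that the data ends exactly on a block boundary. The names insertAfter, padRows, keep and valid are made up for illustration, and the NaN-aware mean is written out explicitly so that nanmean is not needed either.

```matlab
% Toy data: 10 rows, 2 columns; blocks alternate between n = 2 and m = 3 rows.
data = [(1:10).', (11:20).'];
n = 2;  m = 3;

% Rows after which a NaN row goes (same index expression as in the answer).
insertAfter = n:(n+m):size(data,1);               % [2 7]

% Build the padded matrix by hand instead of calling insertrows.
nPad    = size(data,1) + numel(insertAfter);
padRows = insertAfter + (1:numel(insertAfter));   % NaN row positions: [3 9]
keep    = true(nPad,1);
keep(padRows) = false;
dataPad = nan(nPad, size(data,2));
dataPad(keep,:) = data;                           % original rows fill the rest

% One block per column, one slice per original column (m x nBlocks x 2).
blocks = reshape(dataPad, m, [], size(data,2));

% Mean that ignores NaN: sum the valid entries and divide by their count.
valid  = ~isnan(blocks);
blocks(~valid) = 0;
result = squeeze(sum(blocks,1) ./ sum(valid,1));  % nBlocks x 2
```

For this toy input the first column of result holds the means of rows 1-2, 3-5, 6-7 and 8-10, i.e. 1.5, 4, 6.5 and 9.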



Answer 2:

Here's one way to solve it using NaN padding, reshaping and concatenation -

%// Input
A = rand(13867,2);

%// Two stepsizes
m = 21;
n = 22;

%// Combined stepsize
N = m+n;

%// Pad with NaNs to simplify reshaping & finding averages with nanmean
Apad = cat(1,A,nan(N*ceil(numel(A)/(2*N)) - numel(A)/2,2));

%// Reshape into a 3D array with Combined stepsize number of rows
B = reshape(Apad,N,numel(Apad)/(2*N),[]);

%// Average the first m rows and the remaining n rows of each column, ignoring NaNs.
%// Interleave the two sets of averages and reshape into a rows x 2 sized array
C = reshape(cat(1,nanmean(B(1:m,:,:),1),nanmean(B(m+1:end,:,:),1)),[],2);

%// Drop the NaN entries left over from the padding to get the final output
out = reshape(C(~isnan(C)),[],2);
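
nanmean belongs to the Statistics Toolbox. In base MATLAB from R2015a on, mean accepts an 'omitnan' flag that should behave the same way here. A small sketch with a made-up 3D array (not data from this answer):

```matlab
% Made-up 3 x 2 x 2 array; the second column of each slice is mostly NaN.
B = cat(3, [1 4; 2 NaN; 3 NaN], [10 40; 20 NaN; 30 NaN]);

% Average along dimension 1, skipping NaN entries (1 x 2 x 2 result).
avg = mean(B, 1, 'omitnan');
```

Under that assumption, each nanmean(X,1) call in the code above could be swapped for mean(X,1,'omitnan').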

Verify output

First four rows -

>> out(1:4,:)
ans =
      0.55694      0.55289
      0.49942      0.53502
      0.57768      0.40828
       0.6347      0.45194

>> mean(A(1:21,:),1)
ans =
      0.55694      0.55289
>> mean(A(22:43,:),1)
ans =
      0.49942      0.53502
>> mean(A(44:64,:),1)
ans =
      0.57768      0.40828
>> mean(A(65:86,:),1)
ans =
       0.6347      0.45194

Last row -

>> out(end,:)
ans =
      0.44631      0.59432
>> mean(A(13847:13867,:),1)
ans =
      0.44631      0.59432
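
The spot checks above can be extended to every block with a plain loop. This is a slow but easy-to-read reference, not part of the original answer; lens, starts, stops and ref are illustrative names.

```matlab
A = rand(13867,2);
m = 21;  n = 22;

% Block lengths alternate m, n, m, n, ... until the rows are used up.
nRows  = size(A,1);
lens   = repmat([m n], 1, ceil(nRows/(m+n)));   % enough blocks to cover A
stops  = cumsum(lens);
lens   = lens(stops - lens < nRows);            % keep blocks that start inside A
stops  = min(cumsum(lens), nRows);              % clip the last block at the final row
starts = [1, stops(1:end-1)+1];

% Average each block directly, one row of ref per block.
ref = zeros(numel(starts), size(A,2));
for k = 1:numel(starts)
    ref(k,:) = mean(A(starts(k):stops(k),:), 1);
end
```

ref should have 645 rows, and when out is computed from the same A, max(abs(ref(:) - out(:))) should be at floating-point rounding level.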

Explanation with the help of a toy example

Sample used -

%// Input
A = rand(17,2)

%// Two stepsizes
m = 3;
n = 4;

1] Input :

A =
      0.64775      0.30635
      0.45092      0.50851
      0.54701      0.51077
      0.29632      0.81763
      0.74469      0.79483
      0.18896      0.64432
      0.68678      0.37861
      0.18351      0.81158
      0.36848      0.53283
      0.62562      0.35073
      0.78023        0.939
     0.081126      0.87594
      0.92939      0.55016
      0.77571      0.62248
      0.48679      0.58704
      0.43586      0.20774
      0.44678      0.30125

2] Combined step size :

N =
     7

3] Pad with NaN filled rows such that the number of rows is a multiple of N -

Apad =
      0.64775      0.30635
      0.45092      0.50851
      0.54701      0.51077
      0.29632      0.81763
      0.74469      0.79483
      0.18896      0.64432
      0.68678      0.37861
      0.18351      0.81158
      0.36848      0.53283
      0.62562      0.35073
      0.78023        0.939
     0.081126      0.87594
      0.92939      0.55016
      0.77571      0.62248
      0.48679      0.58704
      0.43586      0.20774
      0.44678      0.30125
          NaN          NaN
          NaN          NaN
          NaN          NaN
          NaN          NaN

4] This part might be a bit tricky. Each column of Apad is reshaped into a 2D array with N elements per column, because the intention is to get averages along each column after slicing it into two subgroups: the first three rows (m) and the remaining four rows (n). Since Apad has 2 columns, the result is a 3D array with two slices along the third dimension. The first slice is a reshaped version of the first column of Apad, i.e. Apad(:,1), and the second slice corresponds to the second column, Apad(:,2). Thus, the resultant 3D array would be -

B(:,:,1) =
      0.64775      0.18351      0.48679
      0.45092      0.36848      0.43586
      0.54701      0.62562      0.44678
      0.29632      0.78023          NaN
      0.74469     0.081126          NaN
      0.18896      0.92939          NaN
      0.68678      0.77571          NaN
B(:,:,2) =
      0.30635      0.81158      0.58704
      0.50851      0.53283      0.20774
      0.51077      0.35073      0.30125
      0.81763        0.939          NaN
      0.79483      0.87594          NaN
      0.64432      0.55016          NaN
      0.37861      0.62248          NaN

5] Find mean/average along each column with nanmean(..,1) ignoring the NaNs -

>> nanmean(B(1:m,:,:),1)
ans(:,:,1) =
      0.54856      0.39254      0.45648
ans(:,:,2) =
      0.44188      0.56504      0.36534
>> nanmean(B(m+1:end,:,:),1)
ans(:,:,1) =
      0.47919      0.64161          NaN
ans(:,:,2) =
      0.65885      0.74689          NaN

6] Concatenate and reshape those averages into a 2D array -

C =
      0.54856      0.44188
      0.47919      0.65885
      0.39254      0.56504
      0.64161      0.74689
      0.45648      0.36534
          NaN          NaN

7] Ignore NaN rows for final output -

out =
      0.54856      0.44188
      0.47919      0.65885
      0.39254      0.56504
      0.64161      0.74689
      0.45648      0.36534
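
An alternative that avoids padding altogether is to label every row with its block number and let accumarray do the grouping. This is a different technique from the one explained above, sketched here on the same toy input (values copied from step 1, so the results agree only to the printed precision); i0, g and out2 are illustrative names.

```matlab
% Toy input from step 1 (17 x 2), as printed.
A = [0.64775  0.30635
     0.45092  0.50851
     0.54701  0.51077
     0.29632  0.81763
     0.74469  0.79483
     0.18896  0.64432
     0.68678  0.37861
     0.18351  0.81158
     0.36848  0.53283
     0.62562  0.35073
     0.78023  0.939
     0.081126 0.87594
     0.92939  0.55016
     0.77571  0.62248
     0.48679  0.58704
     0.43586  0.20774
     0.44678  0.30125];
m = 3;  n = 4;  N = m + n;

% Block number for every row: each N-row pair contributes two blocks,
% and rows at position m or later inside a pair belong to the second one.
i0 = (0:size(A,1)-1).';                       % zero-based row index
g  = 2*floor(i0/N) + 1 + (mod(i0,N) >= m);    % 1,1,1,2,2,2,2,3,3,3,...

% Group-wise mean of each column.
out2 = [accumarray(g, A(:,1), [], @mean), accumarray(g, A(:,2), [], @mean)];
```

out2 should match the out shown in step 7 to the displayed digits.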