accumarray()'s val argument must be a vector. In my case I need the columns of a matrix to be summed (or averaged). Is there a function or a method to achieve this?
What I am doing now is summing each column separately in a for loop:
for iCol = 1:nCols
    means(:,iCol) = accumarray(labels', X(:,iCol));
end
One solution is to replicate the row indices in labels and add another column of column indices. Then you can reshape X into a column vector and apply accumarray once:
labels = [repmat(labels(:),nCols,1) ... % Replicate the row indices
kron(1:nCols,ones(1,numel(labels))).']; % Create column indices
totals = accumarray(labels,X(:)); % I used "totals" instead of "means"
How it works...
A = accumarray(subs,val) for a column vector subs and vector val works by adding the number in val(i) to the total in row subs(i) in the output column vector A. However, subs can contain more than just row indices. It can contain subscript indices for multiple dimensions to assign values to in the output. This feature is what allows you to handle an input val that is a matrix instead of a vector.
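As a small illustration of multi-column subs (the numbers below are made up for this sketch and are not from the question), each row of subs gives the [row, column] position that the matching element of val is added to:

```matlab
% Each row of subs is a [row, column] subscript into the output.
subs = [1 1; 2 1; 1 1; 2 2];
val  = [10; 20; 30; 40];

A = accumarray(subs, val);
% A(1,1) collects 10 + 30, A(2,1) gets 20, A(2,2) gets 40,
% and A(1,2) receives nothing, so it stays 0.
disp(A)
```

Positions that no subscript points to are filled with zeros by default.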
First, the input for val can be reshaped into a column vector using the colon operator X(:). Next, in order to keep track of which column of the output each value in X(:) should be placed in, we can modify the input subs to include an additional column index. To illustrate how this works, I'll use these sample inputs:
labels = [3; 1; 1];
X = [1 2 3; ...
4 5 6; ...
7 8 9];
nCols = 3
And here is what the variables in the above code end up looking like:
labels = 3  1     X(:) = 1     totals = 11 13 15
         1  1            4               0  0  0
         1  1            7               1  2  3
         3  2            2
         1  2            5
         1  2            8
         3  3            3
         1  3            6
         1  3            9
Notice, for example, that the values 1 4 7 that were originally in the first column of X will only be accumulated in the first column of the output, as denoted by the ones in the first three rows of the second column of labels. The resulting output should be the same as what you would have gotten by using the code in the question where you loop over each column to perform the accumulation.
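The equivalence with the loop can be checked directly on the sample inputs. The sketch below is only one way to do it; the names loopTotals, subs, counts and means are chosen here for illustration, and the averaging step at the end is an addition that assumes you want per-group means as mentioned in the question:

```matlab
labels = [3; 1; 1];
X = [1 2 3; 4 5 6; 7 8 9];
nCols = size(X, 2);

% Loop version from the question: one accumarray call per column.
loopTotals = zeros(max(labels), nCols);
for iCol = 1:nCols
    loopTotals(:, iCol) = accumarray(labels, X(:, iCol));
end

% Single-call version using [row, column] subscripts.
subs = [repmat(labels(:), nCols, 1), ...
        kron(1:nCols, ones(1, numel(labels))).'];
totals = accumarray(subs, X(:));

% Both approaches should agree exactly for these integer inputs.
assert(isequal(totals, loopTotals));

% Averages: divide each row of totals by the size of its group.
% bsxfun is used so the division also works without implicit expansion.
counts = accumarray(labels, 1);
means = bsxfun(@rdivide, totals, counts);
```

Note that label 2 has no members in this example, so its row of means is 0/0, which is NaN.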
Perhaps a more intuitive (and possibly more efficient) way, borrowed from MATLAB Answers (the original answer groups the rows of X, so I transposed the inputs to make labels group the columns instead):
[xx, yy] = ndgrid(labels,1:size(X, 1));
totals = accumarray([yy(:) xx(:) ], reshape(X.', 1, []));
Example:
X = [1 2 3 4; 5 6 7 8];
labels = [2; 1; 3; 1];
gives totals = [6 1 3; 14 5 7].
If you want to do this row-wise then there's no need to transpose, just:
[xx, yy] = ndgrid(labels,1:size(X, 2));
totals = accumarray([xx(:) yy(:)], X(:));
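For a concrete row-wise check, here is a small sketch with made-up inputs (this X is simply the transpose of the earlier example, so labels now has one entry per row of X):

```matlab
X = [1 2; 3 4; 5 6; 7 8];
labels = [2; 1; 3; 1];

% xx repeats each row's label across the columns; yy holds the column index.
[xx, yy] = ndgrid(labels, 1:size(X, 2));

% X(:) walks X column by column, the same order as xx(:) and yy(:).
totals = accumarray([xx(:) yy(:)], X(:));
disp(totals)
```

Rows 2 and 4 share label 1, so the first output row is [3+7, 4+8] = [10 12]; labels 2 and 3 each pick up a single row, giving [1 2] and [5 6].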