I need to find the maximum among values with same labels, in matlab, and I am trying to avoid using for loops.
Specifically, I have an array L
of labels and an array V
of values, same size. I need to produce an array S
which contains, for each value of L
, the maximum value of V
. An example will explain better:
L = [1,1,1,2,2,2,3,3,3,4,4,4,1,2,3,4]
V = [5,4,3,2,1,2,3,4,5,6,7,8,9,8,7,6]
Then, the values of the output array S will be:
s(1) = 9 (the values V(i) such that L(i) == 1 are: 5,4,3,9 -> max = 9)
s(2) = 8 (the values V(i) such that L(i) == 2 are: 2,1,2,8 -> max = 8)
s(3) = 7 (the values V(i) such that L(i) == 3 are: 3,4,5,7 -> max = 7)
s(4) = 8 (the values V(i) such that L(i) == 4 are: 6,7,8,6 -> max = 8)
this can be trivially implemented by traversing the arrays L
and V
with a for loop, but in Matlab for loops are slow, so I was looking for a faster solution. Any idea?
This is a standard job for accumarray
.
Three cases need to be considered, with increasing generality:
- Integer labels.
- Integer labels, specify fill value.
- Remove gaps; or non-integer labels. General case.
Integer labels
You can just use
S = accumarray(L(:), V(:), [], @max).';
In your example, this gives
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7];
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6];
>> S = accumarray(L(:), V(:), [], @max).'
S =
9 8 7 8
Integer labels, specify fill value
If there are gaps between integers in L
, the above will give a 0
result for the non-existing labels. If you want to change that fill value (for example to NaN
), use a fifth input argument in acccumarray
:
S = accumarray(L(:), V(:), [], @max, NaN).';
Example:
>> L = [1 1 1 2 2 2 3 3 3 4 4 4 1 2 3 7]; %// last element changed
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6]; %// same as in your example
>> S = accumarray(L(:), V(:), [], @max, NaN).'
S =
9 8 7 8 NaN NaN 6
Remove gaps; or non-integer labels. General case
When the gaps between integer labels are large, using a fill value may be inefficient. In that case you may want to get only the meaningful values in S
, without fill values, i.e.skip non-existing labels. Also, it may be the case that L
doesn't necessarily contain integers.
These two issues are solved by applying unique
to the labels before using accumarray
:
[~, ~, Li] = unique(L); %// transform L into consecutive integers
S = accumarray(Li(:), V(:), [], @max, NaN).';
Example:
>> L = [1.5 1.5 1.5 2 2 2 3 3 3 4 4 4 1 2 3 7.8]; %// note: non-integer values
>> V = [5 4 3 2 1 2 3 4 5 6 7 8 9 8 7 6 ]; %// same as in your example
>> [~, ~, Li] = unique(L); %// transform L into consecutive integers
>> S = accumarray(Li(:), V(:), [], @max, NaN).'
S =
9 5 8 7 8 6
helper=[L.', V.'];
helper=sortrows(helper,-2);
[~,idx,~]=unique(helper(:,1));
S=helper(idx,2);
What I do is: I join the two arrays as columns. Then I sort them regarding second column with biggest element first. Then I get the idx of the unique Values in L
before I return the corresponding Values from V
.
The solution from Luis Mendo is faster. But as far as I see his solution doesn't work if there is a zero,negative value or a noninteger inside L
:
Luis solution: Elapsed time is 0.722189 seconds.
My solution: Elapsed time is 2.575943 seconds.
I used:
L= ceil(rand(1,500)*10);
V= ceil(rand(1,500)*250);
and ran the code 10000 times.