So I am currently using 'accumarray' to find the averages of a range of numbers which correspond to matching IDs. Example input:
ID----Value
1 215
1 336
1 123
2 111
2 246
2 851
My current code finds the unweighted average of the above values, using the ID as the 'separator', so that I don't get one average over all of the values together, but rather a separate result for each set of values that share an ID.
Example output:
ID----Value
1 224.67
2 402.67
To achieve this I am using this code:
[ID, ~, Groups] = unique(StarData2(:,1),'stable');
app = accumarray(Groups, StarData2(:,2), [], @mean);
Here StarData2 is the input of the function. This has worked perfectly for my purposes until now. I now need to know whether accumarray can be made to give me a weighted average, either by assigning a weight to each point that goes into app (before the average is found) or by replacing @mean with a function that can achieve this. The new input will look like this:
ID----Value----Weight
1 215 12
1 336 17
1 123 11
2 111 6
2 246 20
2 851 18
The new code must compute sum(val(i)*weight(i))/sum(weight(i)) within each ID instead of just the standard mean. Thanks for any assistance.
You can use the row index as the "vals" (second input to accumarray) and define your own function that computes the weighted mean on each group of the data:
Weights = data(:,3); Vals = data(:,2); % pick your columns here
WeightedMeanFcn = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
wmeans = accumarray(Groups, 1:numel(Groups), [], WeightedMeanFcn)
Demonstration
Starting with data (the new input with your weights) and your unique command:
data = [1,215,12; 1,336,17; 1,123,11; 2,111,6; 2,246,20; 2,851,18];
[ID, ~, Groups] = unique(data(:,1),'stable');
The accumarray usage is as follows (redefine WeightedMeanFcn every time you change data!):
>> Weights = data(:,3); Vals = data(:,2); % pick your columns here
>> WeightedMeanFcn = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
>> app = accumarray(Groups, 1:numel(Groups), [], WeightedMeanFcn)
app =
241.1250
475.0909
Checking manually, with the first group:
ig = 1;
sum(data(Groups==ig,2).*data(Groups==ig,3))/sum(data(Groups==ig,3))
ans =
241.1250
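The reminder to redefine WeightedMeanFcn whenever data changes comes from how MATLAB anonymous functions work: they store the values of workspace variables at the moment the handle is created. The following is only a small sketch of that behavior, using the first group's values from the demonstration; the names f, before and after are made up for illustration.

```matlab
% Values and weights of the first group from the demonstration above
Vals = [215; 336; 123];
Weights = [12; 17; 11];

% The handle stores copies of Vals and Weights as they are right now
f = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
before = f(1:3);       % weighted mean using the original weights

% Changing the workspace variable does NOT change the stored copy
Weights = [1; 1; 1];
after = f(1:3);        % still uses the original weights

% Re-creating the handle picks up the new weights
f = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
refreshed = f(1:3);    % now the plain (equal-weight) mean
```

Here before and after are identical, and only refreshed reflects the new weights, which is why a stale handle can silently give results for old data.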
Instead of using accumarray, you can directly compute a weighted mean, or many other functions, quite easily:
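ID = data(:,1); Value = data(:,2); Weight = data(:,3); % columns of the input
uIDs = unique(ID, 'stable'); % works even if the IDs are not 1..nIDs
nIDs = numel(uIDs);
WeightedMean = zeros(nIDs, 1);
for ii = 1:nIDs
iID = (ID == uIDs(ii)); % logical mask selecting the rows of this ID
WeightedMean(ii) = (Value(iID)' * Weight(iID)) / sum(Weight(iID));
end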
Is there a specific reason you wish to do this through accumarray?
@Naveh - Generally, it is advised to avoid using loops in Matlab.
Specifically, if you have a large set of data with many groups - it can be very slow.
Using accumarray is the way to go, but defining a function of the indices, as suggested by @chappjc, is error-prone: the data (Vals and Weights) is not an input to WeightedMeanFcn, so it is captured by the anonymous function when the function is created. It must therefore be defined before defining WeightedMeanFcn, as @chappjc says in his comment.
A slight modification to overcome this problem is to use accumarray twice:
Weights = data(:,3); Vals = data(:,2); % pick your columns here
app = accumarray(Groups, Weights.*Vals, [], @mean)./accumarray(Groups, Weights, [], @mean);
Sometimes you may need to replace the [] argument by the size of the required output.
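As a rough sketch of the same two-call idea: since the group sizes cancel in the ratio, accumarray's default accumulation (a sum) gives the same result as @mean, and the output size can be passed explicitly in place of []. The variable names outSize, weightedSums and weightSums are made up here for illustration; the data is the example from the question.

```matlab
% Example data from the question: columns are ID, value, weight
data = [1,215,12; 1,336,17; 1,123,11; 2,111,6; 2,246,20; 2,851,18];
[ID, ~, Groups] = unique(data(:,1), 'stable');
Vals = data(:,2); Weights = data(:,3);

% Explicit output size: one row per unique ID
outSize = [numel(ID), 1];

% With no function handle, accumarray sums the values in each group
weightedSums = accumarray(Groups, Weights.*Vals, outSize);
weightSums   = accumarray(Groups, Weights, outSize);

% Per-group weighted mean
app = weightedSums ./ weightSums;
```

For this data app comes out as 241.1250 and 475.0909, matching the demonstration earlier in the thread. No anonymous function is involved, so nothing has to be redefined when data changes.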
What you are trying to compute is not a weighted mean, but rather a weighted histogram.
There is a mex implementation of weighted histogram that can be found here. Still, accumarray is the safe way to go.