MATLAB Accumarray weighted mean

2019-01-24 14:17发布

问题:

So I am currently using 'accumarray' to find the averages of a range of numbers wich correspond to matching ID's. Ex Input:

ID----Value
1     215
1     336
1     123
2     111
2     246
2     851

My current code finds the unweighted average of the above values, using the ID as the 'seperator' so that I don't get the average for all of the values together as one number, but rather seperate results for just values which have corresponding ID's. EX Output:

ID----Value
1     224.66
2     402.66

To achieve this I am using this code:

[ID, ~, Groups] = unique(StarData2(:,1),'stable');
app = accumarray(Groups, StarData2(:,2), [], @mean);

With StarData2 being the input of the function. This is working perfectly for my purposes until now, I need to know if accumarray can be made to give me a weighted average, such that each point in app (before the average is found) can be assigned a weight or that the @mean can be replaced with a function that can achieve this. The new input will look like this:

ID----Value----Weight
1     215     12
1     336     17
1     123     11
2     111     6
2     246     20
2     851     18

The new code must do the sum(val(i)*weight(i))/sum(weight) instead of just the standard mean. Thanks for any assistance.

回答1:

You can use the row index as the "vals" (second input to accumarray) and define your own function that does the weighted mean on group of the data:

Weights = data(:,3); Vals = data(:,2); % pick your columns here
WeightedMeanFcn = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
wmeans = accumarray(Groups, 1:numel(Groups), [], WeightedMeanFcn)

Demonstration

Starting with data (the new input with your weights) and your unique command:

data = [1,215,12; 1,336,17; 1,123,11; 2,111,6; 2,246,20; 2,851,18];
[ID, ~, Groups] = unique(data(:,1),'stable');

The accumarray usage is as follows (redefine WeightedMeanFcn every time you change data!):

>> Weights = data(:,3); Vals = data(:,2); % pick your columns here
>> WeightedMeanFcn = @(ii) sum(Vals(ii).*Weights(ii))/sum(Weights(ii));
>> app = accumarray(Groups, 1:numel(Groups), [], WeightedMeanFcn)
app =
  241.1250
  475.0909

Checking manually, with the first group:

ig = 1;
sum(data(Groups==ig,2).*data(Groups==ig,3))/sum(data(Groups==ig,3))
ans =
  241.1250


回答2:

Instead of using accumarray, you can directly compute a weighted mean, or many other functions, quite easily:

nIDs = length(unique(ID));
WeightedMean = zeros(nIDs, 1);

for ii = 1:nIDs
    iID = (ID == ii);
    WeightedMean(ii) = (Value(iID)' * Weight(iID)) / sum(Weight(iID));
end

Is there a specific reason you wish to do this through accumarray?



回答3:

@Naveh - Generally, it is advised to avoid using loops in Matlab. Specifically, if you have a large set of data with many groups - it can be very slow.

Using accumarrayis the way to go, but defining a function of the indices, as suggested by @chappjc, is error-prone, since in order to be be captured by the anonymous function, you must make sure that

data is not an input to WeightedMeanFcn. It must be defined before defining WeightedMeanFcn,

as @chappjc says in his comment.

A slight modification to overcome this problem is to use accumarray twice:

Weights = data(:,3); Vals = data(:,2); % pick your columns here    
app = accumarray(Groups, Weights.*vals, [], @mean)./accumarray(Groups, Weights, [], @mean);

Sometimes you may need to replace the [] argument by the size of the required output.



回答4:

What you are trying to compute is not a weighted mean, but rather a weighted histogram.
There is a mex implementation of weighted histogram that can be found here. Though, accumarray is the safe way to go about.