How to calculate weighted average for 0 values

2019-06-12 21:08发布

问题:

I am facing an issue in performing a weighted average logic in excel.
I am looking at 4 fields for different deliverables: Total, Complete, Pending and Weight.

The weighted average for a particular deliverable is calculated as : (Complete/Total) * Weight for example ((5/10) * 0.20) = 10%

For each of the deliverable, I have calculated the % and then added all of the % together.

Deliverable 1 - 10 = 10% + 20% + 5% + .... = 65%

My question is:
if for a particular deliverable, the available field is 'Not Applicable' or '0', then the weighted average for that deliverable will be 0%. There by, the total will be on a lower side. So, is there a better way to represent this? How can I solve this issue?

回答1:

Ok, so if I am understanding you correctly you are saying the following:

(complete/total) * weight = weighted_average

Where this is one of many weighted averages that will be summed to provide a total_deliverable_value.

You have a case where this algorithm outputs 0% for a weighted_average due to a data issue. This is falsely lowering the total_deliverable_value.

Options:

  1. add assumed values for missing data (clearly mark them as assumed rather than recorded fact)
  2. Have a standard weight for missing data. For example, all missing data gets 20%
  3. Go source the missing data
  4. Change how you are calculating the total_deliverable_value, i.e

instead of:

total_deliverable_value = sum(weighted_average::weighted_average)

use something like this:

total_deliverable_value = sum(weighted_average::weighted_average) / count_of_data_values

where count_of_data_values is the number of non-missing data values you had.

Updated based on comments

You have to change the formula for all calculations. So take a current 'perfect' scenario of 100% such as 5x weighted_averages where weighted_average = 0.2 (20%)

Your original calculation would give this:

a) 0.2+0.2+0.2+0.2+0.2 = 1 (100%) 
b) 0.2+NA+0.2+0.2+0.2 = 0.8 (80%)

This would indicate that a is not as good as b (lets not argue whether missing data means that is true), however you wish to show that actually they should be considered almost identical.

The new calculation would give:

a) 1/5 => 0.2 (20%) 
b) 0.8/4 => 0.2 (20%)

Therefore both are the same total_deliverable_value. If you want to scale these back up to 100%. Do this:

total_deliverable_value = (sum(weighted_average::weighted_average) / count_of_data_values) * count_of_expect_values

Now we are looking like this:

a) ((0.2+0.2+0.2+0.2+0.2)/5)*5 = 1 (100%) 
b) ((0.2+NA+0.2+0.2+0.2)/4)*5 = 1 (100%)

UPDATE: Handling inconsistent max weighted_averages

The above all works because a it is assumed that all the individual weighted_averages are of a consistent max value. Such as a max 0.2 for 5 values:

[0.2, 0.1, 0.18, 0.08, 0.05]  ==>  61%

Therefore we can afford to lose a few of the data points and still make a calculation that would enable the total_deliverable_value to be compared:

[0.2, 0.1, NA, NA, 0.05]   ==>   58%

However, if the max values are inconsistent, say, the 1st weight_average max was 0.5, we get a break in our result:

[0.4, 0.09, NA, NA, 0.15]   ==>  106%

Where 100% should be out Max total_deliverable_value. There is a simple solution to this. Use the weighted_average_max value in the formula. We are no longer missing 40% of our values, we are missing 20%:

[0.4 (max 0.5), 0.09 (max 0.1), NA (max 0.1), NA (max 0.1), 0.15 (max 0.2)] ==> 80% 

compared against something similar with all its data points:

[0.4, 0.09, 0.05, 0.03, 0.15] ==> 72%

This the new formula:

total_deliverable_value = (sum(weighted_average::weighted_average) / portion_of_max_values_existing) * total_of_max_values