Question:

I have the following players; each value is the percentage of right answers that player got in a given game.

$players = array
(
    'A' => array(0, 0, 0, 0),
    'B' => array(50, 50, 0, 0),
    'C' => array(50, 50, 50, 50),
    'D' => array(75, 90, 100, 25),
    'E' => array(50, 50, 50, 50),
    'F' => array(100, 100, 0, 0),
    'G' => array(100, 100, 100, 100),
);

I want to be able to pick the best players, but I also want to take into account how reliable a player is (less entropy = more reliable). So far I've come up with the following formula:

average - standard_deviation / 2

However, I'm not sure if this is an optimal formula and I would like to hear your thoughts on it. I've been thinking some more about this problem and I've come up with a slightly different formula. Here is the revised version:

average - standard_deviation / # of bets

This result would then be weighted for the next upcoming vote, so for instance a new bet from player C would only count as half a bet.

I can't go into specifics here, but this is a project related to the Wisdom of Crowds theory and the Delphi method. My goal is to predict the next results as well as possible by weighting past bets from several players.

I appreciate all input, thanks.

Answer 1:

First off, I would not use the standard deviation if your data arrays have only a few entries. Use a more robust statistical measure such as the Median Absolute Deviation (MAD). Likewise, you might want to test using the median instead of the average.

The reason is that if your "knowledge" of players' bets is limited to only a few samples, your data is going to be dominated by outliers, i.e. the player being lucky or unlucky. Statistical means may be entirely inappropriate under those circumstances, and you may want to use some form of heuristic approach.
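
A minimal PHP sketch of the median and MAD suggestion, run against the $players array from the question. The helper names median() and mad() are illustrative, and the divisor 2 is simply carried over from the question's first formula; neither is prescribed by this answer. Both helpers assume a non-empty array.

```php
<?php
// Median of a non-empty list (mean of the two middle values when the count is even).
function median(array $values): float
{
    sort($values); // sorts a local copy; the caller's array is untouched
    $n = count($values);
    $mid = intdiv($n, 2);
    return $n % 2 === 1
        ? (float) $values[$mid]
        : ($values[$mid - 1] + $values[$mid]) / 2;
}

// Median absolute deviation: the median of the distances from the median.
function mad(array $values): float
{
    $m = median($values);
    return median(array_map(fn($v) => abs($v - $m), $values));
}

$players = array(
    'A' => array(0, 0, 0, 0),
    'B' => array(50, 50, 0, 0),
    'C' => array(50, 50, 50, 50),
    'D' => array(75, 90, 100, 25),
    'E' => array(50, 50, 50, 50),
    'F' => array(100, 100, 0, 0),
    'G' => array(100, 100, 100, 100),
);

// Robust analogue of "average - standard_deviation / 2".
// The divisor 2 is an assumption carried over from the question.
$robust = array();
foreach ($players as $name => $scores) {
    $robust[$name] = median($scores) - mad($scores) / 2;
}

arsort($robust); // highest score first
print_r($robust);
```

With only four samples per player the MAD is still a coarse measure, so this ranking is a starting point for experiments rather than a definitive ordering.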

I also assume from your links that you do not in fact intend to pick the best player. Rather, given the players' next set of answers "A", you want to predict the correct set of answers "C" by weighting "A" according to each player's previous track record.

Of course, if there were a good solution to this problem, you could make a killing on the stock market ;-) (The fact that no one does should tell you something about whether such a solution exists.)

But getting back to ranking the players. Your main problem is that you (have to?) treat the percentage of right answers as uniformly distributed over 0-100%. If the test contains multiple questions, this is certainly not the case. I would look at what a completely random player "R" scores on the test and build a relative confidence number based on how much better or worse than "R" a given real player is.

Say, for each round of the game generate a million random players and look at the distribution of scores. Use the distribution as a weight for the players' real scores. Then combine the weighted scores using MAD and calculate the Median - MAD / some number, like you already suggested.
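
A rough PHP sketch of the first step, simulating random players and asking how many of them a real score beats. The game parameters are hypothetical, since the question does not say how many questions a game has or how many options each question offers. It uses 10,000 random players instead of a million only to keep the run short, and the function names are illustrative.

```php
<?php
// Hypothetical game parameters: none of these come from the question.
const QUESTIONS = 20;         // questions per game
const OPTIONS = 4;            // answer options per question, exactly one correct
const RANDOM_PLAYERS = 10000; // size of the simulated random population

// Score (percentage of right answers) of a player who guesses every question.
function randomPlayerScore(int $questions, int $options): float
{
    $right = 0;
    for ($q = 0; $q < $questions; $q++) {
        if (mt_rand(1, $options) === 1) { // option 1 stands for the correct answer
            $right++;
        }
    }
    return 100 * $right / $questions;
}

// Fraction (0..1) of the random players that a real score strictly beats.
function confidence(float $score, array $randomScores): float
{
    $beaten = 0;
    foreach ($randomScores as $r) {
        if ($score > $r) {
            $beaten++;
        }
    }
    return $beaten / count($randomScores);
}

mt_srand(42); // fixed seed so repeated runs give the same population
$randomScores = array();
for ($i = 0; $i < RANDOM_PLAYERS; $i++) {
    $randomScores[] = randomPlayerScore(QUESTIONS, OPTIONS);
}

foreach (array(0, 25, 50, 75, 100) as $score) {
    printf("score %3d%% beats %.3f of random players\n", $score, confidence($score, $randomScores));
}
```

The resulting fractions could then serve as the weights for each player's real scores before the median and MAD are taken. How exactly to combine them is left open here, as it is in the answer.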

Answer 2:

You can't get an optimal formula if you haven't quantified what "better" means. You need to figure out how you want to weigh consistency against average. For example, one option would be to estimate the score that the player will reach in a given percentage of games. This requires some kind of model of the probability distribution of the player's score. For instance, if we assume that a player's scores follow a normal distribution, then your first formula (average - standard_deviation / 2) gives the score the player will surpass about 70% of the time.
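
To check the "about 70%" figure: under a normal model, the chance of scoring above average - k * standard_deviation equals the standard normal CDF evaluated at k. PHP has no built-in error function, so this sketch assumes the Abramowitz and Stegun approximation 7.1.26; the function names are illustrative.

```php
<?php
// Error function via the Abramowitz-Stegun approximation 7.1.26
// (absolute error below roughly 1.5e-7).
function erfApprox(float $x): float
{
    $sign = $x < 0 ? -1 : 1;
    $x = abs($x);
    $t = 1 / (1 + 0.3275911 * $x);
    $poly = ((((1.061405429 * $t - 1.453152027) * $t + 1.421413741) * $t
        - 0.284496736) * $t + 0.254829592) * $t;
    return $sign * (1 - $poly * exp(-$x * $x));
}

// Standard normal cumulative distribution function.
function normalCdf(float $z): float
{
    return 0.5 * (1 + erfApprox($z / sqrt(2)));
}

// P(score > average - k * sd) for normally distributed scores.
function chanceToSurpass(float $k): float
{
    return normalCdf($k);
}

// First formula from the question: average - sd / 2, so k = 1/2.
printf("average - sd / 2: surpassed %.1f%% of the time\n", 100 * chanceToSurpass(0.5));

// Revised formula with four bets: average - sd / 4, so k = 1/4.
printf("average - sd / 4: surpassed %.1f%% of the time\n", 100 * chanceToSurpass(0.25));
```

This gives roughly 69.1% for the first formula and roughly 59.9% for the revised one with four bets. Dividing by a larger number moves the target score closer to the plain average.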

Answer 3:

Would a Bayesian probability formula fit the bill?

I think it would. Here is a link to another site that is a little less mathematical about it: http://www.experiment-resources.com/bayesian-probability.html

Essentially, you are predicting the probability that each player will score the highest in the next round. This is what Bayesian probabilities eat for breakfast.

Bayesian probabilities are already in use in video games (warning: .doc file) to determine stuff just like this.

Answer 4:

Hm. This would rate a (100,100,100,60) player worse than a (85,85,85,85) player. Why not also take the percentage of total points into account?

For example, multiply your current calculation by the player's fraction of the total possible points (a value from 0 to 1).
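
A small PHP sketch comparing the two players above under the question's first formula, before and after the scaling. It assumes the population standard deviation (dividing by n), because the question does not say which variant is meant, and the function names are illustrative.

```php
<?php
function mean(array $v): float
{
    return array_sum($v) / count($v);
}

// Population standard deviation (divides by n): an assumption,
// the question does not state which variant it uses.
function stdDev(array $v): float
{
    $m = mean($v);
    $squares = array_map(fn($x) => ($x - $m) ** 2, $v);
    return sqrt(array_sum($squares) / count($v));
}

// The question's first formula: average - standard_deviation / 2.
function baseScore(array $v): float
{
    return mean($v) - stdDev($v) / 2;
}

// Base score multiplied by the fraction (0..1) of the total possible points.
function scaledScore(array $v): float
{
    $fraction = array_sum($v) / (100 * count($v));
    return $fraction * baseScore($v);
}

$streaky = array(100, 100, 100, 60);
$steady  = array(85, 85, 85, 85);

printf("base:   streaky %.2f, steady %.2f\n", baseScore($streaky), baseScore($steady));
printf("scaled: streaky %.2f, steady %.2f\n", scaledScore($streaky), scaledScore($steady));
```

Under these assumptions the steady player wins on the base score (85 against about 81.3), while the scaled score puts the streaky player slightly ahead (about 73.2 against 72.25). With the sample standard deviation (dividing by n - 1) the scaled order flips back, so the result is sensitive to that choice.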

Answer 5:

Have you considered just using the median? It's considered a more robust statistic (less affected by outliers) than the mean. In your data, you get medians of: 0, 25, 50, 82.5, 50, 50, 100.

Does that seem to be what you intuitively want? I agree with others that there's no "right answer" here.

Answer 6:

I think you may be right that you want some sort of linear combination of the two factors, but I think we'd need to know more about what you're doing to know what the actual constants should be.

Answer 7:

Well, the "simple extension" is just the addition of a weight and a bounds:

average(player) - min(upper, weight * entropy(player))

However, given the current data-set, I might not be concerned with "right answer percentage" so much as looking at the score difference per game, if that is an option.
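
The weighted, bounded formula above could be sketched in PHP as follows. The population standard deviation stands in for entropy(player), and the values weight = 0.5 and upper = 20 are arbitrary illustrations; the answer does not specify any of these.

```php
<?php
// average(player) - min(upper, weight * entropy(player)),
// with the population standard deviation standing in for "entropy".
function boundedScore(array $scores, float $weight, float $upper): float
{
    $n = count($scores);
    $avg = array_sum($scores) / $n;

    $sumSquares = 0.0;
    foreach ($scores as $s) {
        $sumSquares += ($s - $avg) ** 2;
    }
    $spread = sqrt($sumSquares / $n);

    // The penalty grows with the spread but never exceeds $upper.
    return $avg - min($upper, $weight * $spread);
}

$players = array(
    'A' => array(0, 0, 0, 0),
    'B' => array(50, 50, 0, 0),
    'C' => array(50, 50, 50, 50),
    'D' => array(75, 90, 100, 25),
    'E' => array(50, 50, 50, 50),
    'F' => array(100, 100, 0, 0),
    'G' => array(100, 100, 100, 100),
);

// Hypothetical tuning values, chosen only for illustration.
$weight = 0.5;
$upper = 20.0;

foreach ($players as $name => $scores) {
    printf("%s: %.2f\n", $name, boundedScore($scores, $weight, $upper));
}
```

The bound only matters for very erratic players. Here it caps the penalty for F at 20 instead of 25, while every other player is below the cap.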

Answer 8:

Check out http://blog.stackoverflow.com/2009/10/alternate-sorting-orders/

The formula there is meant for sorting by votes, but if you treat a score like a vote (0 to whatever), you should be able to use it to work out which players score higher more consistently.