I am trying to craft a custom scorer function for cross-validating my (binary classification) model in scikit-learn (Python).
Some examples of my raw test data:
    Source  Feature1  Feature2  Feature3
    123     0.1       0.2       0.3
    123     0.4       0.5       0.6
    456     0.7       0.8       0.9
Any given fold might contain multiple test examples that come from the same source. For the set of examples sharing a source, I want my custom scorer to "decide" that the "winner" is the example for which the model produced the highest probability. In other words, there can be only one correct prediction per source, so if my model claims that more than one example was "correct" (label=1), I want only the example with the highest probability to be matched against the truth by my scorer.
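To illustrate with the sample rows above, here is a minimal sketch of that selection rule (the labels and probabilities below are made up):

    import pandas as pd

    # Made-up truth labels and predicted probabilities for the three sample rows.
    df = pd.DataFrame({
        "source": [123, 123, 456],
        "y_true": [0, 1, 1],
        "y_prob": [0.30, 0.75, 0.60],
    })

    # For each source, keep only the row the model is most confident about:
    # source 123 -> the row with y_prob 0.75, source 456 -> its only row.
    winners = df.loc[df.groupby("source")["y_prob"].idxmax()]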
My problem is that the scorer function requires the signature:

score_func(y_true, y_pred, **kwargs)

where y_true and y_pred contain only the probability/label.
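For context, this is the pattern I am using now. make_scorer is the real scikit-learn API (needs_proba=True in older versions, response_method="predict_proba" in 1.4+); the scorer body is just a placeholder:

    from sklearn.metrics import make_scorer, precision_score
    from sklearn.model_selection import cross_val_score

    def score_func(y_true, y_prob, **kwargs):
        # For a binary problem with needs_proba=True, y_prob is the
        # positive-class probability; no source information arrives here.
        y_pred = (y_prob >= 0.5).astype(int)
        return precision_score(y_true, y_pred)

    scorer = make_scorer(score_func, needs_proba=True)
    # scores = cross_val_score(model, X, y, scoring=scorer, cv=5)  # model, X, y are placeholders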
However, what I really need is:

score_func(y_true_with_source, y_pred_with_source, **kwargs)

so that I can group the y_pred_with_source examples by their source and match each group's winner against the corresponding y_true_with_source truth. Then I can carry on to calculate my precision, for example.
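Purely hypothetically (scikit-learn does not offer this signature, and the "source"/"label"/"proba" fields are names I am inventing), the scorer I have in mind would look something like:

    import pandas as pd
    from sklearn.metrics import precision_score

    def score_func_with_source(y_true_with_source, y_pred_with_source, **kwargs):
        # Hypothetical inputs: dict-like objects that carry the source id
        # alongside the true label / predicted probability.
        df = pd.DataFrame({
            "source": y_true_with_source["source"],
            "y_true": y_true_with_source["label"],
            "y_prob": y_pred_with_source["proba"],
        })
        # Per source, keep only the example with the highest predicted probability...
        winners = df.loc[df.groupby("source")["y_prob"].idxmax()]
        # ...and score only those "winning" predictions against the truth.
        y_pred = (winners["y_prob"] >= 0.5).astype(int)
        return precision_score(winners["y_true"], y_pred)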
Is there some way I can pass this extra information through to the scorer? Maybe via the examples' indices?