The output of my neural network is a table of predicted class probabilities for multi-label classification:
print(probabilities)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|--------------|--------------|-----|--------------|--------------|--------------|
| 0 | 2.442745e-05 | 5.952136e-06 | ... | 4.254002e-06 | 1.894523e-05 | 1.033957e-05 |
| 1 | 7.685694e-05 | 3.252202e-06 | ... | 3.617730e-06 | 1.613792e-05 | 7.356643e-06 |
| 2 | 2.296657e-06 | 4.859554e-06 | ... | 9.934525e-06 | 9.244772e-06 | 1.377618e-05 |
| 3 | 5.163169e-04 | 1.044035e-04 | ... | 1.435158e-04 | 2.807420e-04 | 2.346930e-04 |
| 4 | 2.484626e-06 | 2.074290e-06 | ... | 9.958628e-06 | 6.002510e-06 | 8.434519e-06 |
| 5 | 1.297477e-03 | 2.211737e-04 | ... | 1.881772e-04 | 3.171079e-04 | 3.228884e-04 |
I converted it to class labels using a threshold (0.2) to measure the accuracy of my prediction:
predictions = (probabilities > 0.2).astype(int)
print(predictions)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
I also have a test set:
print(Y_test)
| | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... | 0 | 0 | 0 |
| 1 | 0 | 0 | ... | 0 | 0 | 0 |
| 2 | 0 | 0 | ... | 0 | 0 | 0 |
| 3 | 0 | 0 | ... | 0 | 0 | 0 |
| 4 | 0 | 0 | ... | 0 | 0 | 0 |
| 5 | 0 | 0 | ... | 0 | 0 | 0 |
Question: how can I build an algorithm in Python that chooses the optimal threshold to maximize roc_auc_score(average='micro'), or another metric?
Maybe it is possible to write a custom function in Python that optimizes the threshold for a given accuracy metric.
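One caveat: roc_auc_score is computed from the raw probabilities, so its value does not change with the threshold; a thresholded metric such as micro-averaged F1 is a more natural target for this kind of search. Below is a minimal sketch of a brute-force threshold sweep using scikit-learn's f1_score. The helper name find_best_threshold and the candidate grid are illustrative choices, and it assumes probabilities and Y_test are the DataFrames shown above.

```python
import numpy as np
from sklearn.metrics import f1_score

def find_best_threshold(y_true, probs, thresholds=np.arange(0.05, 0.95, 0.01)):
    """Sweep candidate thresholds and return (best_threshold, best_score)
    for micro-averaged F1 on the given true/predicted label matrices."""
    best_threshold, best_score = 0.5, -1.0
    for t in thresholds:
        preds = (probs > t).astype(int)          # binarize at this threshold
        score = f1_score(y_true, preds, average='micro', zero_division=0)
        if score > best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

best_t, best_f1 = find_best_threshold(Y_test.values, probabilities.values)
print(f"best threshold: {best_t:.2f}, micro F1: {best_f1:.4f}")

# Re-binarize the predictions with the tuned threshold.
predictions = (probabilities > best_t).astype(int)
```

The same loop can be run column by column if you want a separate threshold per label instead of one global threshold, and f1_score can be swapped for any other metric that takes binarized predictions.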