How to Calculate F1 measure in multi-label classif

I am working on a sentence category detection problem, where each sentence can belong to multiple categories. For example:

"It has great sushi and even better service."
True Label:  [[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  1.]]
Pred Label:  [[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  1.]]
Correct Prediction!
Output:  ['FOOD#QUALITY' 'SERVICE#GENERAL']

I have implemented a classifier that can predict multiple categories. I have a total of 587 sentences, each of which can belong to multiple categories. I have calculated the accuracy scores in two ways:

Are all labels of an example predicted correctly or not?

code:

print("<------------ZERO one ERROR------------>")
print("Total Examples:", (truePred + falsePred), "True Pred:", truePred, "False Pred:", falsePred, "Accuracy:", truePred / (truePred + falsePred))

Output:

<------------ZERO one ERROR------------>
Total Examples: 587 True Pred: 353 False Pred: 234 Accuracy: 0.60136286201
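
The zero-one check above can be reproduced on label-indicator matrices. The following is a minimal sketch on made-up toy data (the arrays and the names `y_true`, `y_pred`, `true_pred`, `false_pred` are illustrative assumptions, not the original code). For this kind of input, scikit-learn's `accuracy_score` computes the same exact-match (subset) accuracy.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Toy label-indicator matrices (hypothetical data): 4 sentences, 3 categories.
y_true = np.array([[0, 1, 1],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0]])
y_pred = np.array([[0, 1, 1],   # exact match
                   [1, 0, 1],   # one extra label, so the row counts as wrong
                   [0, 1, 0],   # exact match
                   [1, 0, 0]])  # one missing label, so the row counts as wrong

# A row is correct only if every label in it matches.
exact_match = (y_true == y_pred).all(axis=1)
true_pred = int(exact_match.sum())
false_pred = len(y_true) - true_pred
subset_accuracy = true_pred / len(y_true)

# For label-indicator input, accuracy_score is this same subset accuracy.
sk_accuracy = accuracy_score(y_true, y_pred)

print("Total Examples:", len(y_true), "True Pred:", true_pred,
      "False Pred:", false_pred, "Accuracy:", subset_accuracy)
```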

How many labels are correctly predicted for all examples?

code:

print("\n<------------Correct and incorrect predictions------------>")
print("Total Labels:", len(total[0]), "Predicted Labels:", corrPred, "Accuracy:", corrPred / len(total[0]))

Output:

<------------Correct and incorrect predictions------------>
Total Labels: 743 Predicted Labels: 522 Accuracy: 0.702557200538
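
The counting code behind `corrPred` is not shown, so the following is only one plausible reading of a label-level score: count the true labels that were also predicted and divide by the number of true labels. The toy arrays and the names `corr_pred`, `total_labels`, and `label_accuracy` are assumptions made for illustration.

```python
import numpy as np

# Toy label-indicator matrices (hypothetical data): 4 sentences, 3 categories.
y_true = np.array([[0, 1, 1],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0]])
y_pred = np.array([[0, 1, 1],
                   [1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])

# Number of labels that are set in the ground truth.
total_labels = int(y_true.sum())

# Number of ground-truth labels that the classifier also predicted.
corr_pred = int(np.logical_and(y_true == 1, y_pred == 1).sum())

label_accuracy = corr_pred / total_labels
print("Total Labels:", total_labels, "Predicted Labels:", corr_pred,
      "Accuracy:", label_accuracy)
```

Under this reading the ratio ignores extra labels that were predicted but are not in the ground truth, which is one reason to look at precision as well.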

Problem: These are all accuracy scores, calculated by comparing the predicted labels with the ground truth labels. But I also want to calculate the F1 score (using micro averaging), precision, and recall. I have the ground truth labels and need to match my predictions against them, but I don't know how to tackle this kind of multi-label classification problem. Can I use scikit-learn or any other Python library?

Tags: python machine-learning scikit-learn units-of-measurement multilabel-classification

2 Answers

smile是对你的礼貌

Reply #2 · 2019-07-21 06:47

Have a look at the metrics already available in sklearn and make sure you understand them. Not all of them accept multilabel (label-indicator) input; for those that do not, you can write your own or map each distinct label vector to a single class id:

[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  1.] => 0
[ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.] => 1
...

You have to understand what this solution implies: if an example has 4 classes and you predict 3 of the 4 correctly, accuracy_score treats it the same as predicting 0 of the 4 correctly.

In both cases it counts as one error.

Here is an example:

>>> from sklearn.metrics import accuracy_score
>>> y_pred = [0, 2, 1, 3]
>>> y_true = [0, 1, 2, 3]
>>> accuracy_score(y_true, y_pred)
0.5
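
To make the consequence of this mapping concrete, here is a small sketch on made-up data. The helper `to_class_ids` is hypothetical (it is not part of scikit-learn); it assigns one integer id per distinct label vector. A prediction with 3 of 4 labels right and one with 0 of 4 right both end up as a wrong class id.

```python
from sklearn.metrics import accuracy_score


def to_class_ids(rows, mapping):
    """Map each label vector to an integer id, adding new ids as needed."""
    ids = []
    for row in rows:
        key = tuple(row)
        if key not in mapping:
            mapping[key] = len(mapping)
        ids.append(mapping[key])
    return ids


# Hypothetical data: both examples truly have all 4 classes.
y_true_vectors = [[1, 1, 1, 1], [1, 1, 1, 1]]
# First prediction gets 3 of 4 right, second gets 0 of 4 right.
y_pred_vectors = [[1, 1, 1, 0], [0, 0, 0, 0]]

mapping = {}  # shared so that equal vectors get equal ids
y_true = to_class_ids(y_true_vectors, mapping)
y_pred = to_class_ids(y_pred_vectors, mapping)

# Both predictions map to a different id than the truth,
# so both count as plain errors.
score = accuracy_score(y_true, y_pred)
print(y_true, y_pred, score)
```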


SAY GOODBYE

Reply #3 · 2019-07-21 07:03

I built a matrix of predicted labels, predictedlabel, and I already had the correct categories in y_test to compare my results against. So I tried the following code:

from sklearn.metrics import classification_report
from sklearn.metrics import f1_score
from sklearn.metrics import roc_auc_score

print("Classification report: \n", classification_report(y_test, predictedlabel))
print("F1 micro averaging:", f1_score(y_test, predictedlabel, average='micro'))
print("ROC: ", roc_auc_score(y_test, predictedlabel))

and I got the following results:

            precision    recall  f1-score   support

          0      0.74      0.93      0.82        57
          1      0.00      0.00      0.00         3
          2      0.57      0.38      0.46        21
          3      0.75      0.75      0.75        12
          4      0.44      0.68      0.54        22
          5      0.81      0.93      0.87       226
          6      0.57      0.54      0.55        48
          7      0.71      0.38      0.50        13
          8      0.70      0.72      0.71       142
          9      0.33      0.33      0.33        33
         10      0.42      0.52      0.47        21
         11      0.80      0.91      0.85       145

avg / total      0.71      0.78      0.74       743

F1 micro averaging: 0.746153846154
ROC:  0.77407943841

So this is how I am calculating my results.
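
For readers who want to see what micro averaging does, here is a minimal sketch on made-up toy data (the arrays below are illustrative assumptions, not the y_test and predictedlabel matrices from this answer). It pools true positives, false positives, and false negatives over all label slots, then checks the result against scikit-learn's `precision_score`, `recall_score`, and `f1_score` with `average="micro"`.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Toy label-indicator matrices (hypothetical data): 4 sentences, 3 categories.
y_true = np.array([[0, 1, 1],
                   [1, 0, 0],
                   [0, 1, 0],
                   [1, 1, 0]])
y_pred = np.array([[0, 1, 1],
                   [1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])

# Micro averaging pools the counts over every (example, label) slot.
tp = int(np.logical_and(y_true == 1, y_pred == 1).sum())
fp = int(np.logical_and(y_true == 0, y_pred == 1).sum())
fn = int(np.logical_and(y_true == 1, y_pred == 0).sum())

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# The same quantities from scikit-learn.
sk_precision = precision_score(y_true, y_pred, average="micro")
sk_recall = recall_score(y_true, y_pred, average="micro")
sk_f1 = f1_score(y_true, y_pred, average="micro")

print("precision:", precision, "recall:", recall, "F1:", f1)
```

Because the counts are pooled, frequent categories influence the micro-averaged score more than rare ones; `average="macro"` would instead average the per-category scores with equal weight.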
