Is F1 micro the same as Accuracy?

Posted 2020-03-12 02:42

Question:

I have tried many examples with F1 micro and Accuracy in scikit-learn and in all of them, I see that F1 micro is the same as Accuracy. Is this always true?

Script

from sklearn import svm
from sklearn import metrics
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import f1_score, accuracy_score

# prepare dataset
iris = load_iris()
X = iris.data[:, :2]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# svm classification
clf = svm.SVC(kernel='rbf', gamma=0.7, C = 1.0).fit(X_train, y_train)
y_predicted = clf.predict(X_test)

# performance
print("Classification report for %s" % clf)
print(metrics.classification_report(y_test, y_predicted))

print("F1 micro: %1.4f\n" % f1_score(y_test, y_predicted, average='micro'))
print("F1 macro: %1.4f\n" % f1_score(y_test, y_predicted, average='macro'))
print("F1 weighted: %1.4f\n" % f1_score(y_test, y_predicted, average='weighted'))
print("Accuracy: %1.4f" % (accuracy_score(y_test, y_predicted)))

Output

Classification report for SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape=None, degree=3, gamma=0.7, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
             precision    recall  f1-score   support

          0       1.00      0.90      0.95        10
          1       0.50      0.88      0.64         8
          2       0.86      0.50      0.63        12

avg / total       0.81      0.73      0.74        30

F1 micro: 0.7333

F1 macro: 0.7384

F1 weighted: 0.7381

Accuracy: 0.7333

F1 micro = Accuracy

Answer 1:

In classification tasks where every test case is guaranteed to be assigned to exactly one class, micro-averaged F1 is equivalent to accuracy. This does not hold in multi-label classification.
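
The reason can be checked by hand: with exactly one predicted class per sample, every wrong prediction counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled precision and recall both equal accuracy. Below is a minimal sketch in plain Python, not scikit-learn's implementation. The helper names `micro_f1` and `exact_match_accuracy` and the toy labels are made up for illustration, and "accuracy" in the multi-label case is assumed to mean exact-match (subset) accuracy.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1: pool TP, FP and FN over all labels, then compute F1."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    return 2 * tp / (2 * tp + fp + fn)

def exact_match_accuracy(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true label set."""
    return sum(t == p for t, p in zip(true_sets, pred_sets)) / len(true_sets)

# Single-label case (illustrative data): each sample has exactly one label,
# represented here as a one-element set.
single_true = [{y} for y in [0, 1, 2, 2, 1, 0, 2]]
single_pred = [{y} for y in [0, 2, 2, 1, 1, 0, 0]]
print(micro_f1(single_true, single_pred), exact_match_accuracy(single_true, single_pred))

# Multi-label case (illustrative data): samples may carry several labels,
# so partially correct predictions count for micro F1 but not for exact match.
multi_true = [{0, 1}, {2}, {0}]
multi_pred = [{0}, {2}, {0, 1}]
print(micro_f1(multi_true, multi_pred), exact_match_accuracy(multi_true, multi_pred))
```

In the single-label case both numbers come out to 4/7; in the multi-label case micro F1 is 0.75 while only one of the three samples matches exactly.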



Answer 2:

I had the same issue so I investigated and came up with this:

Thinking about the theory, accuracy and the F1 score cannot be identical on every dataset, at least not for the binary F1 score that f1_score returns by default (average='binary'). The reason is that the binary F1 score does not depend on the true negatives, while accuracy does.

Take a dataset where f1 = acc, add true negatives to it, and you get f1 != acc:

>>> from sklearn.metrics import accuracy_score as acc
>>> from sklearn.metrics import f1_score as f1
>>> y_pred = [0, 1, 1, 0, 1, 0]
>>> y_true = [0, 1, 1, 0, 0, 1]
>>> acc(y_true, y_pred)
0.6666666666666666
>>> f1(y_true,y_pred)
0.6666666666666666
>>> y_true = [0, 1, 1, 0, 1, 0, 0, 0, 0]
>>> y_pred = [0, 1, 1, 0, 0, 1, 0, 0, 0]
>>> acc(y_true, y_pred)
0.7777777777777778
>>> f1(y_true,y_pred)
0.6666666666666666
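
Note that the session above calls f1_score without average, so it reports the binary F1 of class 1 only. The following sketch, in plain Python and reusing the nine-sample data above, shows what changes under micro averaging: the true negatives of class 1 are the true positives of class 0, so once both classes are pooled they do enter the score. The helpers `per_class_counts` and `f1_from_counts` are hypothetical names for this illustration, not scikit-learn functions.

```python
y_true = [0, 1, 1, 0, 1, 0, 0, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0, 0]

def per_class_counts(y_true, y_pred, cls):
    """Return (TP, FP, FN) when `cls` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    return tp, fp, fn

def f1_from_counts(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)

counts = {cls: per_class_counts(y_true, y_pred, cls) for cls in (0, 1)}

# Binary F1: only class 1 is counted, so true negatives are ignored.
binary_f1 = f1_from_counts(*counts[1])

# Micro F1: sum TP, FP and FN over both classes before computing F1.
pooled = [sum(c) for c in zip(*counts.values())]
micro_f1 = f1_from_counts(*pooled)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(binary_f1, micro_f1, accuracy)
```

Here the binary F1 stays at 2/3, while the micro-averaged F1 and the accuracy are both 7/9, which is consistent with the question's observation and with Answer 1.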