When Keras 2.x removed certain metrics, the changelog said it did so because they were "batch-based" and therefore not always accurate. What does this mean? Do the corresponding metrics included in TensorFlow suffer from the same drawback, for example the precision and recall metrics?
Let's take precision for example. The stateless version which was removed was implemented like so:
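Roughly, the computation looked like the sketch below. It is rewritten with NumPy so it runs on its own; the original used Keras backend ops, and the name `batch_precision` and the epsilon constant are assumptions for illustration.

```python
import numpy as np

EPSILON = 1e-7  # assumed small constant to avoid division by zero


def batch_precision(y_true, y_pred):
    """Precision over exactly the arrays passed in, with no memory of earlier calls."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Round probabilities to hard 0/1 decisions, then count.
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    predicted_positives = np.sum(np.round(np.clip(y_pred, 0, 1)))
    return true_positives / (predicted_positives + EPSILON)
```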
Which is fine if `y_true` contains all of the labels in the dataset and `y_pred` has the model's predictions corresponding to all of those labels.

The issue is that people often divide their datasets into batches, for example evaluating on 10000 images by running 10 evaluations of 1000 images. This can be necessary to fit memory constraints. In this case you'd get 10 different precision scores with no exact way to combine them, because each score has already thrown away the counts it was computed from.
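To make the problem concrete, here is a small sketch with made-up labels and hard 0/1 predictions split into two batches. Averaging the per-batch scores is one obvious way to try to combine them, and it does not match the precision over the full dataset. The data and the helper name `precision` are hypothetical.

```python
import numpy as np


def precision(y_true, y_pred):
    """Precision for hard 0/1 predictions on the arrays given."""
    true_positives = np.sum((y_true == 1) & (y_pred == 1))
    predicted_positives = np.sum(y_pred == 1)
    return float(true_positives / predicted_positives)


# Hypothetical dataset of 8 examples, evaluated as two batches of 4.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 1, 1, 0, 0, 0])

per_batch = [
    precision(y_true[:4], y_pred[:4]),  # 3 of 4 predicted positives are correct
    precision(y_true[4:], y_pred[4:]),  # 0 of 1 predicted positives are correct
]
mean_of_batches = float(np.mean(per_batch))
full_dataset = precision(y_true, y_pred)  # 3 of 5 predicted positives are correct

print(per_batch, mean_of_batches, full_dataset)
```

The batch mean comes out at 0.375 while the true precision is 0.6: the second batch has a single predicted positive, yet it gets the same weight as the first.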
Stateful metrics solve this issue by keeping intermediate values in variables which last for the whole evaluation. So in the case of precision, a stateful metric might have persistent counters for `true_positives` and `predicted_positives`. TensorFlow metrics are stateful, e.g. `tf.metrics.precision`.
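A minimal sketch of the stateful idea in plain Python and NumPy. This is not TensorFlow's implementation; the class name `StreamingPrecision` and its `update`/`result` methods are hypothetical, chosen only to show counters that persist across batches.

```python
import numpy as np


class StreamingPrecision:
    """Hypothetical stateful metric: counts accumulate across update() calls."""

    def __init__(self):
        self.true_positives = 0
        self.predicted_positives = 0

    def update(self, y_true, y_pred):
        """Add one batch of hard 0/1 labels and predictions to the running counts."""
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        self.true_positives += int(np.sum((y_true == 1) & (y_pred == 1)))
        self.predicted_positives += int(np.sum(y_pred == 1))

    def result(self):
        """Precision over everything seen so far (0.0 if nothing was predicted positive)."""
        if self.predicted_positives == 0:
            return 0.0
        return self.true_positives / self.predicted_positives


metric = StreamingPrecision()
metric.update([1, 1, 1, 0], [1, 1, 1, 1])  # batch 1
metric.update([0, 0, 0, 0], [1, 0, 0, 0])  # batch 2
print(metric.result())
```

Because the counters are summed before the division, the result is the same as computing precision once over all examples, however the data is batched.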