I have a bunch of streaming metrics (`tf.metrics.accuracy` and custom streaming micro, macro and weighted F1-scores).
During training, I get the kind of plot below (never mind the overfitting).
This happens because, to compute the validation set's metrics, I call `tf.local_variables_initializer` to reset the streaming state, so that the reported values only cover the validation set (roughly as sketched below).
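A minimal sketch of what I do now, assuming simple integer placeholders in place of my real model outputs and metrics:

```python
import tensorflow as tf

labels = tf.placeholder(tf.int64, [None])
predictions = tf.placeholder(tf.int64, [None])

# One shared streaming metric used for both training and validation.
accuracy, accuracy_update = tf.metrics.accuracy(labels, predictions)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    # ... run training batches, each one feeding accuracy_update ...
    # Before each validation pass, reset *all* metric (local) variables,
    # which also wipes the accumulated training statistics:
    sess.run(tf.local_variables_initializer())
    # ... run validation batches through the same accuracy_update ...
```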
This implies 2 side effects:
- the spikes in the plot
- in between validations, the training metrics keep aggregating over the whole interval, even though validation only happens every 2 epochs
I could partially solve this by having different tensors hold each metric (train vs. val), but that would not solve the second issue.
I therefore have 2 questions:
- In your experience, is this a behavior you would expect to see? If not, is there a solution?
- Is there a way to have the metrics stream only over the last `n` batches?
This behaviour is expected if you reset the metrics in the middle of training. The training metrics don't aggregate the validation metrics if they are two different ops. I will give an example of how to keep those metrics separate and how to reset only one of them.
A toy example:
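A minimal sketch of one way to set this up, assuming integer label/prediction placeholders; the scope names `train_metrics`/`valid_metrics` are arbitrary, and the snippets below continue this same graph and session with purely illustrative feeds and outputs:

```python
import tensorflow as tf

labels = tf.placeholder(tf.int64, [None], name="labels")
predictions = tf.placeholder(tf.int64, [None], name="predictions")

# One streaming accuracy per phase. The metric's internal counters
# (total/count) are local variables created inside the enclosing
# variable scope, so each phase gets its own resettable state.
with tf.variable_scope("train_metrics"):
    train_acc, train_acc_op = tf.metrics.accuracy(labels, predictions)
with tf.variable_scope("valid_metrics"):
    valid_acc, valid_acc_op = tf.metrics.accuracy(labels, predictions)

# Collect each scope's metric variables and build per-scope reset ops.
train_vars = tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES, scope="train_metrics")
valid_vars = tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES, scope="valid_metrics")
reset_train_op = tf.variables_initializer(train_vars)
reset_valid_op = tf.variables_initializer(valid_vars)
```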
Training:
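```python
sess = tf.Session()
sess.run(tf.local_variables_initializer())  # the metric counters are local variables

# Read both streaming accuracies before any update op has run.
print(sess.run([train_acc, valid_acc]))
# [0.0, 0.0]
```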
The initial states are 0.0, as expected. Now calling the training metric's update op:
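```python
# Illustrative training batch: 2 of the 3 predictions are correct.
feed = {labels: [0, 1, 1], predictions: [0, 1, 0]}
sess.run(train_acc_op, feed_dict=feed)

print(sess.run([train_acc, valid_acc]))
# [0.6666667, 0.0]
```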
Only the training accuracy got updated, while the valid accuracy is still 0.0. Now calling the valid ops:
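```python
# Illustrative validation batch: 1 of the 2 predictions is correct.
feed = {labels: [0, 1], predictions: [1, 1]}
sess.run(valid_acc_op, feed_dict=feed)

print(sess.run([train_acc, valid_acc]))
# [0.6666667, 0.5]
```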
Here the valid accuracy got updated to a new value while the training accuracy remained unchanged. Let's reset only the validation metrics:
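```python
# Re-initialize only the variables living under the "valid_metrics" scope.
sess.run(reset_valid_op)

print(sess.run([train_acc, valid_acc]))
# [0.6666667, 0.0]
```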
The valid accuracy got reset to zero while the training accuracy remained unchanged.