When training in Caffe, there are Train and Test net outputs for each iteration. I know this is the loss. However, is this the average loss over my batch or the total loss? And is this the same for both Classification and Regression?
For example, if I were to have a batch of 100 training examples and my loss over that iteration is 100, does that mean that the average loss per example is 1?
Train loss is the loss averaged over the last training mini-batch. So if your mini-batch has 100 training examples and the loss reported for that iteration is 100, the average loss per example is 100, not 1.
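To make the arithmetic concrete, here is a minimal sketch in plain Python. It does not call Caffe; the per-example loss values and the function name `reported_train_loss` are made up for illustration. It only shows that a reported value of 100 corresponds to a summed loss of 10000 over a batch of 100 examples.

```python
# Hypothetical per-example losses for one mini-batch of 100 examples.
# These numbers are invented for illustration, not produced by Caffe.
per_example_losses = [50.0] * 50 + [150.0] * 50


def reported_train_loss(losses):
    """Return the batch-averaged loss: sum of per-example losses / batch size."""
    return sum(losses) / len(losses)


total_loss = sum(per_example_losses)                      # summed over the batch
average_loss = reported_train_loss(per_example_losses)    # what gets reported

print("batch size:  ", len(per_example_losses))
print("total loss:  ", total_loss)
print("average loss:", average_loss)
```

Under this reading, a reported loss of 100 already is the per-example average; dividing it by the batch size again would be a mistake.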
Test loss is also an averaged loss, but over all the test batches. You specify the test batch size and the number of test iterations (test_iter). Caffe takes test_iter such mini-batches, evaluates the loss on each of them, and reports the averaged value. If test_iter x batch_size == testset_size, the reported value is the average over the full test set.
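The test-side averaging can be sketched the same way. This is plain Python under stated assumptions, not Caffe code: the loss values, the helper names `batch_means` and `averaged_test_loss`, and the tiny test set are all hypothetical. It illustrates that averaging the per-batch averages matches the full-test-set average when test_iter x batch_size == testset_size.

```python
# Hypothetical per-example losses for a tiny test set (invented values).
test_losses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
batch_size = 2
test_iter = 3  # number of test mini-batches to evaluate


def batch_means(losses, batch_size, test_iter):
    """Average loss of each of the first `test_iter` mini-batches."""
    means = []
    for i in range(test_iter):
        batch = losses[i * batch_size:(i + 1) * batch_size]
        means.append(sum(batch) / len(batch))
    return means


def averaged_test_loss(losses, batch_size, test_iter):
    """Average of the per-batch averages, i.e. the reported test loss."""
    means = batch_means(losses, batch_size, test_iter)
    return sum(means) / len(means)


test_loss = averaged_test_loss(test_losses, batch_size, test_iter)
full_set_mean = sum(test_losses) / len(test_losses)
covers_full_set = test_iter * batch_size == len(test_losses)

print("reported test loss:", test_loss)
print("full test-set mean:", full_set_mean)
print("covers full set:   ", covers_full_set)
```

With a smaller test_iter, the same function averages over only part of the test set, so the reported value is an estimate from that subset rather than the full-set average.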