Testing a regression network in Caffe

I am trying to count objects in an image using AlexNet.

My images currently contain 1, 2, 3, or 4 objects each. For an initial check, I have 10 images per class. For example, the training set looks like this:

image  label
image1  1
image2  1
image3  1
...
image39 4
image40 4

I used the ImageNet create script to build an LMDB file for this dataset, and it converted my images successfully. As many have suggested, I turned AlexNet into a regression model that learns the number of objects in the image by replacing the Softmax layer with a EuclideanLoss layer. The rest of the network is unchanged.

However, when I run the model, I get only zeros as output during the testing phase (shown below). It does not seem to have learned anything, even though the training loss decreased continuously over the iterations.

I don't understand what mistake I have made. Can anybody explain why the predicted values are always 0? And how can I inspect the regressed values in the testing phase, so that I can check how many samples are correct and what the value is for each of my images?

The predicted and actual labels for the test dataset are:

I0928 17:52:45.585160 18302 solver.cpp:243] Iteration 1880, loss = 0.60498
I0928 17:52:45.585212 18302 solver.cpp:259]     Train net output #0: loss = 0.60498 (* 1 = 0.60498 loss)
I0928 17:52:45.585225 18302 solver.cpp:592] Iteration 1880, lr = 1e-06
I0928 17:52:48.397922 18302 solver.cpp:347] Iteration 1900, Testing net (#0)
I0928 17:52:48.499543 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 1
I0928 17:52:48.499641 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 2
I0928 17:52:48.499660 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 3
I0928 17:52:48.499681 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 4
...

Note: I also created HDF5 files in order to have floating-point labels, i.e. 1.0, 2.0, 3.0 and 4.0. However, when I changed the data layer to the HDF5 type, I could no longer crop the images for data augmentation or normalize them, as AlexNet does with the LMDB data layer. I used the script at "https://github.com/nikogamulin/caffe-utils/blob/master/hdf5/demo.m" to create the HDF5 data and followed its steps to use it in my model.

I have updated the last layers as follows:

layer {
  name: "fc8reg"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8reg"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8reg"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc8reg"
  bottom: "label"
  top: "loss"
}

Answer 1:

Setting aside whether your network diverged, the obvious mistake is using an Accuracy layer to test a regression network. The Accuracy layer is only meant for testing a classification network trained with a SoftmaxWithLoss layer.

In fact, for each input image, the Accuracy layer by default sorts its input array (here, bottom: "fc8reg") and takes the index of the maximum value as the predicted label.

Since num_output == 1 in the fc8reg layer, that array has a single element, so the Accuracy layer always predicts index 0 as the label, as you have seen.
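To make the argmax behaviour concrete, here is a minimal pure-Python sketch. The helper `predict_label` is hypothetical and only mimics a top-1 decision; it is not Caffe's implementation.

```python
def predict_label(scores):
    """Return the index of the largest score, as a top-1 classifier would."""
    return max(range(len(scores)), key=lambda i: scores[i])

# A 4-way classifier outputs four scores, so any index 0..3 can win.
print(predict_label([0.1, 0.2, 0.6, 0.1]))  # 2

# A regression head with num_output: 1 outputs a single value, so the
# only possible index is 0, no matter what that value is.
print(predict_label([3.7]))   # 0
print(predict_label([-1.2]))  # 0
```

The regressed count itself (3.7 or -1.2 in this made-up example) is never reported; only its position in the array is.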

Instead, you can use a EuclideanLoss layer to test your regression network. This similar problem may also give you some hints.

If you want to print the regressed values after training and compute the accuracy of the regression network, you can simply write a RegressionAccuracy layer like this.
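For reference, the value a EuclideanLoss layer reports is the sum of squared differences divided by 2N, where N is the batch size. A small sketch of that formula in plain Python follows; `euclidean_loss` is an illustrative name, and the all-zero predictions are made-up inputs.

```python
def euclidean_loss(predictions, labels):
    """Sum of squared differences divided by 2N, for scalar outputs."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    n = len(predictions)
    squared_error = sum((p - y) ** 2 for p, y in zip(predictions, labels))
    return squared_error / (2.0 * n)

# Hypothetical batch: a network that outputs 0 for labels 1, 2, 3, 4.
print(euclidean_loss([0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0]))  # 3.75
```

A test loss that falls toward 0 means the regressed values are approaching the labels, which is a more meaningful signal here than the Accuracy output.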
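One possible definition of accuracy for a counting regressor is to round each regressed value to the nearest integer and compare it with the label. The sketch below assumes that definition; the function name and sample values are hypothetical, and it shows only the arithmetic, not a Caffe layer.

```python
import math

def regression_accuracy(predictions, labels):
    """Fraction of samples whose rounded prediction equals the integer label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = 0
    for p, y in zip(predictions, labels):
        # floor(p + 0.5) rounds halves up; Python's round() rounds to even.
        rounded = int(math.floor(p + 0.5))
        print("Predicted_Value: %.2f (rounded %d) Actual Label: %d" % (p, rounded, y))
        if rounded == y:
            correct += 1
    return correct / float(len(labels))

# Hypothetical regressed values for four test images.
print(regression_accuracy([1.2, 2.6, 3.4, 3.6], [1, 2, 3, 4]))  # 0.75
```

A tolerance-based rule (for example, counting a prediction as correct when it is within 0.5 of the label) would be an equally reasonable choice.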

Or, if your target label only has 4 discrete values {1,2,3,4}, you can still train a classification network for your task.
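If you go the classification route, note that SoftmaxWithLoss expects integer class labels in the range 0..K-1, so counts 1..4 would need to be shifted to 0..3 and the last InnerProduct layer would need num_output: 4. A minimal sketch of the label shift, assuming listing lines of the form "name label" (the helper name is hypothetical):

```python
def to_zero_based(line, offset=1):
    """Rewrite 'name label' so that the label starts at 0 instead of `offset`."""
    name, label = line.rsplit(None, 1)
    return "%s %d" % (name, int(label) - offset)

listing = ["image1 1", "image2 1", "image39 4", "image40 4"]
for entry in listing:
    print(to_zero_based(entry))
# image1 0
# image2 0
# image39 3
# image40 3
```

At prediction time, adding the offset back to the predicted class index recovers the object count.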

Answer 2:

In my opinion, everything is correct, but your network is not converging to a useful solution, which is not a rare occurrence. Your network is actually converging to zero outputs! Maybe most of your samples have 0 as their label.

Also, don't forget to include the loss layer only during TRAIN; otherwise, it will learn on the test data as well.
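A quick way to test the "most labels are 0" hypothesis is to count the labels in the listing file that was fed to the conversion script. The sketch below assumes lines of the form "name label"; the sample lines are illustrative, not the asker's actual file.

```python
from collections import Counter

def label_histogram(listing_lines):
    """Count how often each label string appears in 'name label' lines."""
    labels = []
    for line in listing_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        labels.append(line.rsplit(None, 1)[1])
    return Counter(labels)

sample = ["image1 1", "image2 1", "image39 4", "image40 4"]
histogram = label_histogram(sample)
print(sorted(histogram.items()))  # [('1', 2), ('4', 2)]
print(histogram["0"])             # 0
```

If label "0" never appears, as in this sample, mislabeled data is unlikely to be the cause of the zero predictions.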