I am looking through the Caffe prototxt for deep residual networks and have noticed the appearance of a "Scale" layer.
layer {
  bottom: "res2b_branch2b"
  top: "res2b_branch2b"
  name: "scale2b_branch2b"
  type: "Scale"
  scale_param {
    bias_term: true
  }
}
However, this layer is not available in the Caffe layer catalogue. Can someone explain the functionality of this layer and the meaning of its parameters, or point to up-to-date documentation for Caffe?
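You can find detailed documentation on Caffe here.
Specifically, for the "Scale" layer, the documentation describes it as computing the elementwise product of two input blobs, with the second blob broadcast to match the shape of the first; the second input may be omitted, in which case it is learned as a parameter of the layer.
It seems that in your case (a single "bottom"), this layer learns a scale factor to multiply "res2b_branch2b". Moreover, scale_param { bias_term: true } means the layer learns not only a multiplicative scaling factor, but also a constant term. So, the forward pass computes:
res2b_branch2b <- \alpha * res2b_branch2b + \beta
During training the net tries to learn the values of \alpha and \beta.
There's also some documentation on it in the caffe.proto file; you can search for 'ScaleParameter'.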
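For orientation, here is a sketch of the layer from the question with the ScaleParameter fields written out explicitly. The field names and default values shown (axis and num_axes of 1, a constant scale filler of 1, a constant bias filler of 0) are assumptions based on what caffe.proto appears to define, so check them against your own copy of the file:

```prototxt
layer {
  bottom: "res2b_branch2b"
  top: "res2b_branch2b"
  name: "scale2b_branch2b"
  type: "Scale"
  scale_param {
    axis: 1          # first axis of the bottom blob to scale along (1 = channels)
    num_axes: 1      # number of axes covered, so one \alpha and one \beta per channel
    filler {         # initial value of the learned scale \alpha
      type: "constant"
      value: 1
    }
    bias_term: true  # also learn the additive term \beta
    bias_filler {    # initial value of \beta
      type: "constant"
      value: 0
    }
  }
}
```

With these values the layer should start out as the identity and only move away from it as \alpha and \beta are updated during training.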
Thanks a heap for your post :) Scale layer was exactly what I was looking for. In case anyone wants an example for a layer that scales by a scalar (0.5) and then "adds" -2 (and those values shouldn't change):
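A minimal sketch of such a layer, assuming hypothetical layer and blob names ("scaleAndAdd", "bot", "scaled"). Setting lr_mult: 0 keeps the solver from updating the corresponding parameter, and the fillers supply the fixed values:

```prototxt
layer {
  name: "scaleAndAdd"   # hypothetical layer name
  type: "Scale"
  bottom: "bot"         # hypothetical input blob
  top: "scaled"         # hypothetical output blob
  param {               # first param block: the scale (weights)
    lr_mult: 0          # no learning, so the scale stays at 0.5
    decay_mult: 0
  }
  param {               # second param block: the bias
    lr_mult: 0          # no learning, so the bias stays at -2
    decay_mult: 0
  }
  scale_param {
    filler {
      value: 0.5        # constant scale factor
    }
    bias_term: true
    bias_filler {
      value: -2         # constant additive term
    }
  }
}
```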
(The decay_mult's are probably unnecessary here, though I'm not sure. See comments.) Other than that: the first "param {" always(?) refers to the weights, the second to the bias (lr_mult is not ScaleLayer-specific).
All taken from caffe.proto. Note that I only tested the layer above with both filler values = 1.2.