How did they calculate the output volume for this

Posted 2019-05-22 17:49

In this tutorial, the output volumes are stated in output [25], and the receptive fields are specified in output [26].

Okay, the input volume [3, 227, 227] gets convolved with the region of size [3, 11, 11].

Using this formula (W−F+2P)/S+1, where:
W = the input volume size
F = the receptive field size
P = padding
S = stride

...which gives (227 - 11)/4 + 1 = 55, i.e. [55*55*96]. So far so good :)
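As a quick sanity check, the formula above can be sketched as a small Python helper. The function name `conv_output_size` is made up for illustration and is not part of Caffe; integer division is used on the assumption that the sizes divide evenly, as they do here.

```python
def conv_output_size(w: int, f: int, p: int = 0, s: int = 1) -> int:
    """Spatial output size for input size w, kernel size f, padding p, stride s.

    Implements (W - F + 2P) / S + 1 with integer division.
    """
    return (w - f + 2 * p) // s + 1


# conv1 from the question: 227x227 input, 11x11 kernel, no padding, stride 4
conv1 = conv_output_size(227, 11, p=0, s=4)
print(conv1)  # 55
```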

For 'pool1' they used F=3 and S=2, I think? The calculation checks out: (55-3)/2+1=27.

From this point I get a bit confused. The receptive field for the second convnet layer is [48, 5, 5], yet the output for 'conv2' is equal to [256, 27, 27]. What calculation happened here?

And then, the height and width of the output volumes of 'conv3' to 'conv4' are all the same [13, 13]? What's going on?

Thanks!

1 answer
别忘想泡老子
#2 · 2019-05-22 18:25

If you look closely at the parameters of conv2 layer you'll notice

   pad: 2

That is, the input blob is padded by 2 extra pixels all around, thus the formula now is

27 + 2 + 2 - ( 5 - 1 ) = 27

With a kernel size of 5 and a stride of 1, padding the input by 2 pixels on each side yields an output of the same size as the input.
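To see how this carries through the rest of the network, here is a rough sketch that traces the spatial size layer by layer. Only the conv1, pool1 and conv2 settings come from this thread; the pool2 and conv3 to conv5 values (3x3 kernels with pad 1 for the convolutions) are my assumption based on the usual CaffeNet definition, so check them against your own prototxt. The helper names are hypothetical.

```python
def conv_output_size(w, f, p=0, s=1):
    """(W - F + 2P) / S + 1; every division below happens to be exact."""
    return (w - f + 2 * p) // s + 1


# (name, kernel size, padding, stride)
LAYERS = [
    ("conv1", 11, 0, 4),  # from the question
    ("pool1", 3, 0, 2),   # from the question
    ("conv2", 5, 2, 1),   # pad: 2, from the answer
    ("pool2", 3, 0, 2),   # ASSUMED: same pooling as pool1
    ("conv3", 3, 1, 1),   # ASSUMED: 3x3 kernel, pad 1
    ("conv4", 3, 1, 1),   # ASSUMED
    ("conv5", 3, 1, 1),   # ASSUMED
]


def trace_sizes(input_size, layers):
    """Return {layer name: spatial output size} for a chain of layers."""
    sizes = {}
    size = input_size
    for name, f, p, s in layers:
        size = conv_output_size(size, f, p, s)
        sizes[name] = size
    return sizes


sizes = trace_sizes(227, LAYERS)
for name, size in sizes.items():
    print(f"{name}: {size} x {size}")
```

Under these assumptions a 3x3 kernel with pad 1 and stride 1 also preserves the size, which would explain why the later conv layers all stay at 13 x 13.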
