What is the purpose of the ROI layer in a Fast R-C

Posted 2020-05-18 05:25

This tutorial about object detection mentions Fast R-CNN, along with the ROI (region of interest) layer.

What is happening, mathematically, when region proposals get resized according to final convolution layer activation functions (in each cell)?

Tags: deep-learning computer-vision conv-neural-network object-detection

2 answers

甜甜的少女心

Reply #2 · 2020-05-18 05:39

Region-of-Interest (RoI) Pooling:

It is a type of pooling layer which performs max pooling on inputs (here, convnet feature maps) of non-uniform sizes and produces a small feature map of fixed size (say 7x7). The choice of this fixed size is a network hyper-parameter and is predefined.
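The pooling described above can be sketched in NumPy. This is only an illustrative sketch, not the reference implementation: the function name `roi_max_pool`, the inclusive `(x1, y1, x2, y2)` RoI convention in feature-map cells, and the floor/ceil bin boundaries are assumptions chosen for the example.

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(7, 7)):
    """Max-pool one RoI of a (C, H, W) feature map into a fixed (C, out_h, out_w) grid.

    roi is (x1, y1, x2, y2) in feature-map cells, inclusive (an assumed convention).
    """
    x1, y1, x2, y2 = roi
    region = feature_map[:, y1:y2 + 1, x1:x2 + 1]
    channels, h, w = region.shape
    out_h, out_w = output_size
    pooled = np.empty((channels, out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        # Bin rows: floor for the start, ceil for the end, so no bin is empty.
        r0 = (i * h) // out_h
        r1 = -(-((i + 1) * h) // out_h)
        for j in range(out_w):
            c0 = (j * w) // out_w
            c1 = -(-((j + 1) * w) // out_w)
            # Keep the largest activation of each channel inside this bin.
            pooled[:, i, j] = region[:, r0:r1, c0:c1].max(axis=(1, 2))
    return pooled

# Two RoIs of different sizes yield outputs of the same fixed shape.
features = np.random.rand(256, 40, 60).astype(np.float32)
print(roi_max_pool(features, (5, 3, 30, 20)).shape)
print(roi_max_pool(features, (10, 8, 17, 35)).shape)
```

Whatever the RoI's height and width, the output grid has the predefined size, which is what lets the later layers take a fixed-size input.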

The main purpose of this pooling is to speed up training and testing, and to allow the whole system to be trained end-to-end (jointly).

With this layer, the convolutional feature map is computed once per image and shared by all region proposals, instead of running the convnet on each proposal separately. This is why training and testing are faster than in the original R-CNN architecture, hence the name Fast R-CNN.

Simple example (from Region of interest pooling explained by deepsense.io):


淡お忘

Reply #3 · 2020-05-18 05:55

The ROI (region of interest) layer was introduced in Fast R-CNN and is a special case of the spatial pyramid pooling layer (one with a single pyramid level), which was introduced in "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". The main function of the ROI layer is to reshape inputs of arbitrary size into a fixed-length output, because fully connected layers require a fixed input size.

How the ROI layer works is shown below:

In this image, the input (a feature map of arbitrary size) is fed into this layer, which pools it over 3 different grids: 4x4 (blue), 2x2 (green), and 1x1 (gray). These produce fixed-size outputs of 16 x F, 4 x F, and 1 x F, respectively, where F is the number of filters. Those outputs are then concatenated into a single vector and fed to the fully connected layer.
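The three-level pooling just described can be sketched in NumPy as follows. This is a rough illustration under assumptions, not the paper's code: the function name `spatial_pyramid_pool`, the `(F, H, W)` layout, and the floor/ceil bin boundaries are chosen for the example.

```python
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(4, 2, 1)):
    """Max-pool an (F, H, W) feature map over n x n grids and concatenate the results."""
    num_filters, h, w = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor for the start, ceil for the end.
            r0, r1 = (i * h) // n, -(-((i + 1) * h) // n)
            for j in range(n):
                c0, c1 = (j * w) // n, -(-((j + 1) * w) // n)
                # One value per filter for this bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    # Length is (16 + 4 + 1) * F for the default levels.
    return np.concatenate(pooled)

# Feature maps of different spatial sizes give vectors of the same length.
print(spatial_pyramid_pool(np.random.rand(256, 13, 9)).shape)
print(spatial_pyramid_pool(np.random.rand(256, 6, 20)).shape)
```

With `levels=(7,)` the same function reduces to a single fixed grid, which is the sense in which RoI pooling is a special case of this layer.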
