I have a fully convolutional neural network, U-Net; the paper describing it is linked below.
https://arxiv.org/pdf/1505.04597.pdf
I want to use it to do pixelwise classification of images. My training images come in two sizes: 512x512 and 768x768. In the initial step I apply reflection padding of size (256,256,256,256) to the former and (384,384,384,384) to the latter, and I keep padding before the successive convolutions so that the output has the same size as the input.
But since my padding is dependent on the input image's size, I can't build a generalised model (I am using Torch).
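To make the problem concrete, here is a rough sketch of the 512x512 case (nn.SpatialReflectionPadding and nn.SpatialConvolution are from the Torch nn package; the channel/filter counts are just placeholders):

require 'nn'

-- Rough sketch of the 512x512 case: the initial reflection pad is tied
-- to the input size, so the 768x768 model would need (384,384,384,384)
-- here instead. Channel/filter counts are placeholders.
local net = nn.Sequential()
net:add(nn.SpatialReflectionPadding(256, 256, 256, 256))
net:add(nn.SpatialConvolution(3, 64, 3, 3))
-- ... further reflection-pad + convolution blocks as in U-Net ...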
How is the padding done in such cases?
I am new to deep learning, so any help would be great. Thanks.
Your model will only accept images matching the size of its first layer, so you have to pre-process all of them before forwarding them to the network. To do so, you can use:
image.scale(img, width, height, 'bilinear')
img is the image to scale, width and height are the size of the first layer of your model (if I'm not mistaken it is 572x572), and 'bilinear' is the interpolation algorithm used to scale the image.
Keep in mind that it might also be necessary to subtract the image mean or to convert the image to BGR channel order (depending on how the model was trained).
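For example, a small pre-processing helper along these lines could do the scaling, mean subtraction, and channel swap before each forward pass (the 572x572 size, the mean values, and the helper name are only placeholders for whatever your model was trained with):

require 'image'

-- Hypothetical pre-processing helper: scale to the network's input size,
-- optionally subtract a per-channel mean and swap RGB -> BGR.
-- The size and mean values used below are placeholders.
local function preprocess(img, width, height, mean, toBGR)
  local out = image.scale(img, width, height, 'bilinear')
  if mean then
    for c = 1, out:size(1) do
      out[c]:add(-mean[c])          -- per-channel mean subtraction
    end
  end
  if toBGR then
    out = out:index(1, torch.LongTensor{3, 2, 1})  -- reorder channels
  end
  return out
end

local input = preprocess(img, 572, 572, {0.5, 0.5, 0.5}, false)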
The first thing to do is to process all of your images so that they are the same size, since the conv layer input requires all images to have the specified dimensions.
Caffe lets you reshape within the prototxt file; in Torch, I think there's a comparable command you can drop at the front of createModel, but I don't recall its name. If not, then you'll need to do it outside the model flow, as sketched below.
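If it has to happen outside the model flow, a minimal sketch could look like this (image.load and image.scale are from the Torch image package; model, imagePaths, and the 572x572 target size are placeholders):

require 'nn'
require 'image'

-- Minimal sketch: resize each image before forwarding it, so the model
-- itself never sees a size it was not built for. 572x572 stands in for
-- whatever the first layer actually expects.
for i = 1, #imagePaths do
  local img = image.load(imagePaths[i], 3, 'float')
  local input = image.scale(img, 572, 572, 'bilinear')
  local output = model:forward(input:view(1, 3, 572, 572))
end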