I am trying to implement a deconvolution layer for a convolutional network. What I mean by deconvolution is this: suppose I have a 3x227x227 input image to a layer with filters of size 3x11x11 and stride 4, so the resulting feature map has size 55x55. What I want is the reverse operation, where I project the 55x55 feature map back to a 3x227x227 image. Basically, each value in the 55x55 feature map is weighted by the 3x11x11 filter and projected into image space, and overlapping regions due to the stride are averaged.

I tried to implement this in numpy without any success. I found a solution with brute-force nested for loops, but it is damn slow. How can I implement it efficiently in numpy? Any help is welcome.
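For reference, a minimal sketch of the brute-force version described above (the function name, signature, and defaults here are illustrative, not from any library): each feature-map value is multiplied by the filter, added into the output at its strided location, and overlaps are averaged with a hit counter.

```python
import numpy as np

def deconv_naive(fmap, filt, stride=4, out_shape=(3, 227, 227)):
    """Project a (H, W) feature map back to image space with a
    (C, k, k) filter; overlapping contributions are averaged."""
    C, k, _ = filt.shape
    out = np.zeros(out_shape)
    count = np.zeros(out_shape)  # how many patches touch each pixel
    H, W = fmap.shape
    for i in range(H):
        for j in range(W):
            r, c = i * stride, j * stride
            # weight the filter by this feature-map value and paste it in
            out[:, r:r + k, c:c + k] += fmap[i, j] * filt
            count[:, r:r + k, c:c + k] += 1
    return out / np.maximum(count, 1)  # average the overlapping regions
```

With the sizes from the question (55x55 map, 3x11x11 filter, stride 4) this reproduces a 3x227x227 image, since 54*4 + 11 = 227, but the two Python loops are exactly what makes it slow.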
As discussed in this question, a deconvolution is just a convolutional layer, but with a particular choice of padding, stride and filter size.
For example, if your current image size is 55x55, you can apply a convolution with `padding=20`, `stride=1` and `filter=[21x21]` to obtain a 75x75 image, then 95x95, and so on. (I'm not saying this choice of numbers gives the desired quality of the output image, just the size. Actually, I think downsampling from 227x227 to 55x55 and then upsampling back to 227x227 is too aggressive, but you are free to try any architecture.)

Here's the implementation of a forward pass for any stride and padding. It does an im2col transformation using `stride_tricks` from numpy. It's not as optimized as modern GPU implementations, but definitely faster than 4 inner loops: