I found that to generate (X - x + 1, Y - y + 1) patches of size (x, y) from an (X, Y) image with stride 1, we have to give the strides parameter as img.strides * 2, or equivalently img.strides + img.strides. I don't know how this value is derived from the stride used in conv2d.
But what should I do to get ((X-x)/stride)+1, ((Y-y)/stride)+1 patches of the same size from the same sized image with stride stride?
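To make the img.strides * 2 observation concrete: img.strides is a plain tuple of byte offsets, so multiplying it by 2 is tuple repetition, which is why it equals img.strides + img.strides. A minimal sketch for a single 2D image follows; the 4x4 float64 array is only an illustrative assumption, not data from the question.

```python
import numpy as np

# Illustrative 4x4 image (assumed for this sketch)
img = np.arange(16, dtype=np.float64).reshape(4, 4)
x, y = 2, 2  # patch size

# Strides are byte offsets per axis: one row = 4 * 8 bytes, one column = 8 bytes
print(img.strides)      # (32, 8)
print(img.strides * 2)  # (32, 8, 32, 8), tuple repetition

X, Y = img.shape
patches = np.lib.stride_tricks.as_strided(
    img,
    shape=(X - x + 1, Y - y + 1, x, y),
    strides=img.strides * 2,  # first pair moves between patches, second pair moves inside a patch
)
print(patches.shape)  # (3, 3, 2, 2)
print(patches[1, 2])  # same values as img[1:3, 2:4]
```

The first two entries of the strides tuple say how far to jump to reach the next patch, and the last two say how far to jump to reach the next pixel inside a patch. With stride 1 both jumps are one pixel, so the two halves are identical.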
From this SO answer, slightly modified so that the number of images and the channels are placed in front:
import numpy as np

def patchify(img, patch_shape):
    a, b, X, Y = img.shape  # a images and b channels
    x, y = patch_shape
    shape = (a, b, X - x + 1, Y - y + 1, x, y)
    a_str, b_str, X_str, Y_str = img.strides
    strides = (a_str, b_str, X_str, Y_str, X_str, Y_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
I can see that it creates a sliding window of size (x, y) with stride 1 (move 1 pixel to the right and 1 pixel down). I have trouble relating the strides parameter that as_strided uses to the stride we usually use for conv2d.
How do I add a parameter to the above function so that it computes the strides parameter for as_strided?
def patchify(img, patch_shape, stride):  # stride = step size in conv2d, e.g. 1, 2, 3, ...
    a, b, X, Y = img.shape  # a images and b channels
    x, y = patch_shape
    # integer division, because as_strided needs an integer shape
    shape = (a, b, ((X - x) // stride) + 1, ((Y - y) // stride) + 1, x, y)
    strides = ???  # strides for as_strided
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
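One possible way to fill in the placeholder, offered as a sketch rather than a definitive answer: the byte strides that move from one patch to the next are scaled by the step size, while the strides that move inside a patch stay at one pixel. The name patchify_strided, the option to pass a (vertical, horizontal) pair, and the read-only flag are assumptions added for illustration.

```python
import numpy as np

def patchify_strided(img, patch_shape, stride=1):
    """Hypothetical variant of patchify with a conv2d-style step size."""
    a, b, X, Y = img.shape  # a images and b channels
    x, y = patch_shape
    # Accept a single int or a (vertical, horizontal) pair
    sx, sy = (stride, stride) if np.isscalar(stride) else stride
    shape = (a, b, (X - x) // sx + 1, (Y - y) // sy + 1, x, y)
    a_str, b_str, X_str, Y_str = img.strides
    # Moving to the next patch skips sx rows / sy columns, so those byte
    # strides are scaled; inside a patch we still move one pixel at a time.
    strides = (a_str, b_str, sx * X_str, sy * Y_str, X_str, Y_str)
    # writeable=False guards against accidental writes through overlapping views
    return np.lib.stride_tricks.as_strided(
        img, shape=shape, strides=strides, writeable=False
    )

img = np.arange(2 * 3 * 5 * 5, dtype=np.float64).reshape(2, 3, 5, 5)
patches = patchify_strided(img, (2, 2), stride=(2, 1))
print(patches.shape)  # (2, 3, 2, 4, 2, 2)
```

With stride=1 this reduces to the original strides tuple, so the stride-1 function is the special case sx = sy = 1.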
img is 4D with shape (a, b, X, Y): a = number of images, b = number of channels, (X, Y) = width and height.
Note: By stride in conv2d I mean the step size. Unfortunately this is also called stride.
Note 2: Since the step size will usually be the same on both axes, the code above takes only one parameter and uses it for both dimensions.
Playground: What goes in for strides here. I have it running for stepsize=1 here. I noticed that it might not work from the link, but it works when pasted into a new playground.
This should give a clear idea of what I need:
[[ 0.5488135 0.71518937 0.60276338 0.54488318]
[ 0.4236548 0.64589411 0.43758721 0.891773 ]
[ 0.96366276 0.38344152 0.79172504 0.52889492]
[ 0.56804456 0.92559664 0.07103606 0.0871293 ]]
# patch_size = 2x2
# stride = 1,1
[[[[ 0.5488135 0.71518937]
[ 0.4236548 0.64589411]]
[[ 0.71518937 0.60276338]
[ 0.64589411 0.43758721]]
[[ 0.60276338 0.54488318]
[ 0.43758721 0.891773 ]]]
[[[ 0.4236548 0.64589411]
[ 0.96366276 0.38344152]]
[[ 0.64589411 0.43758721]
[ 0.38344152 0.79172504]]
[[ 0.43758721 0.891773 ]
[ 0.79172504 0.52889492]]]
[[[ 0.96366276 0.38344152]
[ 0.56804456 0.92559664]]
[[ 0.38344152 0.79172504]
[ 0.92559664 0.07103606]]
[[ 0.79172504 0.52889492]
[ 0.07103606 0.0871293 ]]]]
# stride = 2,2
[[[[[[ 0.5488135 0.71518937]
[ 0.4236548 0.64589411]]
[[ 0.60276338 0.54488318]
[ 0.43758721 0.891773 ]]]
[[[ 0.96366276 0.38344152]
[ 0.56804456 0.92559664]]
[[ 0.79172504 0.52889492]
[ 0.07103606 0.0871293 ]]]]]]
# stride = 2,1
[[[[ 0.5488135 0.71518937]
[ 0.4236548 0.64589411]]
[[ 0.71518937 0.60276338]
[ 0.64589411 0.43758721]]
[[ 0.60276338 0.54488318]
[ 0.43758721 0.891773 ]]]
[[[ 0.96366276 0.38344152]
[ 0.56804456 0.92559664]]
[[ 0.38344152 0.79172504]
[ 0.92559664 0.07103606]]
[[ 0.79172504 0.52889492]
[ 0.07103606 0.0871293 ]]]]
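The expected outputs above can be cross-checked without hand-written strides. The sketch below assumes NumPy 1.20 or newer, where numpy.lib.stride_tricks.sliding_window_view is available: it builds all stride-1 patches and then keeps every n-th one by slicing. The input values are copied from the 4x4 array shown above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

img = np.array([
    [0.5488135, 0.71518937, 0.60276338, 0.54488318],
    [0.4236548, 0.64589411, 0.43758721, 0.891773],
    [0.96366276, 0.38344152, 0.79172504, 0.52889492],
    [0.56804456, 0.92559664, 0.07103606, 0.0871293],
])

windows = sliding_window_view(img, (2, 2))  # all stride-1 patches
print(windows.shape)            # (3, 3, 2, 2)  -> stride = 1,1
print(windows[::2, ::2].shape)  # (2, 2, 2, 2)  -> stride = 2,2
print(windows[::2, ::1].shape)  # (2, 3, 2, 2)  -> stride = 2,1
print(windows[::2, ::1][1, 0])  # rows 2-3, columns 0-1 of img
```

Slicing the window axes with a step is equivalent to multiplying the corresponding byte strides by that step, which is the relationship between the conv2d step size and the as_strided strides parameter.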