Resize 3D data in TensorFlow like tf.image.resize_images

Published 2020-06-04 07:26

Question:

I need to resize some 3D data, as the tf.image.resize_images method does for 2D data.

I was thinking I could run tf.image.resize_images on it in a loop, swapping axes as I go, but I thought there must be an easier way. Simple nearest neighbour should be fine.

Any ideas? It's not ideal, but I could settle for the case where the data is just 0 or 1 and use something like:

tf.where(boolMap, tf.fill(data_im*2, 0), tf.fill(data_im*2, 1))

But I'm not sure how to get boolMap. Would using tf.while_loop to go over all the values dramatically decrease performance? I feel like it would on the GPU unless they have some kind of automatic loop parallelisation.

The data is a tensor of size [batch_size, width, height, depth, 1]

Thanks in advance.

N.B. The output dimensions should be:

[batch_size, width*scale, height*scale, depth*scale, 1]

I have come up with this:

def resize3D(self, input_layer, width_factor, height_factor, depth_factor):
    shape = input_layer.shape
    print(shape)
    # Resize width and height, folding depth into the channel dimension.
    rsz1 = tf.image.resize_images(
        tf.reshape(input_layer, [shape[0], shape[1], shape[2], shape[3] * shape[4]]),
        [shape[1] * width_factor, shape[2] * height_factor])
    # Transpose depth into image position, then resize depth and height.
    rsz2 = tf.image.resize_images(
        tf.reshape(
            tf.transpose(
                tf.reshape(rsz1, [shape[0], shape[1] * width_factor,
                                  shape[2] * height_factor, shape[3], shape[4]]),
                [0, 3, 2, 1, 4]),
            [shape[0], shape[3], shape[2] * height_factor,
             shape[1] * width_factor * shape[4]]),
        [shape[3] * depth_factor, shape[2] * height_factor])

    # Swap depth and width back into their original positions.
    return tf.transpose(
        tf.reshape(rsz2, [shape[0], shape[3] * depth_factor,
                          shape[2] * height_factor, shape[1] * width_factor,
                          shape[4]]),
        [0, 3, 2, 1, 4])

(Before and after example images were attached here.)
I believe nearest neighbour shouldn't have the stair-casing effect (I intentionally removed the colour).

Har's answer works correctly; however, I would like to know what's wrong with mine, if anyone can crack it.

Answer 1:

My approach to this would be to resize the image along two axes at a time. In the code pasted below, I resample along depth and then along width:

def resize_by_axis(image, dim_1, dim_2, ax, is_grayscale):
    resized_list = []

    if is_grayscale:
        # Give each 2-D slice a channel dimension so resize_images accepts it.
        unstack_img_depth_list = [tf.expand_dims(x, 2) for x in tf.unstack(image, axis=ax)]
        for i in unstack_img_depth_list:
            # method=0 is bilinear interpolation
            resized_list.append(tf.image.resize_images(i, [dim_1, dim_2], method=0))
        stack_img = tf.squeeze(tf.stack(resized_list, axis=ax))
        print(stack_img.get_shape())

    else:
        unstack_img_depth_list = tf.unstack(image, axis=ax)
        for i in unstack_img_depth_list:
            resized_list.append(tf.image.resize_images(i, [dim_1, dim_2], method=0))
        stack_img = tf.stack(resized_list, axis=ax)

    return stack_img

resized_along_depth = resize_by_axis(x, 50, 60, 2, True)
resized_along_width = resize_by_axis(resized_along_depth, 50, 70, 1, True)

Here x is the 3-D tensor, either grayscale or RGB; resized_along_width is the final resized tensor. In this example the 3-D image is resized to dimensions of (50, 60, 70).
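
For the question's batched layout [batch_size, width, height, depth, 1], the same slice-and-resize idea applies directly, since slices taken along the depth axis are already 4-D NHWC images. Below is a minimal sketch, assuming TF 1.x, an integer scale factor, and nearest-neighbour interpolation (method=1); resize_volume is an illustrative name, not part of the answer above:

import tensorflow as tf

def resize_volume(vol, scale):
    # Nearest-neighbour upscale of a [batch, W, H, D, 1] tensor by an
    # integer factor, two axes at a time.
    b, w, h, d, c = vol.get_shape().as_list()
    # Pass 1: slices along depth are [b, w, h, 1] NHWC images.
    slices = tf.unstack(vol, axis=3)
    slices = [tf.image.resize_images(s, [w * scale, h * scale], method=1)
              for s in slices]
    vol = tf.stack(slices, axis=3)       # [b, w*scale, h*scale, d, 1]
    # Pass 2: slices along the (already resized) width are [b, h*scale, d, 1].
    slices = tf.unstack(vol, axis=1)
    slices = [tf.image.resize_images(s, [h * scale, d * scale], method=1)
              for s in slices]
    return tf.stack(slices, axis=1)      # [b, w*scale, h*scale, d*scale, 1]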



Answer 2:

A tensor like this is already effectively 4D: one dimension allocated to batch_size and the other three to width, height, and depth. If you are looking to process 3D images and have batches of them in the configuration

[batch_size, width, height, depth, 1]

then use the squeeze function to remove the unnecessary final dimension like so:

tf.squeeze(yourData, [4])

This will output a tensor of shape

[batch_size, width, height, depth]

which is what TensorFlow will use gracefully.

addition

If you have the dimensions handy and want to use the reshape capability of TensorFlow instead, you could do it like so:

reshapedData = tf.reshape(yourData, [batch_size, width, height, depth])

Personally, I'd use squeeze to declare to the next programmer that your code only intends to get rid of dimensions of size 1, whereas reshape could mean so much more and would leave the next dev having to figure out why you are reshaping.
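
To make the distinction concrete, a quick sketch (the shapes are made up for the example):

import tensorflow as tf

x = tf.zeros([4, 32, 32, 16, 1])    # hypothetical [batch_size, width, height, depth, 1]
a = tf.squeeze(x, [4])              # intent is explicit: drop only the size-1 dim
b = tf.reshape(x, [4, 32, 32, 16])  # same result, but the intent is hidden
print(a.shape, b.shape)             # (4, 32, 32, 16) (4, 32, 32, 16)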

update to include the changing 4th dimension

You would like to sometimes use the dimensions [batch_size, width, height, depth, 1] and sometimes [batch_size, width, height, depth, n].

No problem. It is the same solution, but now you can't use squeeze and instead are left with reshape, like so:

reshapedData = tf.reshape(yourData, [batch_size, width, height, depth*n])

How could this work? Let's imagine that depth is the number of image frames and n is the color depth (possibly 3 for RGB). The reshape will stack the color frames one after the other. Your TensorFlow graph no doubt has a convolution layer immediately after the input. The convolution layer will process your stack of color frames as easily as your monochrome frames (albeit with more computing power and parameters).
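
As a sketch of that idea (the sizes are hypothetical, and tf.layers.conv2d is the TF 1.x-era layer API):

import tensorflow as tf

batch_size, width, height, depth, n = 4, 32, 32, 16, 3

your_data = tf.zeros([batch_size, width, height, depth, n])
stacked = tf.reshape(your_data, [batch_size, width, height, depth * n])

# The conv layer consumes the stacked frames as ordinary input channels.
conv = tf.layers.conv2d(stacked, filters=8, kernel_size=3, padding="same")
print(conv.shape)  # (4, 32, 32, 8)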

and addition of scaling

Okay, here is how to scale the image: use tf.image.resize_images after reshaping, like so:

reshapedData = tf.image.resize_images(tf.reshape(yourData, [batch_size, width, height, depth * n]), new_size)

where new_size is a 1-D tensor of [new_height, new_width], or in your case [width * scale, height * scale]:

new_size = tf.constant( [ width * scale , height * scale ] )

and then back to the original

If, after all this resizing, you want the image to again be in the five-dimensional shape [batch_size, width, height, depth, n], then simply use this code:

tf.reshape(reshapedData, [batch_size, width*scale, height*scale, depth, n])
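
Putting the reshape, resize, and reshape-back steps together, a minimal end-to-end sketch (the sizes are hypothetical):

import tensorflow as tf

batch_size, width, height, depth, n, scale = 4, 32, 32, 16, 1, 2

your_data = tf.zeros([batch_size, width, height, depth, n])
new_size = tf.constant([width * scale, height * scale])

reshaped = tf.reshape(your_data, [batch_size, width, height, depth * n])
resized = tf.image.resize_images(reshaped, new_size)
restored = tf.reshape(resized,
                      [batch_size, width * scale, height * scale, depth, n])
print(restored.shape)  # (4, 64, 64, 16, 1)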

last addition, to address scaling the depth as well

Here is my solution:

We'll want to reshape this matrix and expand it, similar to how a 3-D matrix is expanded in numpy, like this:

import numpy as np

# Two identical rows of the values 1..27 (a 2x27 matrix).
a = np.array([np.arange(1, 28), np.arange(1, 28)])
print(a.reshape([2, 3, 3, 3]))

# Repeat every element 8 times, then interleave the copies so that each
# value becomes a 2x2x2 block.
b = (a.reshape([54, 1])
      .dot(np.ones([1, 8]))
      .reshape([2, 3, 3, 3, 2, 2, 2])
      .transpose([0, 1, 6, 2, 5, 3, 4])
      .reshape([2, 6, 6, 6]))
print(b)
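
As a sanity check, the same expansion can be written more directly with np.repeat, which produces identical values:

import numpy as np

a = np.array([np.arange(1, 28), np.arange(1, 28)])
# Repeating along each spatial axis yields the same nearest-neighbour
# 2x2x2 blocks as the dot/transpose trick above.
b = a.reshape([2, 3, 3, 3]).repeat(2, axis=1).repeat(2, axis=2).repeat(2, axis=3)
print(b.shape)  # (2, 6, 6, 6)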

Here is the TensorFlow code:

isolate = tf.transpose(yourdata, [0, 4, 1, 2, 3])  # [batch_size, n, width, height, depth]
flatten_it_all = tf.reshape(isolate, [batch_size * n * width * height * depth, 1])  # flatten it

expanded_it = flatten_it_all * tf.ones([1, 8])
prepare_for_transpose = tf.reshape(expanded_it, [batch_size * n, width, height, depth, 2, 2, 2])

transpose_to_align_neighbors = tf.transpose(prepare_for_transpose, [0, 1, 6, 2, 5, 3, 4])
expand_it_all = tf.reshape(transpose_to_align_neighbors, [batch_size, n, width * 2, height * 2, depth * 2])

#### - removing this section because the requirements changed
# do a conv layer here to 'blend' neighbor values like:
# averager = tf.ones([2, 2, 2, 1, 1]) * 1. / 8.
# tf.nn.conv3d(expand_it_all, averager, strides=[1, 1, 1, 1, 1], padding="SAME")
# for n = 1.  for n = 3, I'll leave it to you.

# then finally reorder and you are done
reorder_dimensions = tf.transpose(expand_it_all, [0, 2, 3, 4, 1])  # [batch_size, width*2, height*2, depth*2, n]
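
For what it's worth, if a Keras dependency is acceptable, the same nearest-neighbour 2x expansion is available as a ready-made layer (a sketch, using the question's layout with hypothetical sizes):

import tensorflow as tf

vol = tf.zeros([4, 32, 32, 16, 1])  # [batch_size, width, height, depth, n]
up = tf.keras.layers.UpSampling3D(size=2)(vol)
print(up.shape)  # (4, 64, 64, 32, 1)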