Question:
I am using Torch with some semantic segmentation algorithms to produce a binary mask of the segmented images. I would then like to crop the images based on that mask. To be clear, I need to crop on a per-pixel basis. It seems like a simple problem, but the only solutions I can come up with are to either invert a draw-mask function like the one in the COCO API, or iterate over each pixel in the image array and the mask together, setting the pixel to black if it is not needed. I feel like there is a better way of doing this. Libraries in Lua, Python, Go, or C++ will work for me. Any ideas?
Answer 1:
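I've implemented this in Python, assuming that you have your input image and mask available as NumPy arrays (what OpenCV's Python bindings use in place of Mat objects). Given that src1 is your image and src1_mask is your single-channel binary mask with values 0 and 255:
import cv2
src1_mask = cv2.cvtColor(src1_mask, cv2.COLOR_GRAY2BGR)  # change mask to a 3-channel image
mask_out = cv2.subtract(src1_mask, src1)  # saturating: 255 - src1 inside the mask, 0 outside
mask_out = cv2.subtract(src1_mask, mask_out)  # src1 inside the mask, 0 (black) outside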
Now mask_out contains the part of the image src1 located inside the binary mask you defined.
Answer 2:
Here is a solution relying only on numpy:
import numpy as np

def get_segment_crop(img, tol=0, mask=None):
    if mask is None:
        mask = img > tol
    # keep only the rows and columns that contain at least one True mask entry
    return img[np.ix_(mask.any(1), mask.any(0))]
Now call get_segment_crop(rgb, mask=segment_mask), where rgb is an ndarray of shape (h, w, c) and segment_mask is a boolean ndarray (i.e. containing True/False entries) of shape (h, w), with h = height and w = width. The function only requires that the first two dimensions of the two arrays match.
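To make the behaviour of that function concrete, here is a small sketch on made-up data. The 6x8 image and the rectangular mask are assumptions chosen only for illustration; the function itself is the one from the answer.

```python
import numpy as np

def get_segment_crop(img, tol=0, mask=None):
    if mask is None:
        mask = img > tol
    return img[np.ix_(mask.any(1), mask.any(0))]

# Hypothetical 6x8 RGB image with distinct values (144 entries fit in uint8)
rgb = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

# Hypothetical mask: rows 1..3 and columns 2..6 belong to the segment
segment_mask = np.zeros((6, 8), dtype=bool)
segment_mask[1:4, 2:7] = True

crop = get_segment_crop(rgb, mask=segment_mask)
print(crop.shape)  # (3, 5, 3)
```

Note that this keeps every pixel in the selected rows and columns, including pixels inside that box that are not part of the mask, and it drops any row or column that contains no True entry at all.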
Answer 3:
For anyone else running into this: I had good luck converting the torch binary mask tensor to type Double and then simply multiplying it, using torch's cmul function, against each of the RGB channels. Because the binary mask has a 1 wherever a pixel belongs to the segment, that pixel's value is left unchanged, whereas outside the segmentation the mask has a 0, which when multiplied across the channels produces black. Saransh's answer is also good, and works well for OpenCV.
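The same multiply-by-mask idea can be sketched in NumPy. This is a stand-in for illustration, not torch's cmul: the helper name apply_mask, the (h, w, c) layout, and the sample arrays are all assumptions.

```python
import numpy as np

def apply_mask(img, mask):
    """Black out pixels where mask is 0 (hypothetical helper).

    img is (h, w, c); mask is (h, w) holding only 0s and 1s.
    """
    # The trailing axis lets the mask broadcast across every channel
    return img * mask[:, :, np.newaxis].astype(img.dtype)

img = np.full((4, 4, 3), 200, dtype=np.uint8)  # made-up uniform image
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1                             # made-up 2x2 segment

masked = apply_mask(img, mask)
print(masked[1, 1], masked[0, 0])
```

Because the mask holds only 0 and 1, the product cannot overflow the image's dtype, so no conversion to a floating-point type is needed in this version.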
Answer 4:
Use OpenCV .copyTo with the mask option
http://docs.opencv.org/2.4/modules/core/doc/basic_structures.html#mat-copyto
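For readers without OpenCV at hand, the masked-copy behaviour described in the linked documentation (only elements where the mask is nonzero are copied) can be approximated in NumPy. This is a rough sketch, not OpenCV's implementation; copy_with_mask and the sample arrays are hypothetical.

```python
import numpy as np

def copy_with_mask(src, mask, dst=None):
    """Copy src pixels into dst wherever mask is nonzero (hypothetical helper)."""
    # Start from a black canvas unless the caller supplies a background
    out = np.zeros_like(src) if dst is None else dst.copy()
    keep = mask != 0       # boolean (h, w) selector
    out[keep] = src[keep]  # copies all channels of the selected pixels
    return out

src = np.full((4, 4, 3), 50, dtype=np.uint8)  # made-up image
mask = np.zeros((4, 4), dtype=np.uint8)
mask[0:2, 0:2] = 255                          # made-up segment
white = np.full_like(src, 255)                # optional background

on_black = copy_with_mask(src, mask)
on_white = copy_with_mask(src, mask, dst=white)
```

Passing a background makes it possible to place the segment on something other than black, which the multiplication approach cannot do directly.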
Answer 5:
You can use the boundingRect function from OpenCV to retrieve the rectangle of interest, and then crop the image to that rectangle. A Python implementation would look something like this:
import numpy as np
import cv2
mask = np.zeros([600,600], dtype=np.uint8)
mask[200:500,200:500] = 255 # set some values to 255 to represent an actual mask
rect = cv2.boundingRect(mask) # function that computes the rectangle of interest
print(rect)
img = np.ones([600,600, 3], dtype=np.uint8) # arbitrary image
x, y, w, h = rect # boundingRect returns (x, y, width, height)
cropped_img = img[y:y+h, x:x+w] # crop to the rectangle: rows are indexed by y, columns by x
Substitute mask and img with your own.
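The rectangle is reported as (x, y, width, height), while NumPy indexes rows (y) first, so the order matters for any non-square region. Here is a NumPy-only sketch that makes this visible; bounding_rect is a hypothetical helper, not OpenCV's function, and it assumes the mask has at least one nonzero pixel.

```python
import numpy as np

def bounding_rect(mask):
    """Return (x, y, w, h) of the nonzero region (hypothetical helper)."""
    ys, xs = np.nonzero(mask)  # row indices, then column indices
    x, y = int(xs.min()), int(ys.min())
    return x, y, int(xs.max()) - x + 1, int(ys.max()) - y + 1

mask = np.zeros((600, 600), dtype=np.uint8)
mask[100:300, 200:500] = 255                 # made-up segment: 200 rows by 300 columns

x, y, w, h = bounding_rect(mask)
img = np.ones((600, 600, 3), dtype=np.uint8)  # arbitrary image
cropped = img[y:y + h, x:x + w]               # rows by y, columns by x
print((x, y, w, h), cropped.shape)
```

To get a per-pixel result rather than a rectangle, the cropped image can then be combined with mask[y:y + h, x:x + w] using any of the masking approaches from the other answers.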