Just getting started with convnets and trying out an image segmentation problem. I got my hands on 24 images and their masks from the DSTL Satellite Imagery Feature Detection competition. (https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data)
I thought I’d try to follow the tips in the Keras blog post on building powerful image classification models using very little data (https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html), but I’m stuck.
I downloaded the pre-trained weights for ZF_UNET_224, the 2nd-place winners’ approach to this problem. My image masks contain 5 object classes, so I popped the final layer. Instead of this:
activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]
spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]
conv2d_46 (Conv2D) (None, 224, 224, 1) 33 spatial_dropout2d_2[0][0]
I have this now:
activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]
spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]
predictions (Conv2D) (None, 224, 224, 5) 10 conv2d_46[0][0]
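For reference, this is roughly the layer swap I was aiming for (a sketch, not my exact code: base_model stands for the ZF_UNET_224 model with the downloaded weights, and the layer name comes from the summary above):

from keras.models import Model
from keras.layers import Conv2D

# take the 32-filter feature maps just before the old 1-channel output
x = base_model.get_layer('spatial_dropout2d_2').output
# 1x1 conv mapping 32 features to 5 per-pixel class scores; sigmoid because
# a pixel could belong to more than one of my 5 object classes
predictions = Conv2D(5, (1, 1), activation='sigmoid', name='predictions')(x)
my_model = Model(inputs=base_model.input, outputs=predictions)
my_model.compile(optimizer='adam', loss='binary_crossentropy')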
I’m trying to follow the exact steps from the Keras tutorial, but when I run

my_model.fit_generator(train_generator, steps_per_epoch=4, epochs=10, validation_data=validation_generator)
I get an error message saying
Output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ] … [1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ]]
I think what I want is probabilities for each pixel in my 224x224 image, so that I can use those to generate masks on the original image, but I’m not sure how to go about getting that.
I have 24 eight-band input images and their masks, which label 5 object classes. I want to train this U-Net on them, predict masks for some test images, and evaluate those predictions with IoU or weighted log loss. Any help?
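For the evaluation part, this is the kind of thing I have in mind once predictions work (test_batch and true_mask stand in for my own data, and the 0.5 threshold is just my guess):

import numpy as np

# test_batch has shape (1, 224, 224, 3); probs comes out as (224, 224, 5)
probs = my_model.predict(test_batch)[0]
pred_mask = (probs > 0.5).astype(np.uint8)  # threshold probabilities into binary masks

def iou(pred, true, eps=1e-7):
    # intersection-over-union for one binary mask
    inter = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return (inter + eps) / (union + eps)

per_class_iou = [iou(pred_mask[..., c], true_mask[..., c]) for c in range(5)]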
Update:
I'm using the same generator as in the Keras tutorial:
from keras.preprocessing.image import ImageDataGenerator

batch_size = 4

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
    'data/train',  # this is the target directory
    target_size=(224, 224),  # all images will be resized
    batch_size=batch_size,
    color_mode='rgb',
    class_mode=None)  # no labels are generated, only batches of images
# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
    'data/valid',
    target_size=(224, 224),
    batch_size=batch_size,
    color_mode='rgb',
    class_mode=None)
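Reading the ImageDataGenerator docs again, there is an example of transforming images and masks together with two synced generators, so I suspect what fit_generator actually wants from me is something like this (the data/train/images and data/train/masks layout is my own assumption):

seed = 1  # identical seed keeps image and mask augmentations in sync

image_generator = train_datagen.flow_from_directory(
    'data/train/images',
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode=None,
    seed=seed)

mask_generator = train_datagen.flow_from_directory(
    'data/train/masks',
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode=None,
    seed=seed)

# zip yields (image_batch, mask_batch) tuples, which is the (x, y)
# format the fit_generator error was complaining about
train_generator = zip(image_generator, mask_generator)

(I’d probably still need to re-binarize the masks afterwards, since shear/zoom interpolation would smear the 0/1 values.)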
One more thing: my training images have 8 bands, but the architecture only accepts 3, and I think the generator ends up keeping only 1 band. Not sure how to solve this problem either.
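If flow_from_directory is a dead end for 8-band imagery, maybe I should write my own generator instead. A sketch of what I mean, assuming the inputs are TIFFs readable with tifffile, the masks are saved as (224, 224, 5) .npy arrays, and the band selection and 11-bit scaling are my guesses:

import numpy as np
import tifffile

def paired_generator(image_paths, mask_paths, batch_size=4, bands=(4, 2, 1)):
    # hand-rolled (x, y) generator that sidesteps flow_from_directory
    while True:
        idx = np.random.choice(len(image_paths), batch_size, replace=False)
        xs, ys = [], []
        for i in idx:
            img = tifffile.imread(image_paths[i])
            if img.shape[0] == 8:              # tifffile may return (bands, H, W)
                img = np.moveaxis(img, 0, -1)  # convert to channels-last
            xs.append(img[..., list(bands)])   # keep 3 bands for the 3-channel net
            ys.append(np.load(mask_paths[i]))
        x = np.stack(xs).astype('float32') / 2047.0  # DSTL pixels are 11-bit, I believe
        yield x, np.stack(ys).astype('float32')

The other option I can think of is rebuilding the first conv layer to take 8 input channels, but then the pre-trained weights for that layer wouldn’t load.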