Transfer Learning From a U-Net for Image Segmentat

Just getting started with Conv Nets and trying out an image segmentation problem. I got my hands on 24 images and their masks for the dstl satellite image feature detection competition. (https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/data)

I thought I’d try to follow the tips here https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html but I’m stuck.

I downloaded the pre-trained weights for ZF_UNET_224, the 2nd place winners’ approach to this problem. My image masks contain 5 objects so I popped the final layer and instead of having this:

activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]

spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]

conv2d_46 (Conv2D) (None, 224, 224, 1) 33 spatial_dropout2d_2[0][0]

I have this now:

activation_45 (Activation) (None, 224, 224, 32) 0 batch_normalization_44[0][0]

spatial_dropout2d_2 (SpatialDro (None, 224, 224, 32) 0 activation_45[0][0]

predictions (Conv2D) (None, 224, 224, 5) 10 conv2d_46[0][0]

I’m trying to follow the exact steps from the Keras tutorial but when I do

my_model.fit_generator( train_generator, steps_per_epoch= 4, epochs=10, validation_data=validation_generator )

I get an error message saying

Output of generator should be a tuple (x, y, sample_weight) or (x, y). Found: [[[[1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ] … [1. 1. 1. ] [1. 1. 1. ] [1. 1. 1. ]]

I think what I want is probabilities for each of the pixels in my 224X224 image, so that I can use those to generate masks on the original image but I’m not sure how to go about getting that.

I have 24 8 band input images and their masks which label 5 objects. I want to train this U-Net on these images and place masks on some test images and evaluate them for IoU or weighted log loss. Any help?

Update:

I'm using the same generator as in the Keras tutorial:

   batch_size = 4

    # this is the augmentation configuration we will use for training 

train_datagen = ImageDataGenerator(
            rescale=1./255,
            shear_range=0.2,
            zoom_range=0.2,
            horizontal_flip=True)

    # this is the augmentation configuration we will use for testing:
    # only rescaling 
test_datagen = ImageDataGenerator(
            rescale=1./255,
            shear_range=0.2,
            zoom_range=0.2,
            horizontal_flip=True)

    # this is a generator that will read pictures found in
    # subfolers of 'data/train', and indefinitely generate
    # batches of augmented image data train_generator = 
train_datagen.flow_from_directory(
            'data/train',  # this is the target directory
            target_size=(224, 224),  # all images will be resized 
            batch_size=batch_size,
            color_mode='rgb', 
            class_mode=None)  # since we use binary_crossentropy loss, we need binary labels

    # this is a similar generator, for validation data 
 validation_generator = test_datagen.flow_from_directory(
            'data/valid',
            target_size=(224, 224),
            batch_size=batch_size,
            color_mode = 'rgb',
            class_mode=None)

One more thing: my training images have 8 bands but the architecture only accepts 3 bands. I think the generator only leaves 1 band at the end. Not sure how to solve this problem either.

On your error message:

With flow_from_directory(), your ImageDataGenerator infers the class labels from the structure of the directory that contains your images. As in the example, the images should be arranged in subfolder per class.

For your image segmentation problem, the label structure is more complex that just one label per image. The labels are masks with a label per pixel. In general, you want to provide these labels as np arrays to the model during training.

You will not be able to handle your case with flow_from_directory(). One solution would be to write your own custom generator that reads both the images and labels from disk and use that with fit_generator().

Suppose that you have a .csv file with two columns, one column with image names and one column with the path to the corresponding masks:

Then your generator could look something like this (I'm using pandas to read in the .csv file):

from keras.utils import Sequence
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from math import ceil 
import numpy as np
import pandas as pd


class DataSequence(Sequence):
    """
    Keras Sequence object to train a model on larger-than-memory data.
    df: pandas dataframe from the csv file
    data_path: path where your images live
    """

    def __init__(self, df, data_path, batch_size):
        self.batch_size = batch_size
        self.im_list = df['images'].tolist()
        self.mask_list = df['labels'].tolist()

    def __len__(self):
        """Make sure to handle cases where the last batch < batch_size
        return int(math.ceil(len(self.im_list) / float(self.batch_size)))

    def get_batch_images(self, idx, path_list):
        # Fetch a batch of images from a list of paths
        return np.array([load_image(im) for im in path_list[idx * self.batch_size: (1 + idx) * self.batch_size]])

    def __getitem__(self, idx):
        batch_x = self.get_batch_images(idx, self.im_list)
        batch_y = self.get_batch_labels(idx, self.mask_list)
        return batch_x, batch_y

I'm making use of the Keras Sequence object here to write the generator, because this allows for safe multiprocessing which will speed up training. See the docs on this subject.

On your actual question about transfer learning:

You will not be able to use an architecture that was pretrained for 3-channel images on 8-channel images just like that. If you want to use the architecture, you could subsample channels, or perform a dimension reduction from 8 to 3 channels. See also this thread.