I'm able to train a U-net with labeled images that have a binary classification.
But I'm having a hard time figuring out how to configure the final layers in Keras/Theano for multi-class classification (4 classes).
I have 634 images and 634 corresponding masks, all uint8 and 64 x 64 pixels.
My masks, instead of being black (0) and white (1), have color-labeled objects in 3 categories plus background as follows:
- black (0), background
- red (1), object class 1
- green (2), object class 2
- yellow (3), object class 3
Before training runs, the array containing the masks is one-hot encoded as follows:
mask_train = to_categorical(mask_train, 4)
This makes mask_train.shape go from (634, 1, 64, 64) to (2596864, 4).
My model closely follows the U-net architecture; however, the final layers seem problematic, as I'm unable to flatten the structure to match the one-hot encoded array.
[...]
up3 = concatenate([UpSampling2D(size=(2, 2))(conv7), conv2], axis=1)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(up3)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv8)
up4 = concatenate([UpSampling2D(size=(2, 2))(conv8), conv1], axis=1)
conv9 = Conv2D(64, (3, 3), activation='relu', padding='same')(up4)
conv10 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv9)
# here I used number classes = number of filters and softmax although
# not sure if a dense layer should be here instead
conv11 = Conv2D(4, (1, 1), activation='softmax')(conv10)
model = Model(inputs=[inputs], outputs=[conv11])
# here categorical cross entropy is being used but may not be correct
model.compile(optimizer='sgd', loss='categorical_crossentropy',
              metrics=['accuracy'])
return model
Do you have any suggestions on how to modify the final portions of the model so this trains successfully? I get a variety of shape mismatch errors, and the few times I managed to make it run, the loss did not change throughout epochs.
Bit late, but you should try one-hot encoding the masks so that the classes end up on their own channel axis.
That will result in (634, 4, 64, 64) for mask_train.shape and a binary mask for each individual class (one-hot encoded). The last conv layer, activation and loss look good for multiclass segmentation.
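As a minimal sketch of one way to get that shape, assuming NumPy and integer labels 0 to 3 in the masks: indexing an identity matrix with the label array yields the one-hot vectors, and the class axis is then moved to position 1. The helper name `one_hot_channels_first` and the random stand-in masks are hypothetical, not part of the original code.

```python
import numpy as np


def one_hot_channels_first(masks, num_classes):
    """Convert integer masks of shape (N, 1, H, W) to one-hot masks of shape (N, C, H, W)."""
    labels = masks[:, 0].astype(np.int64)                   # (N, H, W)
    one_hot = np.eye(num_classes, dtype=np.uint8)[labels]   # (N, H, W, C)
    return np.moveaxis(one_hot, -1, 1)                      # (N, C, H, W)


# Random stand-in for the real masks (assumption: labels are 0, 1, 2, 3).
rng = np.random.default_rng(0)
mask_train = rng.integers(0, 4, size=(634, 1, 64, 64), dtype=np.uint8)

encoded = one_hot_channels_first(mask_train, 4)
print(encoded.shape)
```

Each of the four channels of `encoded` is then a binary mask for one class, and exactly one channel is 1 at every pixel.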
You should have your target as (634, 4, 64, 64) if you're using channels_first, or (634, 64, 64, 4) if channels_last. Each channel of your target should be one class. Each channel is an image of 0's and 1's, where 1 means that pixel is that class and 0 means that pixel is not that class.
Then, your target is 634 groups, each group containing four images, each image having 64 x 64 pixels, where pixels with value 1 indicate the presence of the desired feature.
I'm not sure the result will be ordered correctly, but you can try:
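A minimal sketch of that idea, assuming the one-hot array is already flattened to one row per pixel, as in the (2596864, 4) shape reported in the question. Because the classes sit on the last axis, the array is reshaped to channels_last first and the class axis is moved afterwards. The flattened array is emulated here with NumPy rather than produced by Keras, and the random masks are a stand-in for the real data.

```python
import numpy as np

num_samples, height, width, num_classes = 634, 64, 64, 4

# Random stand-in for the real masks (assumption: labels are 0, 1, 2, 3).
rng = np.random.default_rng(0)
mask_train = rng.integers(0, num_classes,
                          size=(num_samples, 1, height, width), dtype=np.uint8)

# Stand-in for the flattened result of to_categorical(mask_train, 4)
# described in the question: one one-hot row per pixel.
flat = np.eye(num_classes, dtype=np.float32)[mask_train.ravel()]
print(flat.shape)

# The classes are on the last axis, so restore the image layout as channels_last.
target = flat.reshape((num_samples, height, width, num_classes))

# For channels_first, move the class axis to position 1.
target = np.moveaxis(target, -1, 1)
print(target.shape)
```

Comparing `target.argmax(axis=1)` with `mask_train[:, 0]` is a quick way to check that the pixels ended up in the right order.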
If the ordering doesn't work properly, you can do it manually:
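A minimal sketch of the manual approach, assuming masks of shape (N, 1, H, W) with integer labels and a channels_first target. The function name `one_hot_manual` and the tiny example mask are hypothetical and only serve to illustrate the loop.

```python
import numpy as np


def one_hot_manual(masks, num_classes):
    """Build a (N, C, H, W) one-hot target from (N, 1, H, W) integer masks, pixel by pixel."""
    n, _, h, w = masks.shape
    new_mask = np.zeros((n, num_classes, h, w), dtype=np.uint8)
    for sample in range(n):
        image = masks[sample, 0]
        for row in range(h):
            for col in range(w):
                # Set the channel that corresponds to this pixel's class label.
                new_mask[sample, int(image[row, col]), row, col] = 1
    return new_mask


# Tiny example mask containing one pixel of each class.
tiny_mask = np.array([[[[0, 1],
                        [2, 3]]]], dtype=np.uint8)
tiny_target = one_hot_manual(tiny_mask, 4)
print(tiny_target.shape)
```

The explicit loops make the ordering obvious, but they are slow in pure Python; for the full 634 masks a vectorized version would be preferable once the ordering has been verified.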