I am using Keras 2.0.4 (TensorFlow backend) for an image classification task.
I am trying to train my own network (without any pretrained parameters).
As my data set is huge, I cannot load all of it into memory.
For this reason I use ImageDataGenerator(), flow_from_directory() and fit_generator().
Creating the ImageDataGenerator object:
train_datagen = ImageDataGenerator(preprocessing_function = my_preprocessing_function) # only preprocessing; no augmentation; static data set
my_preprocessing_function rescales images to the range [0, 255] and centers the data by subtracting the mean (similar to the preprocessing used for VGG16 or VGG19).
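For illustration, a preprocessing function of the kind described above might look roughly like the sketch below. The per-channel means are the ImageNet RGB means commonly used with VGG and are an assumption here; the function body is not shown in the question, so this is a stand-in rather than the actual implementation.

```python
import numpy as np

# Hypothetical per-channel means (ImageNet RGB means often used with VGG).
# The values actually used in the question are not shown.
CHANNEL_MEANS = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def my_preprocessing_function(image):
    """Rescale one image of shape (height, width, 3) to [0, 255],
    then subtract the per-channel means."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi > lo:
        # Linear rescale so the smallest value maps to 0 and the largest to 255.
        image = (image - lo) / (hi - lo) * 255.0
    else:
        # Constant image: nothing to rescale, so map it to zeros (arbitrary choice).
        image = np.zeros_like(image)
    return image - CHANNEL_MEANS
```

The function takes and returns a rank-3 array of the same shape, which is what the preprocessing_function argument of ImageDataGenerator expects.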
Using the flow_from_directory() method of the ImageDataGenerator object:
train_generator = train_datagen.flow_from_directory(
'path/to/training/directory/with/five/subfolders',
target_size=(img_width, img_height),
batch_size=64,
classes = ['class1', 'class2', 'class3', 'class4', 'class5'],
shuffle = True,
seed = 1337,
class_mode='categorical')
(The same is done in order to create a validation_generator.)
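Since flow_from_directory() takes its labels from the subfolder names, one quick sanity check is to count the image files in each class folder. The helper below is a minimal sketch using only the standard library; count_images_per_class and the extension list are hypothetical names introduced for illustration.

```python
import os

# Assumed set of image file extensions; adjust to the actual data.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp')


def count_images_per_class(root, classes):
    """Return {class_name: number of image files under root/class_name}."""
    counts = {}
    for name in classes:
        folder = os.path.join(root, name)
        # os.walk yields nothing for a missing folder, so the count is 0 then.
        counts[name] = sum(
            1
            for _, _, files in os.walk(folder)
            for f in files
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

A count of zero for a class, or a strongly unbalanced distribution, would be worth ruling out before looking at the model itself.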
After defining and compiling the model (loss function: categorical crossentropy, optimizer: Adam), I train the model using fit_generator():
model.fit_generator(
train_generator,
steps_per_epoch=total_amount_of_train_samples/batch_size,
epochs=400,
validation_data=validation_generator,
validation_steps=total_amount_of_validation_samples/batch_size)
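The step counts in the call above come from a plain division, which is not a whole number unless the sample count is a multiple of the batch size. A minimal sketch of computing an integer number of steps that still covers every sample is shown below; steps_for is a hypothetical helper, and the sample counts in the usage lines are made up.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return int(math.ceil(num_samples / float(batch_size)))


# Made-up sample counts, for illustration only.
train_steps = steps_for(1000, 64)
validation_steps = steps_for(128, 64)
```

Rounding up means the final, smaller batch is included in each epoch; rounding down would skip it instead.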
Problem:
There is no error message, but training doesn't perform well.
After 400 epochs, accuracy still oscillates around 20% (which is as good as randomly choosing one of those classes). Indeed, the classifier always predicts 'class1'.
The same holds true after only one epoch of training. Why is this the case even though the weights are initialized randomly?
What is wrong? What am I missing?
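The two observations above (accuracy near 1/5 and a single predicted class) can be checked from the softmax outputs. The sketch below uses a made-up array of probabilities; in practice the array would come from the model's predictions on the validation data. summarize_predictions is a hypothetical helper.

```python
import numpy as np


def summarize_predictions(probabilities, num_classes):
    """Return the fraction of samples assigned to each class by argmax."""
    predicted = np.argmax(probabilities, axis=1)
    counts = np.bincount(predicted, minlength=num_classes)
    return counts / float(counts.sum())


num_classes = 5
chance_level = 1.0 / num_classes  # accuracy of uniform random guessing on balanced data

# Made-up softmax outputs in which the first class always wins.
probs = np.array([[0.6, 0.1, 0.1, 0.1, 0.1]] * 4)
fractions = summarize_predictions(probs, num_classes)
```

If one entry of fractions is 1.0 and the rest are 0.0, the classifier has collapsed onto a single class, which on balanced data gives exactly the chance-level accuracy.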
Used model:
x = Input(shape=input_shape)
# Block 1
x = Conv2D(16, (3, 3), activation='relu', padding='same', name='block1_conv1')(x)
x = Conv2D(16, (5, 5), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(64, (5, 5), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(16, (1, 1), activation='relu', padding='same', name='block3_conv1')(x)
# Block 4
x = Conv2D(256, (3, 3), activation='relu', padding='valid', name='block4_conv1')(x)
x = Conv2D(256, (5, 5), activation='relu', padding='valid', name='block4_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(1024, (3, 3), activation='relu', padding='valid', name='block5_conv1')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
# topping
x = Dense(1024, activation='relu', name='fc1')(x)
x = Dense(1024, activation='relu', name='fc2')(x)
predictions = Dense(5, activation='softmax', name='predictions')(x)
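To see what the dense layers receive, the spatial size can be traced through the convolution and pooling layers listed above. The sketch below is plain Python and assumes a hypothetical 224×224 input, since input_shape is not given in the question; it also assumes stride 1 for every convolution, which is the Keras default.

```python
def conv_out(size, kernel, padding):
    """Output size of a stride-1 convolution along one spatial axis."""
    return size if padding == 'same' else size - kernel + 1


def pool_out(size, pool=2, stride=2):
    """Output size of max pooling with 'valid' padding along one spatial axis."""
    return (size - pool) // stride + 1


# Layer sequence copied from the model listing (kernel size, padding).
LAYERS = [
    ('conv', 3, 'same'), ('conv', 5, 'same'), ('pool',),    # block 1
    ('conv', 3, 'same'), ('conv', 5, 'same'), ('pool',),    # block 2
    ('conv', 1, 'same'),                                    # block 3
    ('conv', 3, 'valid'), ('conv', 5, 'valid'), ('pool',),  # block 4
    ('conv', 3, 'valid'), ('pool',),                        # block 5
]


def trace_spatial_size(size):
    """Apply every layer in LAYERS to one spatial dimension."""
    for layer in LAYERS:
        if layer[0] == 'conv':
            size = conv_out(size, layer[1], layer[2])
        else:
            size = pool_out(size)
    return size


final_size = trace_spatial_size(224)  # 224 is an assumed input size
```

Under that assumption the output of block5_pool is still a feature map of shape (final_size, final_size, 1024), and the listing as posted shows no Flatten layer between block5_pool and fc1.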