Pre-training Keras Xception and InceptionV3 models

Posted 2019-07-29 04:19

Question:

I'm trying to solve a simple binary classification problem using Keras and its built-in CNN architectures pre-trained on ImageNet.

For VGG16, I took the following approach:

import keras
from keras.models import Sequential
from keras.layers import Dense

vgg16_model = keras.applications.vgg16.VGG16()

# Rebuild VGG16 layer by layer inside an empty Sequential model
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)

# The problem is binary, so drop the 1000-class output layer
model.pop()

# Freeze the remaining pre-trained weights
for layer in model.layers:
    layer.trainable = False

# Add the new final layer
model.add(Dense(2, activation='softmax'))

This worked marvelously, with higher accuracy than my custom-built CNN. But it took a while to train, so I wanted to take a similar approach with Xception and InceptionV3, since they are lighter models with higher accuracy.

xception_model = keras.applications.xception.Xception()
model_xception = Sequential()
for layer in xception_model.layers:
    model_xception.add(layer)

When I run the above code, I get the following error:

ValueError: Input 0 is incompatible with layer conv2d_193: expected axis -1 of input shape to have value 64 but got shape (None, None, None, 128)

Basically, I would like to do the same thing I did with the VGG16 model: keep the pre-trained weights as they are and replace the 1000-class output layer with a binary classification output. Unlike VGG16, which has a relatively straightforward stack of convolution layers, Xception and InceptionV3 contain some node structures I'm not fully familiar with, and I assume those are causing the issue. If anyone can help sort out the problem, it'd be much appreciated!

Thanks!

Answer 1:

Your code fails because InceptionV3 and Xception are not Sequential models (i.e., they contain "branches"). So you can't just add the layers into a Sequential container.
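
To make the failure concrete, here is a toy sketch in plain Python (no Keras involved). ToyLayer, run_sequentially, and the layer names are all hypothetical, and the channel counts are only chosen to resemble the error message in the question. It assumes the offending layer is a shortcut convolution that was built on an earlier 64-channel tensor, which seems likely for Xception but is not confirmed by the traceback.

```python
from collections import namedtuple

# Hypothetical stand-in for a layer: the channel count it was built to
# accept and the channel count it emits.
ToyLayer = namedtuple("ToyLayer", ["name", "in_channels", "out_channels"])


def run_sequentially(layers, channels):
    """Feed each layer the output of the layer listed just before it,
    which is what stacking layers in a Sequential container does."""
    for layer in layers:
        if channels != layer.in_channels:
            raise ValueError(
                f"{layer.name}: expected {layer.in_channels} channels "
                f"but got {channels}"
            )
        channels = layer.out_channels
    return channels


# VGG-like chain: every layer consumes the previous layer's output.
chain = [ToyLayer("conv_a", 3, 64), ToyLayer("conv_b", 64, 128)]

# Branching graph flattened into a list: both main_conv and shortcut_conv
# were wired to the 64-channel output of stem, not to each other.
branching = [
    ToyLayer("stem", 3, 64),
    ToyLayer("main_conv", 64, 128),
    ToyLayer("shortcut_conv", 64, 128),
]
```

Calling run_sequentially(chain, 3) passes, while run_sequentially(branching, 3) raises a ValueError because shortcut_conv is handed the 128-channel output of main_conv instead of the 64-channel output of stem. A flat list of layers loses the information about which tensor each layer was connected to.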

The top of both InceptionV3 and Xception consists of a GlobalAveragePooling2D layer followed by the final Dense(1000) layer:

if include_top:
    x = GlobalAveragePooling2D(name='avg_pool')(x)
    x = Dense(classes, activation='softmax', name='predictions')(x)

So to remove only the final Dense layer, set include_top=False and pooling='avg' when creating the model. This keeps the global average pooling and gives you a flat feature vector to attach a new output layer to.

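from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense
from keras.models import Model

# Everything up to and including global average pooling, without Dense(1000)
base_model = InceptionV3(include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False  # freeze the pre-trained weights
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)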
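
The same pattern should carry over to Xception, which the question also asks about. The following is a minimal sketch, not a tested recipe: it assumes the tensorflow.keras import path (the thread itself uses standalone keras), and build_binary_classifier and the layer name binary_predictions are made up for illustration.

```python
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model


def build_binary_classifier(weights="imagenet", input_shape=(299, 299, 3)):
    """Frozen Xception feature extractor with a new 2-way softmax head.

    Hypothetical helper; pass weights=None to skip the ImageNet download.
    """
    # include_top=False drops Dense(1000); pooling='avg' keeps the
    # global average pooling so the output is a flat feature vector.
    base_model = Xception(
        include_top=False,
        pooling="avg",
        weights=weights,
        input_shape=input_shape,
    )
    # Freeze every pre-trained layer, as in the InceptionV3 example.
    for layer in base_model.layers:
        layer.trainable = False
    output = Dense(2, activation="softmax", name="binary_predictions")(
        base_model.output
    )
    return Model(base_model.input, output)
```

With a two-unit softmax output, the labels need to be one-hot encoded and the model compiled with a loss such as categorical_crossentropy. After freezing, only the kernel and bias of the new Dense layer remain trainable.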