converting VGG functional model into sequential model

Published 2020-08-01 13:08

Question:

I am trying to get a Sequential version of the VGG16 model with Keras. The functional version can be obtained with:

from __future__ import division, print_function

import os, json
from glob import glob
import numpy as np
from scipy import misc, ndimage
from scipy.ndimage.interpolation import zoom

from keras import backend as K
from keras.layers.normalization import BatchNormalization
from keras.utils.data_utils import get_file
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers.pooling import GlobalAveragePooling2D
from keras.optimizers import SGD, RMSprop, Adam
from keras.preprocessing import image
import keras   
import keras.applications.vgg16
from  keras.layers import Input

input_tensor = Input(shape=(224,224,3))
VGG_model=keras.applications.vgg16.VGG16(weights='imagenet',include_top= True,input_tensor=input_tensor)

Its summary looks like this:

VGG_model.summary()

Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_1[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 112, 112, 128) 73856       block1_pool[0][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 112, 112, 128) 147584      block2_conv1[0][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 56, 56, 128)   0           block2_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 56, 56, 256)   295168      block2_pool[0][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv1[0][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv2[0][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 28, 28, 256)   0           block3_conv3[0][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 28, 28, 512)   1180160     block3_pool[0][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv1[0][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv2[0][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 14, 14, 512)   0           block4_conv3[0][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 14, 14, 512)   2359808     block4_pool[0][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv1[0][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv2[0][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 7, 7, 512)     0           block5_conv3[0][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
____________________________________________________________________________________________________
predictions (Dense)              (None, 1000)          4097000     fc2[0][0]                        
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________

According to this GitHub issue (https://github.com/fchollet/keras/issues/3190),

Sequential(layers=functional_model.layers)

should convert a functional model into a Sequential model. However, if I do:

model = Sequential(layers=VGG_model.layers)
model.summary()

It leads to

Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_1[0][0]                    
                                                                   input_1[0][0]                    
                                                                   input_1[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
                                                                   block1_conv1[1][0]               
                                                                   block1_conv1[2][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
                                                                   block1_conv2[1][0]               
                                                                   block1_conv2[2][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 112, 112, 128) 73856       block1_pool[0][0]                
                                                                   block1_pool[1][0]                
                                                                   block1_pool[2][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 112, 112, 128) 147584      block2_conv1[0][0]               
                                                                   block2_conv1[1][0]               
                                                                   block2_conv1[2][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 56, 56, 128)   0           block2_conv2[0][0]               
                                                                   block2_conv2[1][0]               
                                                                   block2_conv2[2][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 56, 56, 256)   295168      block2_pool[0][0]                
                                                                   block2_pool[1][0]                
                                                                   block2_pool[2][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv1[0][0]               
                                                                   block3_conv1[1][0]               
                                                                   block3_conv1[2][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv2[0][0]               
                                                                   block3_conv2[1][0]               
                                                                   block3_conv2[2][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 28, 28, 256)   0           block3_conv3[0][0]               
                                                                   block3_conv3[1][0]               
                                                                   block3_conv3[2][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 28, 28, 512)   1180160     block3_pool[0][0]                
                                                                   block3_pool[1][0]                
                                                                   block3_pool[2][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv1[0][0]               
                                                                   block4_conv1[1][0]               
                                                                   block4_conv1[2][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv2[0][0]               
                                                                   block4_conv2[1][0]               
                                                                   block4_conv2[2][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 14, 14, 512)   0           block4_conv3[0][0]               
                                                                   block4_conv3[1][0]               
                                                                   block4_conv3[2][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 14, 14, 512)   2359808     block4_pool[0][0]                
                                                                   block4_pool[1][0]                
                                                                   block4_pool[2][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv1[0][0]               
                                                                   block5_conv1[1][0]               
                                                                   block5_conv1[2][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv2[0][0]               
                                                                   block5_conv2[1][0]               
                                                                   block5_conv2[2][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 7, 7, 512)     0           block5_conv3[0][0]               
                                                                   block5_conv3[1][0]               
                                                                   block5_conv3[2][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
                                                                   block5_pool[1][0]                
                                                                   block5_pool[2][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
                                                                   flatten[1][0]                    
                                                                   flatten[2][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
                                                                   fc1[1][0]                        
                                                                   fc1[2][0]                        
____________________________________________________________________________________________________
predictions (Dense)              (None, 1000)          4097000     fc2[0][0]                        
                                                                   fc2[1][0]                        
                                                                   fc2[2][0]                        
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________

This is different from the original functional model, since each layer now appears to be connected to the previous layer three times. People say functional models are more powerful, but all I want to do is pop the final prediction layer, and a functional model cannot do this...

Answer 1:

You can "pop" the final layer by just definining another Model taking the previous layer as output:

from keras.models import Model

poppedModel = Model(VGG_model.input, VGG_model.layers[-2].output)

This model will share exactly the same weights as the original model, and training will affect both models.
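As a quick sanity check (just a sketch, assuming poppedModel was built as above), you can verify that both models reference the very same layer objects:

import numpy as np

#The layers are shared objects, so the weights are shared rather than copied
print(poppedModel.get_layer('block1_conv1') is VGG_model.get_layer('block1_conv1'))  # True

w_popped = poppedModel.get_layer('block1_conv1').get_weights()[0]
w_original = VGG_model.get_layer('block1_conv1').get_weights()[0]
print(np.array_equal(w_popped, w_original))  # True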

You can add your own layers (or even models) after poppedModel, no problem:

popOut = poppedModel(input_tensor)
newLayOut = SomeKerasLayer(blablabla)(popOut)

anotherModel = Model(input_tensor, newLayOut)
#anotherModel will also share weights with poppedModel and VGG_model in the layers they have in common.
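For instance, a concrete sketch with a hypothetical 10-class Dense head in place of SomeKerasLayer:

from keras.layers import Dense

popOut = poppedModel(input_tensor)
newLayOut = Dense(10, activation='softmax', name='new_predictions')(popOut)  #hypothetical new head

anotherModel = Model(input_tensor, newLayOut)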

It's important, though, if you intend to train the new layers in anotherModel without affecting the VGG weights, that you set poppedModel.trainable = False and mark each layer in it with poppedModel.layers[i].trainable = False before compiling anotherModel.
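A minimal sketch of that freezing step (assuming anotherModel from the snippet above; the optimizer and loss here are just placeholder choices):

poppedModel.trainable = False
for layer in poppedModel.layers:
    layer.trainable = False

anotherModel.compile(optimizer='adam',
                     loss='categorical_crossentropy',
                     metrics=['accuracy'])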



Answer 2:

I have also been struggling with this, and the previous poster was almost there but left out a particular detail that was stumping me. Indeed, you can do the "pop" even with a Model created with the functional API, but it takes a little more work.

Here's my model (just plain vanilla VGG16):

model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_6 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_6[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 112, 112, 128) 73856       block1_pool[0][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 112, 112, 128) 147584      block2_conv1[0][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 56, 56, 128)   0           block2_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 56, 56, 256)   295168      block2_pool[0][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv1[0][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv2[0][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 28, 28, 256)   0           block3_conv3[0][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 28, 28, 512)   1180160     block3_pool[0][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv1[0][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv2[0][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 14, 14, 512)   0           block4_conv3[0][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 14, 14, 512)   2359808     block4_pool[0][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv1[0][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv2[0][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 7, 7, 512)     0           block5_conv3[0][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
____________________________________________________________________________________________________
predictions (Dense)              (None, 1000)          4097000     fc2[0][0]                        
====================================================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
____________________________________________________________________________________________________

Then I "popped" the last layer but not using pop, just with the Functional API

#Imports needed by this snippet
from keras.models import Model
from keras.layers import Dense
from keras.optimizers import Adam

#Get the output tensor of the last-but-one layer from the old model
last_layer = model.layers[-2].output

#Define the new output layer/tensor for the new model
new_output = Dense(2, activation='softmax', name='Binary_predictions')(last_layer)

#Create the new model, with the old model's input and the new output tensor
new_model = Model(model.input, new_output, name='Finetuned_VGG16')

#Set all layers except the last one to not trainable
for layer in new_model.layers[:-1]:
    layer.trainable = False

#Compile the new model (learning_rate must be defined beforehand, e.g. 0.001)
new_model.compile(optimizer=Adam(lr=learning_rate),
                  loss='categorical_crossentropy', metrics=['accuracy'])

#now train with the new outputs (cats and dogs!)
This creates a new model (new_model) with the last layer replaced and the old layers frozen (made non-trainable).

new_model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_6 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_6[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 112, 112, 128) 73856       block1_pool[0][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 112, 112, 128) 147584      block2_conv1[0][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 56, 56, 128)   0           block2_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 56, 56, 256)   295168      block2_pool[0][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv1[0][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv2[0][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 28, 28, 256)   0           block3_conv3[0][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 28, 28, 512)   1180160     block3_pool[0][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv1[0][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv2[0][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 14, 14, 512)   0           block4_conv3[0][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 14, 14, 512)   2359808     block4_pool[0][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv1[0][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv2[0][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 7, 7, 512)     0           block5_conv3[0][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
____________________________________________________________________________________________________
Binary_predictions (Dense)       (None, 2)             8194        fc2[0][0]                        
====================================================================================================
Total params: 134,268,738
Trainable params: 8,194
Non-trainable params: 134,260,544

The tricky part was taking .output of the last-but-one layer, since that gives you a Tensor. You then use that Tensor as the input for the new Dense layer and make THAT the final output in the new Model...
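If it helps, here is a minimal training sketch under the same setup (the 'data/train' directory and the generator settings are hypothetical):

from keras.preprocessing.image import ImageDataGenerator
from keras.applications.vgg16 import preprocess_input

#Generator that applies the standard VGG16 preprocessing; 'data/train' is a hypothetical path
train_gen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_batches = train_gen.flow_from_directory('data/train',
                                              target_size=(224, 224),
                                              class_mode='categorical',
                                              batch_size=32)

new_model.fit_generator(train_batches,
                        steps_per_epoch=len(train_batches),
                        epochs=3)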

Hope that helps...

Thon