I am trying to optimize the hyperparameters of my neural network using Keras and sklearn. I am wrapping the model with KerasClassifier (it's a classification problem), and I want to optimize the number of hidden layers. I can't figure out how to do this with Keras; specifically, I am wondering how to set up the create_model function so that the number of hidden layers can be tuned. Could anyone please help me?
My code (just the important part):
# Import `Sequential` from `keras.models`
from keras.models import Sequential
# Import `Dense` from `keras.layers`
from keras.layers import Dense
# Import the scikit-learn wrapper and the randomized search
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV

def create_model(optimizer='adam', activation='sigmoid'):
    # Initialize the constructor
    model = Sequential()
    # Add an input layer
    model.add(Dense(5, activation=activation, input_shape=(5,)))
    # Add one hidden layer
    model.add(Dense(8, activation=activation))
    # Add an output layer
    model.add(Dense(1, activation=activation))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model
my_classifier = KerasClassifier(build_fn=create_model, verbose=0)
# Create hyperparameter space
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
activation1 = ['relu', 'sigmoid']
# Create hyperparameter options (must be defined before the search uses them)
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
                       batch_size=batches, activation=activation1)
# Create grid search
grid = RandomizedSearchCV(estimator=my_classifier,
                          param_distributions=hyperparameters)  # pass param_distributions
# Fit grid search
grid_result = grid.fit(X_train, y_train)
# View hyperparameters of best neural network
grid_result.best_params_
If you want to make the number of hidden layers a hyperparameter, you have to add it as a parameter to the build_fn of your KerasClassifier. Then you will be able to optimize the number of hidden layers by adding it to the dictionary that is passed to RandomizedSearchCV's param_distributions.

One more thing: you should probably separate the activation you use for the output layer from the one used for the other layers. Different classes of activation functions are suitable for hidden layers and for output layers used in binary classification.
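
A minimal sketch of both suggestions, reusing the layer sizes from the question. The parameter name n_hidden and its candidate values [1, 2, 3] are assumptions made for illustration, and the output layer is fixed to sigmoid as one reasonable choice for binary classification:

```python
def create_model(n_hidden=1, optimizer='adam', activation='relu'):
    """Build a binary classifier with `n_hidden` hidden layers (hypothetical parameter)."""
    # Imported inside the function so this sketch can be loaded without Keras
    # installed; moving the imports to the top of the file works just as well.
    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    # First layer, declaring the 5 input features as in the question
    model.add(Dense(5, activation=activation, input_shape=(5,)))
    # Add `n_hidden` hidden layers of 8 units each
    for _ in range(n_hidden):
        model.add(Dense(8, activation=activation))
    # Output layer: activation kept separate from the tuned `activation`
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model


# Search space: n_hidden is now sampled alongside the other hyperparameters.
hyperparameters = dict(
    n_hidden=[1, 2, 3],             # assumed candidate depths
    optimizer=['rmsprop', 'adam'],
    activation=['relu', 'sigmoid'],
    epochs=[5, 10],
    batch_size=[5, 10, 100],
)
```

This dictionary can then be passed as param_distributions to RandomizedSearchCV exactly as in the question, with KerasClassifier(build_fn=create_model, verbose=0) as the estimator. After fitting, the selected depth should appear under the n_hidden key of best_params_.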