I'm setting up a learning rate scheduler in Keras, using the history loss to update self.model.optimizer.lr, but the value assigned to self.model.optimizer.lr never reaches the SGD optimizer, which keeps using the default learning rate. The code is:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import StandardScaler

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []
        self.model.optimizer.lr = 3

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
        # decrease the rate in proportion to the latest batch loss
        self.model.optimizer.lr = self.model.optimizer.lr - 10000 * self.losses[-1]

def base_model():
    model = Sequential()
    model.add(Dense(4, input_dim=2, init='uniform'))
    model.add(Dense(1, init='uniform'))
    sgd = SGD(decay=2e-5, momentum=0.9, nesterov=True)
    model.compile(loss='mean_squared_error', optimizer=sgd, metrics=['mean_absolute_error'])
    return model

# X_train, y_train and X_test are prepared elsewhere
history = LossHistory()
estimator = KerasRegressor(build_fn=base_model, nb_epoch=10, batch_size=16, verbose=2, callbacks=[history])
estimator.fit(X_train, y_train, callbacks=[history])
res = estimator.predict(X_test)
Everything works fine when using Keras as a regressor for continuous variables, but I want to reach a smaller derivative by updating the optimizer's learning rate.
Thanks, I found an alternative solution, as I'm not using a GPU:
The output is (lr is the learning rate):
And this is what happens to the learning rate over the epochs:
The learning rate is a variable on the computing device, e.g. the GPU if you are using GPU computation. That means that you have to use K.set_value, with K being keras.backend, to change it. For example:
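In the callback from the question, that could look roughly like this (a minimal sketch; the starting rate of 0.1 and the per-batch update rule are illustrative assumptions, not from the original answer):

import keras
from keras import backend as K

class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []
        # write the initial rate into the backend variable
        K.set_value(self.model.optimizer.lr, 0.1)

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
        # read the current rate, compute a new one, and write it back
        old_lr = K.get_value(self.model.optimizer.lr)
        new_lr = max(old_lr - 0.0001 * self.losses[-1], 1e-6)
        K.set_value(self.model.optimizer.lr, new_lr)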
In the new Keras API you can use a more general version of the schedule function, which takes two arguments, epoch and lr. According to the docs and the source, schedule receives the epoch index and the current learning rate and returns the new learning rate. So your function could be:
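A sketch under those assumptions (the epoch threshold of 3 and the 0.95 decay factor are arbitrary illustrative values, and this assumes a Keras version whose LearningRateScheduler passes the current lr to schedule):

from keras.callbacks import LearningRateScheduler

def schedule(epoch, lr):
    # keep the compiled rate for the first epochs, then decay it each epoch
    if epoch < 3:
        return lr
    return lr * 0.95

lr_scheduler = LearningRateScheduler(schedule, verbose=1)
# e.g. estimator.fit(X_train, y_train, callbacks=[history, lr_scheduler])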