Resume training with different loss function

Posted 2019-07-23 16:21

Question:

I want to implement a two-step learning process in which I:
1) pre-train a model for a few epochs using the loss function loss_1
2) change the loss function to loss_2 and continue the training for fine-tuning

Currently, my approach is:

model.compile(optimizer=opt, loss=loss_1, metrics=['accuracy'])
model.fit_generator(…)
model.compile(optimizer=opt, loss=loss_2, metrics=['accuracy'])
model.fit_generator(…)

Note that the optimizer remains the same, and only the loss function changes. I'd like to smoothly continue training, but with a different loss function. According to this post, re-compiling the model loses the optimizer state. Questions:

a) Will I lose the optimizer state even if I use the same optimizer, e.g. Adam?
b) If the answer to a) is yes, any suggestions on how to switch to a new loss function without resetting the optimizer state?

EDIT:
As suggested by Simon Caby and based on this thread, I created a custom loss function with two loss computations that depend on epoch number. However, it does not work for me. My approach:

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        c_epoch = K.get_value(current_epoch)
        if c_epoch < t_change:
            # compute loss_1
        else:
            # compute loss_2
    return custom_loss

And I compile as follows, after initializing current_epoch:

current_epoch = K.variable(0.)
model.compile(optimizer=opt, loss=loss_wrapper(5, current_epoch), metrics=...)

To update the current_epoch, I create a callback:

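class NewCallback(Callback):
    def __init__(self, current_epoch):
        super(NewCallback, self).__init__()
        self.current_epoch = current_epoch

    def on_epoch_end(self, epoch, logs=None):
        # epoch is the zero-based index of the epoch that just finished
        K.set_value(self.current_epoch, epoch)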

model.fit_generator(..., callbacks=[NewCallback(current_epoch)])

The callback updates self.current_epoch every epoch correctly. But the update does not reach the custom loss function. Instead, current_epoch keeps the initialization value forever, and loss_2 is never executed.

Any suggestion is welcome, thanks!

Answer 1:

My answers: a) Yes, and you should probably write your own learning rate scheduler to keep control of it:

keras.callbacks.LearningRateScheduler(schedule, verbose=0)

b) Yes, you can create your own loss function, including one that fluctuates between two different loss methods. See "Advanced Keras — Constructing Complex Custom Losses and Metrics": https://towardsdatascience.com/advanced-keras-constructing-complex-custom-losses-and-metrics-c07ca130a618
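As a rough illustration of the scheduler suggested in a), the sketch below shows one possible `schedule` function. The switch epoch and the scaling factor are hypothetical values chosen for the example, and it assumes a Keras version whose LearningRateScheduler passes the current learning rate as the second argument. It only governs the learning rate; it does not by itself preserve other optimizer state such as Adam's moment estimates.

```python
def schedule(epoch, lr):
    """Return the learning rate to use for the given zero-based epoch."""
    switch_epoch = 5        # hypothetical: first epoch trained with loss_2
    fine_tune_factor = 0.1  # hypothetical: shrink the rate for fine-tuning
    if epoch == switch_epoch:
        # Lower the rate once, at the moment the loss changes.
        return lr * fine_tune_factor
    # Otherwise keep whatever rate the optimizer currently has.
    return lr

# Passed to Keras as in the answer above:
# keras.callbacks.LearningRateScheduler(schedule, verbose=0)
```

Because the callback feeds the optimizer's current rate back in on each epoch, the reduced rate would stay in effect for all epochs after the switch.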



Answer 2:

If you change:

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        c_epoch = K.get_value(current_epoch)
        if c_epoch < t_change:
            # compute loss_1
        else:
            # compute loss_2
    return custom_loss

to:

def loss_wrapper(t_change, current_epoch):
    def custom_loss(y_true, y_pred):
        # compute loss_1 and loss_2
        # Boolean tensor: true while current_epoch < t_change
        bool_case_1 = K.less(current_epoch, t_change)
        # Cast to 1.0 / 0.0 so it can be used as a weight
        num_case_1 = K.cast(bool_case_1, "float32")
        loss = num_case_1 * loss_1 + (1 - num_case_1) * loss_2
        return loss
    return custom_loss

it works.

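The original version fails because the Python `if` is evaluated only once, when the loss is built at compile time, so K.get_value(current_epoch) is read a single time and later updates from the callback never reach the loss. We are essentially required to express the Python logic as a composition of backend functions, which are re-evaluated on every batch, so that the loss can change without calling model.compile(...) again. I am not satisfied with these hacks, and wish it were possible to set model.loss in a callback without re-compiling afterwards (since re-compiling resets the optimizer state).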
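To make the difference between the two versions concrete, here is a toy pure-Python model of it. It is not Keras: `Var`, `build_with_python_if` and `build_with_blend` are hypothetical stand-ins for a backend variable and for the two `loss_wrapper` variants, and the losses are plain floats. It is only meant to show why a decision made while the loss is being built stays frozen, while a decision made inside the returned function follows the variable.

```python
class Var:
    """Toy stand-in for a backend variable such as K.variable (not Keras)."""
    def __init__(self, value):
        self.value = value


def build_with_python_if(t_change, current_epoch):
    # The Python `if` runs once, while the loss is being built,
    # so the branch chosen here never changes afterwards.
    if current_epoch.value < t_change:
        return lambda loss_1, loss_2: loss_1
    return lambda loss_1, loss_2: loss_2


def build_with_blend(t_change, current_epoch):
    # The comparison lives inside the returned function,
    # so it is re-evaluated on every call.
    def loss(loss_1, loss_2):
        w = float(current_epoch.value < t_change)
        return w * loss_1 + (1.0 - w) * loss_2
    return loss


epoch = Var(0.0)
frozen = build_with_python_if(5, epoch)
blended = build_with_blend(5, epoch)

epoch.value = 7.0  # plays the role of K.set_value in the callback
print(frozen(1.0, 2.0))   # 1.0 (still loss_1)
print(blended(1.0, 2.0))  # 2.0 (switched to loss_2)
```

One consequence of the blended form is that both losses are computed on every step, even though only one of them contributes to the result.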