Can I send callbacks to a KerasClassifier?

2020-03-10 04:59发布

I want the classifier to run faster and stop early if the patience reaches the number I set. In the following code it does 10 iterations of fitting the model.

import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.constraints import maxnorm
from keras.optimizers import SGD
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("sonar.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)

calls=[EarlyStopping(monitor='acc', patience=10), ModelCheckpoint('C:/Users/Nick/Data Science/model', monitor='acc', save_best_only=True, mode='auto', period=1)]

def create_baseline(): 
    # create model
    model = Sequential()
    model.add(Dropout(0.2, input_shape=(33,)))
    model.add(Dense(33, init='normal', activation='relu', W_constraint=maxnorm(3)))
    model.add(Dense(16, init='normal', activation='relu', W_constraint=maxnorm(3)))
    model.add(Dense(122, init='normal', activation='softmax'))
    # Compile model
    sgd = SGD(lr=0.1, momentum=0.8, decay=0.0, nesterov=False)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model

numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Here is the resulting error-

RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000000001D691438>, as the constructor does not seem to set parameter callbacks

I changed the cross_val_score in the following-

numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'callbacks':calls})
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

and now I get this error-

ValueError: need more than 1 value to unpack

This code came from here. The code is by far the most accurate I've used so far. The problem is that there is no defined model.fit() anywhere in the code. It also takes forever to fit. The fit() operation occurs at the results = cross_val_score(...) and there's no parameters to throw a callback in there.

How do I go about doing this? Also, how do I run the model trained on a test set?

I need to be able to save the trained model for later use...

3条回答
兄弟一词,经得起流年.
2楼-- · 2020-03-10 05:54

This is what I have done

results = cross_val_score(estimator, X, Y, cv=kfold,
                      fit_params = {'callbacks': [checkpointer,plateau]})

and has worked so far

查看更多
Anthone
3楼-- · 2020-03-10 05:55

Try:

estimators.append(('mlp', 
                   KerasClassifier(build_fn=create_model2,
                                   nb_epoch=300,
                                   batch_size=16,
                                   verbose=0,
                                   callbacks=[list_of_callbacks])))

where list_of_callbacks is a list of callbacks you want to apply. You could find details here. It's mentioned there that parameters fed to KerasClassifier could be legal fitting parameters.

It's also worth to mention that if you are using multiple runs with GPUs there might be a problem due to several reported memory leaks especially when you are using theano. I also noticed that running multiple fits consequently may show results which seem to be not independent when using sklearn API.

Edit:

Try also:

results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params = {'mlp__callbacks': calls})

Instead of putting callbacks list in a wrapper instantiation.

查看更多
Root(大扎)
4楼-- · 2020-03-10 05:57

Reading from here, which is the source code of KerasClassifier, you can pass it the arguments of fit and they should be used. I don't have your dataset so I cannot test it, but you can tell me if this works and if not I will try and adapt the solution. Change this line :

estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=[...your_callbacks...])))

A small explaination of what's happening : KerasClassifier is taking all the possibles arguments for fit, predict, score and uses them accordingly when each method is called. They made a function that filters the arguments that should go to each of the above functions that can be called in the pipeline. I guess there are several fit and predict calls inside the StratifiedKFold step to train on different splits everytime.

The reason why it takes forever to fit and it fits 10 times is because one fit is doing 300 epochs, as you asked. So the KFold is repeating this step over the different folds :

  • calls fit with all the parameters given to KerasClassifier (300 epochs and batch size = 16). It's training on 9/10 of your data and using 1/10 as validation.

EDIT :

Ok, so I took the time to download the dataset and try your code... First of all you need to correct a "few" things in your network :

  • your input have a 60 features. You clearly show it in your data prep :

    X = dataset[:,:60].astype(float)
    

    so why would you have this :

    model.add(Dropout(0.2, input_shape=(33,)))
    

    please change to :

    model.add(Dropout(0.2, input_shape=(60,)))
    
  • About your targets/labels. You changed the objective from the original code (binary_crossentropy) to categorical_crossentropy. But you didn't change your Y array. So either do this in your data preparation :

    from keras.utils.np_utils import to_categorical
    encoded_Y = to_categorical(encoder.transform(Y))
    

    or change your objective back to binary_crossentropy.

  • Now the network's output size : 122 on the last dense layer? your dataset obviously has 2 categories so why are you trying to output 122 classes? it won't match the target. Please change back your last layer to :

    model.add(Dense(2, init='normal', activation='softmax'))
    

    if you choose to use categorical_crossentropy, or

    model.add(Dense(1, init='normal', activation='sigmoid'))
    

    if you go back to binary_crossentropy.

So now that your network compiles, I could start to troubleshout.

here is your solution

So now I could get the real error message. It turns out that when you feed fit_params=whatever in the cross_val_score() function, you are feeding those parameters to a pipeline. In order to know to which part of the pipeline you want to send those parameters you have to specify it like this :

fit_params={'mlp__callbacks':calls}

Your error was saying that the process couldn't unpack 'callbacks'.split('__', 1) into 2 values. It was actually looking for the name of the pipeline's step to apply this to.

It should be working now :)

results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'mlp__callbacks':calls})

BUT, you should be aware of what's happening here... the cross validation actually calls the create_baseline() function to recreate the model from scratch 10 times an trains it 10 times on different parts of the dataset. So it's not doing epochs as you were saying, it's doing 300 epochs 10 times. What is also happening as a consequence of using this tool : since the models are always differents, it means the fit() method is applied 10 times on different models, therefore, the callbacks are also applied 10 different times and the files saved by ModelCheckpoint() get overriden and you find yourself only with the best model of the last run.

This is intrinsec to the tools you use, I don't see any way around this. This comes as consequence to using different general tools that weren't especially thought to be used together with all the possible configurations.

查看更多
登录 后发表回答