I'm using Python and I would like to use nested cross-validation with scikit-learn. I have found a very good example:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Example setup: iris dataset, an SVC estimator, and a small parameter grid.
X_iris, y_iris = load_iris(return_X_y=True)
svr = SVC(kernel="rbf")
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

NUM_TRIALS = 30
non_nested_scores = np.zeros(NUM_TRIALS)
nested_scores = np.zeros(NUM_TRIALS)

for i in range(NUM_TRIALS):
    # Choose cross-validation techniques for the inner and outer loops,
    # independently of the dataset.
    # E.g. "GroupKFold", "LeaveOneOut", "LeaveOneGroupOut", etc.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

    # Non-nested parameter search and scoring
    clf = GridSearchCV(estimator=svr, param_grid=p_grid, cv=inner_cv)
    clf.fit(X_iris, y_iris)
    non_nested_scores[i] = clf.best_score_

    # Nested CV with parameter optimization
    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
    nested_scores[i] = nested_score.mean()
How can the best set of parameters, as well as all sets of parameters (with their corresponding scores), be accessed from the nested cross-validation?
You cannot access the individual params and best params from cross_val_score. What cross_val_score does internally is clone the supplied estimator for each split and then call fit and score on those clones with the given X, y. If you want to access the params at each split you can use: