I'm running GridSearchCV to optimize the parameters of a classifier in scikit-learn. Once it's done, I'd like to know which parameters were chosen as the best.
Whenever I try, I get an AttributeError: 'RandomForestClassifier' object has no attribute 'best_estimator_', and I can't tell why, since it appears as a legitimate attribute in the documentation.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Features and target taken from an existing DataFrame `data`
X = data[usable_columns]
y = data[target]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

rfc = RandomForestClassifier(n_jobs=-1, max_features='sqrt', n_estimators=50, oob_score=True)

# Candidate values to search over
param_grid = {
    'n_estimators': [200, 700],
    'max_features': ['auto', 'sqrt', 'log2']
}

CV_rfc = GridSearchCV(estimator=rfc, param_grid=param_grid, cv=5)
print('\n', CV_rfc.best_estimator_)
Yields:
`AttributeError: 'GridSearchCV' object has no attribute 'best_estimator_'`
You have to fit your data before you can get the best parameter combination.
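Something along these lines should work. This is only a minimal sketch: the `make_classification` data is a synthetic stand-in for your `data[usable_columns]` and `data[target]`, and I've shrunk the grid (and dropped `'auto'`, which newer scikit-learn releases no longer accept) so it runs quickly. The `has_attr_before_fit` flag is just there to show that the attribute does not exist until `fit` has been called.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for data[usable_columns] / data[target] (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

rfc = RandomForestClassifier(random_state=0)

# Deliberately small grid so the example finishes quickly.
param_grid = {
    "n_estimators": [10, 20],
    "max_features": ["sqrt", "log2"],
}

CV_rfc = GridSearchCV(estimator=rfc, param_grid=param_grid, cv=5)

# Before fitting, the search has no results, so the attribute is absent.
has_attr_before_fit = hasattr(CV_rfc, "best_estimator_")

# Fitting runs the cross-validated search and populates the best_* attributes.
CV_rfc.fit(X_train, y_train)

print(CV_rfc.best_params_)      # dict with the winning parameter combination
print(CV_rfc.best_estimator_)   # estimator refit with those parameters
```

If you only need the parameter values, `best_params_` is usually easier to read than printing the whole `best_estimator_`.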
Just to add one more point for clarity.
The documentation says, in essence:
When the grid search is run over the given parameter combinations, it selects the one with the highest score according to the given scoring function. `best_estimator_` is the estimator built with the parameters that produced that highest score.
Those scores are only computed during fitting, so the attribute can only be accessed after fitting the data.
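To see how the winner relates to the scores, you can inspect `cv_results_` after fitting. A rough sketch, again on synthetic data and with a small grid of my own choosing (both are assumptions, not your setup); `scoring="accuracy"` is passed explicitly here only to make the scoring function visible.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for a real dataset (an assumption).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [10, 20], "max_features": ["sqrt", "log2"]},
    scoring="accuracy",  # the scorer used to rank parameter combinations
    cv=5,
)
search.fit(X, y)

# One mean cross-validated score per parameter combination.
mean_scores = search.cv_results_["mean_test_score"]
for params, score in zip(search.cv_results_["params"], mean_scores):
    print(params, round(score, 3))

# best_index_ points at the top-ranked row of cv_results_.
best_from_results = search.cv_results_["params"][search.best_index_]
print(search.best_params_, search.best_score_)
```

Here `best_params_` is simply the entry of `cv_results_["params"]` at `best_index_`, and `best_score_` is the matching mean test score.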