I'm running a grid search on random forests and trying to use n_jobs other than 1, but the kernel freezes and there is no CPU usage. With n_jobs=1 it works fine. I can't even stop the command with Ctrl-C and have to restart the kernel. I'm running on Windows 7. I saw that there is a similar problem on OS X, but that solution is not relevant for Windows 7.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in scikit-learn < 0.18
from sklearn.pipeline import Pipeline

# tfidf, stop, tokenizer, X_train_part and y_train_part are defined earlier (not shown)
rf_tfdidf = Pipeline([('vect', tfidf),
                      ('clf', RandomForestClassifier(n_estimators=50,
                                                     class_weight='balanced_subsample'))])
param_grid = [{'vect__ngram_range': [(1, 1)],
               'vect__stop_words': [stop],
               'vect__tokenizer': [tokenizer]
               }]
if __name__ == '__main__':
    gs_rf_tfidf = GridSearchCV(rf_tfdidf, param_grid, scoring='accuracy', cv=5,
                               verbose=10,
                               n_jobs=2)
    gs_rf_tfidf.fit(X_train_part, y_train_part)
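One check that might help narrow this down: on Windows, worker processes are spawned rather than forked, so everything sent to them has to be picklable. Below is a minimal sketch of such a check using only the standard library. The helper `is_picklable` and the stand-in `tokenizer` are hypothetical names for illustration, not my real code, and I am only assuming that pickling could be related to the freeze.

```python
import pickle


def is_picklable(obj):
    """Return True if obj survives a pickle round trip, False otherwise."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False


# Hypothetical stand-in for the tokenizer passed in param_grid.
def tokenizer(text):
    return text.split()


if __name__ == '__main__':
    # Objects that would be shipped to the worker processes.
    candidates = {
        'module-level tokenizer': tokenizer,
        'lambda tokenizer': lambda text: text.split(),
    }
    for name, obj in candidates.items():
        print(name, 'picklable:', is_picklable(obj))
```

Running the same check on the real `tokenizer`, `stop` and `tfidf` objects would show whether any of them fails to pickle. It would not by itself explain the freeze.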
Thanks.