I tried to use H2O to create some machine learning models for binary classification problem, and the test results are pretty good. But then I checked and found something weird. I tried to print the prediction of the model for the test set out of curiosity. And I found out that my model actually predicts 0 (negative) all the time, but the AUC is around 0.65, and precision is not 0.0. Then I tried to use Scikit-learn just to compare the metrics scores, and (as expected) they’re different. The Scikit learn yielded 0.0 precision and 0.5 AUC score, which I think is correct. Here's the code that I used:
model = h2o.load_model(model_path)
predictions = model.predict(Test_data).as_data_frame()
# H2O version to print the AUC score
auc = model.model_performance(Test_data).auc()
# Python version to print the AUC score
auc_sklearn = sklearn.metrics.roc_auc_score(y_true, predictions['predict'].tolist())
Any thought? Thanks in advance!
There is no difference between H2O and scikit-learn scoring, you just need to understand how to make sense of the output so you can compare them accurately.
If you'll look at the data in
predictions['predict']
you'll see that it's a predicted class, not a raw predicted value. AUC uses the latter, so you'll need to use the correct column. See below:Examine the output:
Now compare to sklearn:
Here you see that they are approximately the same. AUC is an approximate method, so you'll see differences after a few decimal places when you compare different implementations.