Empty list returned when accessing automl leader

Posted 2019-07-21 16:26

Running h2o.automl() returns a single model in the leaderboard. However, trying to access the actual model via @leader@model produces the following error:

Error in is.H2OFrame(x) : trying to get slot "metrics" from an object of a basic class ("NULL") with no slots

Calling h2o.predict() on the leader model also fails, with this error message:

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, : ERROR MESSAGE: Object 'dummy' not found in function: predict for argument: model

The model was trained in the same session, using h2o v3.20.0.2 in R.

Tags: r h2o automl
1 answer
Lonely孤独者°
#2 · 2019-07-21 17:00

I think what's happening is that you're not able to train a single model in one hour, so when you try to collect the leader model, it's trying to grab an incomplete model and you get an error. You don't have very many rows, but you have a really large number of columns.

Because it's hard to predict how long model training will take, I'd limit the run with the max_models argument instead of by time. AutoML stops at whichever of max_models or max_runtime_secs it reaches first, so I'd set max_runtime_secs to a very large number (e.g. 999999999) and then set max_models = 10 or whatever number you like.
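
As a rough sketch of that setup (not the asker's actual code): the built-in iris data and the "Species" response below are stand-ins chosen only so the snippet runs on its own, and the leaderboard check at the end is an extra precaution, not something AutoML requires.

```r
library(h2o)
h2o.init()

# Stand-in data (assumption): replace with your own H2OFrame and response column.
train <- as.h2o(iris)

aml <- h2o.automl(
  y = "Species",
  training_frame = train,
  max_models = 10,               # stop after 10 models ...
  max_runtime_secs = 999999999   # ... rather than on the time limit
)

# Only touch the leader if at least one model actually finished.
if (nrow(aml@leaderboard) > 0) {
  print(aml@leaderboard)
  preds <- h2o.predict(aml@leader, train)
  print(head(preds))
} else {
  message("No model finished training; the leaderboard is empty.")
}
```

With max_models as the effective limit, the run ends only after complete models exist, so the leader should be a usable model by the time it is accessed.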

Second, since you have very wide data, I'd recommend turning off the Random Forests and GBM models, and leaving the GLM and Deep Learning models. To do that, set exclude_algos = c("DRF", "GBM"). It will take a really long time to train tree-based models on 120k columns.
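
A minimal sketch of excluding the tree-based algorithms, again using iris and "Species" as stand-ins (assumptions, not the asker's data) so the snippet is self-contained:

```r
library(h2o)
h2o.init()

# Stand-in data (assumption): replace with your own wide H2OFrame.
train <- as.h2o(iris)

aml <- h2o.automl(
  y = "Species",
  training_frame = train,
  max_models = 10,
  max_runtime_secs = 999999999,
  exclude_algos = c("DRF", "GBM")  # skip Random Forest and GBM
)

# The leaderboard should now list only the remaining model families.
print(aml@leaderboard)
```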

Another good option to consider is to first apply PCA or GLRM to your data to reduce the dimensionality to <500 columns and then you can include the tree-based models in the AutoML run.
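
The PCA route could look roughly like the sketch below. The synthetic 200 x 1000 frame, the column name "response", and k = 50 are all illustrative assumptions. h2o.prcomp is left on its default pca_method here; for a frame as wide as 120k columns a different pca_method may be more practical, which this sketch does not test.

```r
library(h2o)
h2o.init()

# Synthetic stand-in for wide data (assumption): 200 rows, 1000 numeric columns.
set.seed(1)
df <- as.data.frame(matrix(rnorm(200 * 1000), nrow = 200))
df$response <- factor(sample(c("a", "b"), 200, replace = TRUE))
train <- as.h2o(df)

predictors <- setdiff(names(train), "response")

# Fit PCA on the predictors only, keeping 50 components (illustrative value).
pca <- h2o.prcomp(
  training_frame = train,
  x = predictors,
  k = 50,
  transform = "STANDARDIZE"
)

# Project the data onto the components and re-attach the response column.
reduced <- h2o.predict(pca, train)
reduced <- h2o.cbind(reduced, train[, "response"])

# With far fewer columns, the tree-based models can stay in the run.
aml <- h2o.automl(
  y = "response",
  training_frame = reduced,
  max_models = 10,
  max_runtime_secs = 999999999
)
print(aml@leaderboard)
```

Any new data has to be passed through the same fitted PCA model (h2o.predict(pca, newdata)) before it is scored with the AutoML leader.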
