Implementing custom stopping metrics to optimize d

Posted 2020-02-29 01:30

Question:

I'm trying to implement the FBeta_Score() of the MLmetrics R package:

FBeta_Score <- function(y_true, y_pred, positive = NULL, beta = 1) {
  Confusion_DF <- ConfusionDF(y_pred, y_true)
  if (is.null(positive)) {
    positive <- as.character(Confusion_DF[1, 1])
  }
  Precision <- Precision(y_true, y_pred, positive)
  Recall <- Recall(y_true, y_pred, positive)
  Fbeta_Score <- (1 + beta^2) * (Precision * Recall) / (beta^2 * Precision + Recall)
  return(Fbeta_Score)
}

in the H2O distributed random forest model and I want to optimize it during the training phase using the custom_metric_func option. The help documentation of the h2o.randomForest() function says:

Reference to custom evaluation function, format: 'language:keyName=funcName'

But I don't understand how to use it directly from R and what I should specify in the stopping_metric option.

Any help would be appreciated!

Answer 1:

Currently the backend supports only Python-based custom functions, which you upload with the h2o.upload_custom_metric() function. That function returns a function reference: a string that follows the naming convention 'language:keyName=funcName'. You can then pass this reference to the custom_metric_func parameter.

For example:

custom_mm_func = h2o.upload_custom_metric(CustomRmseFunc, func_name="rmse", func_file="mm_rmse.py")

returns a function reference which has the following value:

> print(custom_mm_func)
python:rmse=mm_rmse.CustomRmseFuncWrapper
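
To make the shape of such a function concrete, here is a minimal sketch of an F-beta metric written as a Python class with map, reduce and metric methods. The method signatures, and the assumption that a binomial prediction row arrives as [label, p0, p1], follow my understanding of the H2O custom metric examples and should be checked against the documentation for your version. The class name CustomFBetaFunc, the fixed BETA and THRESHOLD values, and the evaluate_locally helper are hypothetical names introduced only for illustration. The helper never touches H2O; it just replays the map/reduce/metric steps in plain Python so the arithmetic can be checked before uploading.

```python
class CustomFBetaFunc:
    """Sketch of an F-beta metric in the map/reduce/metric style (assumed interface)."""

    BETA = 1.0        # hypothetical: beta is fixed here rather than passed in
    THRESHOLD = 0.5   # hypothetical: fixed cut-off on the positive-class probability

    def map(self, pred, act, w, o, model):
        # Assumed row layout for a binomial model:
        #   pred = [predicted label, p(class 0), p(class 1)], act = [actual class index]
        predicted_positive = pred[2] >= self.THRESHOLD
        actual_positive = int(act[0]) == 1
        # Weighted contribution of this row to TP, FP and FN.
        tp = w if (predicted_positive and actual_positive) else 0.0
        fp = w if (predicted_positive and not actual_positive) else 0.0
        fn = w if (actual_positive and not predicted_positive) else 0.0
        return [tp, fp, fn]

    def reduce(self, l, r):
        # Combine two partial [TP, FP, FN] states by element-wise addition.
        return [l[0] + r[0], l[1] + r[1], l[2] + r[2]]

    def metric(self, l):
        # F-beta from counts: (1 + b^2) TP / ((1 + b^2) TP + b^2 FN + FP),
        # which is algebraically the same as the precision/recall form.
        tp, fp, fn = l[0], l[1], l[2]
        b2 = self.BETA ** 2
        denom = (1 + b2) * tp + b2 * fn + fp
        return (1 + b2) * tp / denom if denom > 0 else 0.0


def evaluate_locally(metric_obj, rows):
    """Hypothetical helper: run map/reduce/metric over (pred, act) pairs without H2O."""
    state = None
    for pred, act in rows:
        part = metric_obj.map(pred, act, 1.0, 0.0, None)  # weight 1, offset 0, no model
        state = part if state is None else metric_obj.reduce(state, part)
    return metric_obj.metric(state)
```

If this interface matches your H2O version, the class would be uploaded the same way as CustomRmseFunc above, for example with func_name="fbeta" and func_file="mm_fbeta.py" (both names hypothetical), and the returned reference passed as custom_metric_func. Keeping the per-row state as raw counts rather than per-row precision and recall is what lets reduce stay a simple sum.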

As for your second question, about using the custom metric as a stopping metric, there is a Jira ticket that you can follow here: https://0xdata.atlassian.net/browse/PUBDEV-5261

You can find more details on how to use the custom metric here.