How to extract best parameters from a CrossValidat

Posted 2020-02-07 17:18

I want to find the parameters from ParamGridBuilder that produce the best model in CrossValidator in Spark 1.4.x.

In the Pipeline Example in the Spark documentation, different parameters (numFeatures, regParam) are added to the Pipeline by using ParamGridBuilder. The following line of code then fits the best model:

val cvModel = crossval.fit(training.toDF)

Now I want to know which parameter values (numFeatures, regParam) from ParamGridBuilder produce the best model.

I already used the following commands without success:

cvModel.bestModel.extractParamMap().toString()
cvModel.params.toList.mkString("(", ",", ")")
cvModel.estimatorParamMaps.toString()
cvModel.explainParams()
cvModel.getEstimatorParamMaps.mkString("(", ",", ")")
cvModel.toString()

Any help?

Thanks in advance,

9 answers
贪生不怕死
#2 · 2020-02-07 17:30

This Java code should work: cvModel.bestModel().parent().extractParamMap(). You can translate it to Scala. The parent() method returns the estimator that produced the best model, and you can read the best params from it.
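In Scala, the same call might look like the sketch below. It assumes `cvModel` is the fitted CrossValidatorModel from the question; the helper name `bestEstimatorParams` is made up for illustration. When the tuned estimator is a Pipeline, the parent's ParamMap will probably describe the Pipeline itself (its stages) rather than the parameters of each stage.

```scala
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper: the name is illustrative, not part of Spark.
def bestEstimatorParams(cvModel: CrossValidatorModel): ParamMap =
  // bestModel is a Model[_]; parent is the Estimator that produced it.
  cvModel.bestModel.parent.extractParamMap()
```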

太酷不给撩
#3 · 2020-02-07 17:33

One method to get a proper ParamMap object is to use CrossValidatorModel.avgMetrics: Array[Double] to find the argmax ParamMap:

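import org.apache.spark.ml.param.ParamMap
import org.apache.spark.ml.tuning.CrossValidatorModel

implicit class BestParamMapCrossValidatorModel(cvModel: CrossValidatorModel) {
  // Pair each candidate ParamMap with its average cross-validation metric
  // and keep the one with the highest metric (assumes larger is better).
  def bestEstimatorParamMap: ParamMap = {
    cvModel.getEstimatorParamMaps
           .zip(cvModel.avgMetrics)
           .maxBy(_._2)
           ._1
  }
}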

When run on the CrossValidatorModel trained in the Pipeline Example you cited, this gives:

scala> println(cvModel.bestEstimatorParamMap)
{
   hashingTF_2b0b8ccaeeec-numFeatures: 100,
   logreg_950a13184247-regParam: 0.1
}
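The `maxBy` above assumes that a larger metric is better, which does not hold for error metrics such as RMSE. Below is a minimal sketch of the selection step that handles both directions, written in plain Scala so that it runs without Spark. `selectBest` and its parameters are hypothetical names. With Spark, the inputs would presumably be `cvModel.getEstimatorParamMaps`, `cvModel.avgMetrics`, and `cvModel.getEvaluator.isLargerBetter`.

```scala
// Hypothetical helper, not part of Spark: pick the candidate whose
// average metric is best, given the direction of the metric.
def selectBest[P](candidates: Seq[P], avgMetrics: Seq[Double], largerIsBetter: Boolean): P = {
  require(candidates.nonEmpty, "no candidates to choose from")
  require(candidates.length == avgMetrics.length, "one metric per candidate expected")
  val scored = candidates.zip(avgMetrics)
  val (best, _) = if (largerIsBetter) scored.maxBy(_._2) else scored.minBy(_._2)
  best
}

// Toy data standing in for ParamMaps and their cross-validated metrics.
val grid = Seq("regParam=0.1", "regParam=0.01")
val byAccuracy = selectBest(grid, Seq(0.80, 0.75), largerIsBetter = true)
val byRmse = selectBest(grid, Seq(0.80, 0.75), largerIsBetter = false)
```

With the toy data, the first call selects by highest score and the second by lowest, so the two calls pick different grid entries.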
Rolldiameter
#4 · 2020-02-07 17:36

To print everything in paramMap, you actually don't have to call parent:

cvModel.bestModel().extractParamMap()

To answer the OP's question, here is how to get a single best parameter, for example regParam:

cvModel.bestModel().extractParamMap().apply(cvModel.bestModel.getParam("regParam"))
淡お忘
#5 · 2020-02-07 17:39

This is how you get the chosen parameters:

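// bestModel is a Model[_]; cast it first (LogisticRegressionModel is assumed here).
val best = cvModel.bestModel.asInstanceOf[LogisticRegressionModel]
println(best.getMaxIter)
println(best.getRegParam)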
成全新的幸福
#6 · 2020-02-07 17:40


For Java (checked in the debugger):

bestModel.parent().extractParamMap()
SAY GOODBYE
#7 · 2020-02-07 17:41

Building on the solution of @macfeliga, here is a one-liner that works for pipelines:

cvModel.bestModel.asInstanceOf[PipelineModel]
    .stages.foreach(stage => println(stage.extractParamMap))
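To pull out a single value rather than print every stage, something like the sketch below may work. It assumes the tuned estimator is a Pipeline containing a LogisticRegression stage, as in the documentation example; `bestRegParam` is a hypothetical helper name, and the result is `None` if no such stage is found.

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.ml.classification.LogisticRegressionModel
import org.apache.spark.ml.tuning.CrossValidatorModel

// Hypothetical helper; assumes a Pipeline with a LogisticRegression stage.
def bestRegParam(cvModel: CrossValidatorModel): Option[Double] =
  cvModel.bestModel match {
    case pm: PipelineModel =>
      // Take the first fitted logistic regression stage and read its regParam.
      pm.stages.collectFirst { case lr: LogisticRegressionModel => lr.getRegParam }
    case _ => None
  }
```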