I want to evaluate a random forest trained on some data. Does Apache Spark provide a utility for this, or do I have to perform cross-validation manually?
To build on zero323's great answer using Random Forest Classifier, here is a similar example for Random Forest Regressor:
Evaluator metric source: https://spark.apache.org/docs/latest/api/scala/#org.apache.spark.ml.evaluation.RegressionEvaluator
ML provides the CrossValidator class, which can be used to perform cross-validation and parameter search. Assuming your data is already preprocessed, you can add cross-validation as follows.

Using PySpark: