Scala code not compiling in SBT

Posted 2019-09-05 07:30

Question:

I have written a piece of machine learning code that runs perfectly in the Scala shell. I am now trying to compile it with SBT and create a JAR. I took some of the Spark examples (e.g. LocalLR and SparkPi) and compiled them in a new project folder. They all compiled successfully, but for some reason my own code does not compile. I am following all the directory conventions, but still no success.

    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.evaluation._
    import org.apache.spark.mllib.tree._
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.tree.model._
    import org.apache.spark.rdd._
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.mllib.classification.LogisticRegressionModel


    object PredictOOS {
      def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]): MulticlassMetrics = {
        val predictionsAndLabels = data.map(example =>
          (model.predict(example.features), example.label)
        )
        new MulticlassMetrics(predictionsAndLabels)
      }

      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("Predict OOS")
        val spark = new SparkContext(conf)

        val data = spark.textFile("D:/data/g1-svm.csv")
        val parsedData = data.map { line =>
          val parts = line.split(',').map(_.toDouble)
          LabeledPoint(parts(0), Vectors.dense(parts.tail))
        }
        val splits = parsedData.randomSplit(Array(0.8, 0.2), seed = 11L)
        val training = splits(0).cache()
        val test = splits(1)

        val model = DecisionTree.trainClassifier(training, 2, Map[Int, Int](), "gini", 20, 300)

        val metrics = getMetrics(model, test)

        println(" confusionMatrix is generated")
        spark.stop()
      }
    }

The error is given below:

D:\ScalaApps\sparklr>cd ../oos

D:\ScalaApps\oos>sbt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[info] Set current project to Proj_oos (in build file:/D:/ScalaApps/oos/)
> compile
[info] Compiling 1 Scala source to D:\ScalaApps\oos\target\scala-2.11\classes...
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:5: not found: type MulticlassMetrics
[error]                     MulticlassMetrics = {
[error]                     ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type DecisionTreeModel
[error]                 def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
[error]                                       ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type RDD
[error]                 def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
[error]                                                                ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:9: not found: type MulticlassMetrics
[error]                   new MulticlassMetrics(predictionsAndLabels)
[error]                       ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value LabeledPoint
[error]                         LabeledPoint(parts(0), Vectors.dense(parts.tail))
[error]                         ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value Vectors
[error]                         LabeledPoint(parts(0), Vectors.dense(parts.tail))
[error]                                                ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:25: not found: value DecisionTree
[error]                         val model = DecisionTree.trainClassifier(training, 2, Map[Int,Int](), "gini", 20, 300)
[error]                                     ^
[error] 7 errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 5 s, completed Dec 4, 2015 10:39:22 PM
>

Please let me know if I am missing anything. I have been stuck on this compilation step for a long time. Any help would be greatly appreciated.

Edit to the original post: the code above now compiles successfully, but compilation fails once I add a line that writes the output to a file:

    metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")

Error

D:\ScalaApps\oos>sbt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[info] Set current project to Proj_oos (in build file:/D:/ScalaApps/oos/)
> compile
[info] Compiling 1 Scala source to D:\ScalaApps\oos\target\scala-2.10\classes...
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:44: value saveAsTextFile is not a member of org.apache.spark.mllib.linalg.Matrix
[error]                         metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")
[error]                                                 ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 5 s, completed Dec 5, 2015 9:21:03 AM
>

Is there another package I need to import for saveAsTextFile to work?

Answer 1:

You should add the following dependency to your build.sbt:

    libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.4.0"

And add the following import to your Scala file:

    import org.apache.spark.{SparkConf, SparkContext}
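For context, here is a minimal sketch of what a complete build.sbt around that dependency line might look like. The project name is taken from the sbt log in the question; the version, the scalaVersion value, and the explicit spark-core entry are assumptions for illustration and should be adjusted to match the Spark installation actually in use.

```scala
// build.sbt: a minimal sketch, not the asker's actual build file.
// "Proj_oos" comes from the sbt log above; the other values are assumptions.
name := "Proj_oos"

version := "0.1"

// Assumption: pick the Scala version your Spark artifacts were built for.
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // spark-mllib normally pulls in spark-core transitively;
  // it is listed here only to make the dependency explicit.
  "org.apache.spark" %% "spark-core"  % "1.4.0",
  "org.apache.spark" %% "spark-mllib" % "1.4.0"
)
```

The %% operator appends the Scala binary version to the artifact name, so scalaVersion and the Spark artifacts have to agree. After changing build.sbt, run reload in the sbt prompt (or restart sbt) before compiling again.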

Hope this helps.



Answer 2:

I have resolved this. Thanks for the time.

    metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")

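doesn't work, even in the console. As the compiler error says, confusionMatrix returns a local Matrix, and saveAsTextFile is not a member of Matrix. Instead, I had to do the following to save the results: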

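    // toArray copies the local Matrix into an Array[Double] (column-major order)
    val res = metrics.confusionMatrix.toArray
    // Turn the array into an RDD so that saveAsTextFile is available
    val res1 = spark.parallelize(res)
    // coalesce(1) merges the output into a single partition (one part file)
    res1.coalesce(1).saveAsTextFile("D:/spark4/confmatrix2")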
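One caveat with the snippet above: because the array is flat and column-major, the saved file most likely contains one value per line rather than one matrix row per line. If the row layout matters, something like the following sketch could be used to rebuild the rows first. ConfusionMatrixText and rows are hypothetical names introduced here for illustration, not part of Spark, and the code uses only plain Scala collections.

```scala
// Hypothetical helper, not part of Spark: formats a column-major array
// as one comma-separated line per matrix row.
object ConfusionMatrixText {
  def rows(values: Array[Double], numRows: Int, numCols: Int): Seq[String] = {
    require(values.length == numRows * numCols,
      "array size must equal numRows * numCols")
    (0 until numRows).map { i =>
      // In column-major layout, element (i, j) sits at index i + j * numRows
      (0 until numCols).map(j => values(i + j * numRows)).mkString(",")
    }
  }
}
```

Assuming the same metrics and spark values as in the answer, the strings returned for metrics.confusionMatrix.toArray, together with the matrix's numRows and numCols, could then be passed to spark.parallelize and saved with saveAsTextFile in the same way.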