I have written a piece of machine learning code that runs perfectly in the Scala shell. I am now trying to compile it with SBT and package it as a JAR. To check my setup, I took some of the Spark examples (e.g. LocalLR and SparkPi) and compiled them in a new project folder. They all compiled successfully, but for some reason my own code does not compile. I am following all the directory conventions, but still no success.
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.mllib.evaluation._
import org.apache.spark.mllib.tree._
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.tree.model._
import org.apache.spark.rdd._
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.mllib.classification.LogisticRegressionModel

object PredictOOS {

  // Score every labeled point with the model and collect (prediction, label) pairs.
  def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]): MulticlassMetrics = {
    val predictionsAndLabels = data.map(example =>
      (model.predict(example.features), example.label)
    )
    new MulticlassMetrics(predictionsAndLabels)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Predict OOS")
    val spark = new SparkContext(conf)

    // Each CSV row is: label, feature1, feature2, ...
    val data = spark.textFile("D:/data/g1-svm.csv")
    val parsedData = data.map { line =>
      val parts = line.split(',').map(_.toDouble)
      LabeledPoint(parts(0), Vectors.dense(parts.tail))
    }

    // 80/20 train/test split with a fixed seed.
    val splits = parsedData.randomSplit(Array(0.8, 0.2), seed = 11L)
    val training = splits(0).cache()
    val test = splits(1)

    // 2 classes, no categorical features, gini impurity, maxDepth = 20, maxBins = 300.
    val model = DecisionTree.trainClassifier(training, 2, Map[Int, Int](), "gini", 20, 300)
    val metrics = getMetrics(model, test)
    println(" confusionMatrix is generated")
    spark.stop()
  }
}
The error is given below:
D:\ScalaApps\sparklr>cd ../oos
D:\ScalaApps\oos>sbt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[info] Set current project to Proj_oos (in build file:/D:/ScalaApps/oos/)
> compile
[info] Compiling 1 Scala source to D:\ScalaApps\oos\target\scala-2.11\classes...
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:5: not found: type MulticlassMetrics
[error] MulticlassMetrics = {
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type DecisionTreeModel
[error] def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:4: not found: type RDD
[error] def getMetrics(model: DecisionTreeModel, data: RDD[LabeledPoint]):
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:9: not found: type MulticlassMetrics
[error] new MulticlassMetrics(predictionsAndLabels)
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value LabeledPoint
[error] LabeledPoint(parts(0), Vectors.dense(parts.tail))
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:19: not found: value Vectors
[error] LabeledPoint(parts(0), Vectors.dense(parts.tail))
[error] ^
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:25: not found: value DecisionTree
[error] val model = DecisionTree.trainClassifier(training, 2, Map[Int,Int](), "gini", 20, 300)
[error] ^
[error] 7 errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 5 s, completed Dec 4, 2015 10:39:22 PM
>
Please let me know if I am missing anything. I have been stuck on this compilation step for a long time. Any help would be greatly appreciated.
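The build definition itself is not shown in this post. For reference, a project that uses both RDDs and MLlib usually declares two Spark modules in build.sbt. The following is only a minimal sketch: the Spark and Scala version numbers and the project version are assumptions, not values taken from this project, and the project name is copied from the sbt log above.

```scala
// build.sbt (sketch; all version numbers here are assumptions)
name := "Proj_oos"

version := "0.1"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  // RDD, SparkConf and SparkContext come from spark-core
  "org.apache.spark" %% "spark-core"  % "1.5.2" % "provided",
  // DecisionTree, LabeledPoint, Vectors and MulticlassMetrics come from spark-mllib
  "org.apache.spark" %% "spark-mllib" % "1.5.2" % "provided"
)
```

The "provided" scope assumes the JAR will be launched with spark-submit, which supplies the Spark classes at run time; it can be dropped if the program is started directly from sbt.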
This is an edit to the original post. The code above now compiles successfully, but compilation fails once I add a line that writes the output to a file:
metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")
Error:
D:\ScalaApps\oos>sbt
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256m; support was removed in 8.0
[info] Set current project to Proj_oos (in build file:/D:/ScalaApps/oos/)
> compile
[info] Compiling 1 Scala source to D:\ScalaApps\oos\target\scala-2.10\classes...
[error] D:\ScalaApps\oos\src\main\scala\oos.scala:44: value saveAsTextFile is not a member of org.apache.spark.mllib.linalg.Matrix
[error] metrics.confusionMatrix.saveAsTextFile("D:/spark4/confMatrix2")
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 5 s, completed Dec 5, 2015 9:21:03 AM
>
Is there another package I need to import for saveAsTextFile to work?
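A note on the error above: saveAsTextFile is a method of RDD, while confusionMatrix returns a local Matrix, so an extra import alone is unlikely to make that call compile. One possible workaround is to write the matrix's text form to an ordinary local file. The following is a minimal sketch under that assumption; SaveMatrixSketch, saveMatrix and the output file name are hypothetical, and the 2x2 matrix is stand-in data, not a result from this job.

```scala
import java.io.PrintWriter
import org.apache.spark.mllib.linalg.{Matrices, Matrix}

object SaveMatrixSketch {

  // Hypothetical helper: writes the matrix's string rendering to a local file.
  def saveMatrix(m: Matrix, path: String): Unit = {
    val out = new PrintWriter(path)
    try out.println(m.toString) finally out.close()
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for metrics.confusionMatrix; values are given in column-major order.
    val m: Matrix = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
    saveMatrix(m, "confMatrix.txt")
  }
}
```

In PredictOOS the helper would be called as saveMatrix(metrics.confusionMatrix, ...) with whatever output path is wanted. If the output has to go through Spark instead, wrapping the string in an RDD, for example spark.parallelize(Seq(metrics.confusionMatrix.toString)).saveAsTextFile(...), should also work, since the result of parallelize is an RDD.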