I am building an application that uses Spark for Random Forest-based classification.
When I try to run the program, I get an exception from this line:
StringIndexerModel labelIndexer = new StringIndexer().setInputCol("label").setOutputCol("indexedLabel").fit(data);
It looks like the build somehow resolves Janino version 2.7.8, although I understand I need 3.0.7. I have no idea how to declare the dependencies so that the build uses the correct version; it always seems to fall back to 2.7.8. (The NoSuchMethodError below also suggests that the janino and commons-compiler jars on the classpath come from different versions.)
Is it possible that I somehow need to clean the Gradle cache?
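Regarding the cache: as far as I know, I can make Gradle re-check cached modules with the --refresh-dependencies flag (or delete ~/.gradle/caches entirely), but I am not sure stale caching is actually the problem here:

gradle build --refresh-dependencies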
Here is the relevant line from the output of gradle dependencies:
+--- org.codehaus.janino:janino:3.0.7 -> 2.7.8
| +--- org.codehaus.janino:commons-compiler:3.0.7
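To narrow down which path pulls in 2.7.8, I assume I can use Gradle's built-in dependencyInsight task, although I have not fully worked through its output yet:

gradle dependencyInsight --dependency janino --configuration compile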
The Gradle section defining the dependencies:
dependencies {
    compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2') { force = true }
    compile('org.apache.hadoop:hadoop-common:2.7.2') { force = true }

    // https://mvnrepository.com/artifact/org.codehaus.janino/janino
    compile(group: 'org.codehaus.janino', name: 'janino', version: '3.0.7') {
        force = true
        exclude group: 'org.codehaus.janino', module: 'commons-compiler'
    }
    // https://mvnrepository.com/artifact/org.codehaus.janino/commons-compiler
    compile(group: 'org.codehaus.janino', name: 'commons-compiler', version: '3.0.7') {
        force = true
        exclude group: 'org.codehaus.janino', module: 'janino'
    }
    // https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11
    compile(group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.2.0') {
        exclude group: 'org.codehaus.janino', module: 'janino'
        exclude group: 'org.codehaus.janino', module: 'commons-compiler'
    }
    // https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11
    compile(group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.2.0') {
        exclude group: 'org.codehaus.janino', module: 'janino'
        exclude group: 'org.codehaus.janino', module: 'commons-compiler'
    }
    // https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.11
    compile(group: 'org.apache.spark', name: 'spark-mllib_2.11', version: '2.2.0') {
        exclude group: 'org.codehaus.janino', module: 'janino'
        exclude group: 'org.codehaus.janino', module: 'commons-compiler'
    }
    // https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind
    runtime group: 'com.fasterxml.jackson.core', name: 'jackson-databind', version: '2.6.5'
    // https://mvnrepository.com/artifact/com.fasterxml.jackson.module/jackson-module-scala_2.11
    runtime group: 'com.fasterxml.jackson.module', name: 'jackson-module-scala_2.11', version: '2.6.5'

    compile group: 'com.google.code.gson', name: 'gson', version: '2.8.1'
    compile group: 'org.apache.logging.log4j', name: 'log4j-api', version: '2.4.1'
    compile group: 'org.apache.logging.log4j', name: 'log4j-core', version: '2.4.1'

    testCompile 'org.testng:testng:6.9.4'
    testCompile 'org.mockito:mockito-core:1.10.19'
}
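One thing I have not tried yet: from my reading of the Gradle docs, a global resolutionStrategy overrides whatever the transitive graph requests, unlike the per-dependency force flags above. This is only a sketch of what I assume should work, not something I have verified:

configurations.all {
    resolutionStrategy {
        // Force both Janino artifacts to the same version in every
        // configuration, overriding Spark's transitive requests.
        force 'org.codehaus.janino:janino:3.0.7'
        force 'org.codehaus.janino:commons-compiler:3.0.7'
    }
}

Would that be the right approach here, or should the per-dependency force flags already be enough?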
The exception:
Exception in thread "main" java.lang.NoSuchMethodError: org.codehaus.commons.compiler.Location.<init>(Ljava/lang/String;SS)V
at org.codehaus.janino.Scanner.location(Scanner.java:261)
at org.codehaus.janino.Parser.location(Parser.java:2742)
at org.codehaus.janino.Parser.parseImportDeclarationBody(Parser.java:209)
at org.codehaus.janino.ClassBodyEvaluator.makeCompilationUnit(ClassBodyEvaluator.java:255)
at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:222)
at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:192)
at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:80)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:960)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1027)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$$anon$1.load(CodeGenerator.scala:1024)
at org.spark_project.guava.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3599)
at org.spark_project.guava.cache.LocalCache$Segment.loadSync(LocalCache.java:2379)
at org.spark_project.guava.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2257)
at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:906)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:375)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.DeserializeToObjectExec.doExecute(objects.scala:95)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
at org.apache.spark.sql.Dataset.rdd$lzycompute(Dataset.scala:2581)
at org.apache.spark.sql.Dataset.rdd(Dataset.scala:2578)
at org.apache.spark.ml.feature.StringIndexer.fit(StringIndexer.scala:111)