Developing a Scala Spark app that connects to Azure Cosmos DB

Posted 2019-08-17 13:12

Question:

I'm working on a Scala Spark app that connects to Cosmos DB, and I can't resolve the dependencies within SBT. Whenever I include org.apache.spark it conflicts with azure-cosmosdb-spark, and if I take org.apache.spark out, SparkSession no longer resolves.

My SBT configuration:

name := "MyApp"
version := "1.0"``
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.3.0",
"org.apache.spark" % "spark-sql_2.11" % "2.3.0" ,
"org.apache.spark" % "spark-streaming_2.11" % "2.3.0" ,
"org.apache.spark" % "spark-mllib_2.11" % "2.3.0" ,
"com.microsoft.azure" % "azure-storage" % "2.0.0",
"org.apache.hadoop" % "hadoop-azure" % "2.7.3",
"com.microsoft.azure" % "azure-cosmosdb-spark_2.2.0_2.11" % "1.0.0",
"com.microsoft.azure" % "azure-documentdb" % "1.14.2" ,
"com.microsoft.azure" % "azure-documentdb-rx" % "0.9.0-rc2" ,
"io.reactivex" % "rxjava" % "1.3.0" ,
"io.reactivex" % "rxnetty" % "0.4.20",
 "org.json" % "json" % "20140107",
"org.jmockit" % "jmockit" % "1.34" % "test"
)

Answer 1:

You should use the exact same Spark version that the azure-cosmosdb-spark library was built against. Judging from the artifact name (azure-cosmosdb-spark_2.2.0_2.11), it targets Spark 2.2.0 on Scala 2.11, while you are pulling in 2.3.0, so you probably need to downgrade.
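A minimal sketch of that change, assuming you stay on the connector's Spark version (only the four Spark lines change; everything else in the Seq stays as-is):

libraryDependencies ++= Seq(
  // pinned to the Spark version the connector was built against
  "org.apache.spark" % "spark-core_2.11" % "2.2.0",
  "org.apache.spark" % "spark-sql_2.11" % "2.2.0",
  "org.apache.spark" % "spark-streaming_2.11" % "2.2.0",
  "org.apache.spark" % "spark-mllib_2.11" % "2.2.0",
  // ... remaining dependencies unchanged, including:
  "com.microsoft.azure" % "azure-cosmosdb-spark_2.2.0_2.11" % "1.0.0"
)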

If you really need Spark 2.3, you will need to look into shading, e.g. with the sbt-assembly plugin.
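A minimal sketch of what that could look like with sbt-assembly 0.14.x (the io.netty pattern is only an illustration; rename whichever packages actually clash in your build):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt
assemblyShadeRules in assembly := Seq(
  // relocate a conflicting transitive package so both versions can
  // coexist in the fat jar; @1 re-inserts the matched suffix
  ShadeRule.rename("io.netty.**" -> "shaded.io.netty.@1").inAll
)

Shading rewrites the bytecode references inside the assembled jar, so it only helps when you build and submit a fat jar; it does not change what SBT resolves at compile time.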