Spark Kafka Streaming Issue

Posted 2020-02-26 12:59

I am using Maven and have added the following dependencies:

    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>
    <dependency> <!-- Spark Streaming Kafka dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>

I have also added the jar explicitly in the code:

SparkConf sparkConf = new SparkConf().setAppName("KafkaSparkTest");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
sc.addJar("/home/test/.m2/repository/org/apache/spark/spark-streaming-kafka_2.10/1.0.2/spark-streaming-kafka_2.10-1.0.2.jar");
JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000)); 
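
Line 40 of KafkaSparkStreaming.java, where the stack trace below points, is the KafkaUtils call. It looks roughly like this (the ZooKeeper quorum, group id, and topic name are placeholders, not my real settings):

import java.util.HashMap;
import java.util.Map;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

// Open a receiver-based Kafka stream via the 1.1.0 Java API.
// "localhost:2181", "test-group", and "test-topic" are placeholders.
Map<String, Integer> topics = new HashMap<String, Integer>();
topics.put("test-topic", 1); // topic name -> number of receiver threads
JavaPairReceiverInputDStream<String, String> messages =
    KafkaUtils.createStream(jssc, "localhost:2181", "test-group", topics);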

It compiles fine without any errors, but I get the following error when I run it through spark-submit. Any help is much appreciated. Thanks for your time.

bin/spark-submit --class "KafkaSparkStreaming" --master local[4] try/simple-project/target/simple-project-1.0.jar

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils
    at KafkaSparkStreaming.sparkStreamingTest(KafkaSparkStreaming.java:40)
    at KafkaSparkStreaming.main(KafkaSparkStreaming.java:23)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

2 Answers

够拽才男人 · 2020-02-26 13:28

I met the same problem and solved it by building the jar with its dependencies included:

  1. Remove the sc.addJar() call from your code (note that the path there points at version 1.0.2 while your pom declares 1.1.0).

  2. Add the code below to pom.xml:

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <testSourceDirectory>src/test/java</testSourceDirectory>
        <plugins>
          <!--
            Bind the maven-assembly-plugin to the package phase.
            This creates a jar that bundles all dependencies
            (including spark-streaming-kafka), suitable for spark-submit.
          -->
          <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
              <archive>
                <manifest>
                  <mainClass></mainClass> <!-- optionally set your main class, e.g. KafkaSparkStreaming -->
                </manifest>
              </archive>
            </configuration>
            <executions>
              <execution>
                <id>make-assembly</id>
                <phase>package</phase>
                <goals>
                  <goal>single</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
    </build>    
    
  3. Run mvn package.

  4. Submit the resulting *-jar-with-dependencies.jar (see the example below).
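
Assuming the artifactId and version from the question (simple-project, 1.0), the assembly plugin's default naming produces simple-project-1.0-jar-with-dependencies.jar, so the build-and-submit step would look roughly like:

    mvn package
    bin/spark-submit --class "KafkaSparkStreaming" --master local[4] \
      try/simple-project/target/simple-project-1.0-jar-with-dependencies.jar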

爱情/是我丢掉的垃圾 · 2020-02-26 13:29

For future reference: if you get a ClassNotFoundException, search for the missing class name (e.g. "org.apache.spark...") and you will be taken to the Maven page for the artifact that provides it. That page tells you which dependency is missing from your pom file and gives you the exact snippet to paste in.
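
For the KafkaUtils class in this question, for example, that search leads to the spark-streaming-kafka_2.10 artifact, and the snippet it gives matches the dependency already declared in the question's pom:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>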
