I am trying to run the Spark example HBaseTest from the command line using spark-submit instead of run-example, so that I can learn more about how to run Spark code in general.
However, it reported CLASS_NOT_FOUND for htrace, since I am using CDH 5.4. I located the htrace jar file, but I am having a hard time adding it to the classpath.
This is the spark-submit command I ended up with, but I still get the class-not-found error. Can anyone help me with this?
#!/bin/bash
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
/bin/bash $SPARK_HOME/bin/spark-submit \
--master yarn-client \
--class org.apache.spark.examples.HBaseTest \
--driver-class-path /etc/hbase/conf:$SPARK_HOME/examples/lib/*.jar:/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hbase/lib/*.jar \
--jars $SPARK_HOME/examples/lib/*.jar:/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hbase/lib/*.jar \
$SPARK_HOME/examples/lib/*.jar \
myhbasetablename
Note: htrace-core-3.0.4.jar, htrace-core-3.1.0-incubating.jar, htrace-core.jar are all located under '/opt/cloudera/parcels/CDH-5.4.2-1.cdh5.4.2.p0.2/lib/hbase/lib/'.
I opened $SPARK_HOME/conf/classpath.txt and added /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar to the end of the file.
https://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/ClassNotFoundException-org-apache-htrace-Trace-exception-in/m-p/29253/highlight/true#M915
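The same edit can be scripted so that it is safe to repeat. This is only a minimal sketch, assuming classpath.txt holds one entry per line; add_classpath_entry is a hypothetical helper name, not something shipped with Spark or CDH.

```shell
#!/bin/bash
# Hypothetical helper: append a jar path to a classpath file
# only if that exact path is not already listed.
add_classpath_entry() {
  local file="$1" jar="$2"
  # -x matches whole lines, -F treats the path literally,
  # -s stays quiet if the file does not exist yet.
  if ! grep -qsxF -- "$jar" "$file"; then
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new entry does not get glued onto the last line.
    if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
      printf '\n' >> "$file"
    fi
    printf '%s\n' "$jar" >> "$file"
  fi
}

# Example call (paths taken from this thread; adjust to your installation):
# add_classpath_entry "$SPARK_HOME/conf/classpath.txt" \
#   /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
```

Running the call twice leaves a single entry in the file, so it can go into a setup script without duplicating lines.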
This happens because Spark cannot find the HBase jars or classes. For Spark-HBase integration, the best way is to add the HBase libraries to the Spark classpath.
This can be done using the compute-classpath.sh script in the $SPARK_HOME/bin folder. After this, restart Spark.
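If editing Spark's own scripts is not an option, a different approach is to pass the HBase jars per job. Note that spark-submit expects --jars as a comma-separated list, while --driver-class-path is an ordinary colon-separated classpath, and neither accepts a `*.jar` pattern joined with colons as in the question. The sketch below is an assumption-laden illustration rather than a verified fix: join_jars and EXAMPLES_JAR are hypothetical names, and the paths are the ones quoted in this thread.

```shell
#!/bin/bash
# Hypothetical helper: print every .jar in a directory,
# joined by the given separator ("," for --jars, ":" for a classpath).
join_jars() {
  local dir="$1" sep="$2" out="" jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the unexpanded pattern if no jar matches
    out="${out:+$out$sep}$jar"  # prepend the separator only after the first jar
  done
  printf '%s\n' "$out"
}

# Example use (not run against a live cluster; EXAMPLES_JAR is a placeholder
# for the single examples jar under $SPARK_HOME/examples/lib):
# HBASE_LIB=/opt/cloudera/parcels/CDH/lib/hbase/lib
# "$SPARK_HOME"/bin/spark-submit \
#   --master yarn-client \
#   --class org.apache.spark.examples.HBaseTest \
#   --driver-class-path "/etc/hbase/conf:$(join_jars "$HBASE_LIB" :)" \
#   --jars "$(join_jars "$HBASE_LIB" ,)" \
#   "$EXAMPLES_JAR" \
#   myhbasetablename
```

Building the lists explicitly avoids relying on shell glob expansion inside an option value, where a pattern such as `a/*.jar:b/*.jar` is unlikely to match anything.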
There you go :)