Remote Spark Connection - Scala: NullPointerException

Posted 2019-09-04 00:24

Question:

The Spark Master and Worker are both running on localhost. I started the Master and Worker nodes with the command:

sbin/start-all.sh

Logs for Master node invocation:

Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 186590dbe5bd.ant.abc.com --port 7077 --webui-port 8080

Logs for Worker node invocation:

Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://186590dbe5bd.ant.abc.com:7077

I have the following configuration in conf/spark-env.sh:

SPARK_MASTER_HOST=186590dbe5bd.ant.abc.com

Content of /etc/hosts:

127.0.0.1   localhost
255.255.255.255 broadcasthost
::1             localhost
127.0.0.1       186590dbe5bd.ant.abc.com

The Scala code I am running to establish the remote Spark connection:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Point the application at the standalone master started above.
val sparkConf = new SparkConf()
  .setAppName(AppConstants.AppName)
  .setMaster("spark://186590dbe5bd.ant.abc.com:7077")

// Build a SparkSession with Hive support on top of that configuration.
val sparkSession = SparkSession.builder()
  .appName(AppConstants.AppName)
  .config(sparkConf)
  .enableHiveSupport()
  .getOrCreate()
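
For reference, spark.driver.host can also be set on the same SparkConf to control which address the driver advertises to the master; the snippet below is only a sketch, and the concrete address is an assumption, not something from my setup:

// Sketch only: pin the address the driver advertises to the cluster
// (the IP below is an assumed example, not taken from the original setup).
val sparkConfWithDriverHost = new SparkConf()
  .setAppName(AppConstants.AppName)
  .setMaster("spark://186590dbe5bd.ant.abc.com:7077")
  .set("spark.driver.host", "192.168.0.38")  // must be reachable from the master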

While executing the code from the IDE, I get the following exception in the console:

2018-10-04 18:58:38,488 INFO  [main] storage.BlockManagerMaster (Logging.scala:logInfo(54)) - Registering BlockManager BlockManagerId(driver, 192.168.0.38, 56083, None)
2018-10-04 18:58:38,491 ERROR [main] spark.SparkContext (Logging.scala:logError(91)) - Error initializing SparkContext.
java.lang.NullPointerException
    at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
    at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
    ………
2018-10-04 18:58:38,496 INFO  [main] spark.SparkContext (Logging.scala:logInfo(54)) - SparkContext already stopped.
2018-10-04 18:58:38,492 INFO  [dispatcher-event-loop-3] scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
    at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
    ………

Logs from /logs/master show the following error:

18/10/04 18:58:18 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException: org.apache.spark.rpc.RpcEndpointRef; local class incompatible: stream classdesc serialVersionUID = 1835832137613908542, local class serialVersionUID = -1329125091869941550
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
    …………
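
The InvalidClassException above means the serialized RpcEndpointRef classes on the two sides do not match, which usually points at the driver and the cluster running different Spark builds. A quick way to see which Spark version the driver classpath actually carries (just a sketch, not from the original post):

// Print the Spark version bundled on the driver/application classpath
// and compare it against the 2.3.1 standalone cluster.
println(org.apache.spark.SPARK_VERSION)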

Spark Versions:

Spark: spark-2.3.1-bin-hadoop2.7

Build dependencies:

Scala: 2.11
Spark-hive: 2.2.2
Maven: org.spark-project.hive : hive-metastore = 1.x
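
Expressed as an sbt sketch for reference (the artifact coordinates are my assumption; the hive-metastore version is left as 1.x exactly as listed above):

// build.sbt sketch of the dependencies listed above
scalaVersion := "2.11.12"  // any Scala 2.11.x release

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-hive" % "2.2.2"
  // plus org.spark-project.hive : hive-metastore : 1.x, as listed above
)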

What changes should be made to connect to Spark remotely? Thanks.

Complete logs: Console.log, Spark Master-Node.log