Spark Master and Worker are both running on localhost. I started the Master and Worker nodes with the command:
sbin/start-all.sh
Logs for the Master node invocation:
Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 186590dbe5bd.ant.abc.com --port 7077 --webui-port 8080
Logs for the Worker node invocation:
Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://186590dbe5bd.ant.abc.com:7077
I have the following configuration in conf/spark-env.sh:
SPARK_MASTER_HOST=186590dbe5bd.ant.abc.com
Content of /etc/hosts:
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
127.0.0.1 186590dbe5bd.ant.abc.com
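To rule out a DNS problem, the hostname resolution can be checked with a quick Scala snippet like this (a sketch only; ResolveCheck is an illustrative name, not part of my project):

import java.net.InetAddress

object ResolveCheck {
  def main(args: Array[String]): Unit = {
    // Expect 127.0.0.1, per the /etc/hosts entry for 186590dbe5bd.ant.abc.com
    val addr = InetAddress.getByName("186590dbe5bd.ant.abc.com")
    println(s"${addr.getHostName} -> ${addr.getHostAddress}")
  }
}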
The Scala code I am running to establish the remote Spark connection:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val sparkConf = new SparkConf()
  .setAppName(AppConstants.AppName)
  .setMaster("spark://186590dbe5bd.ant.abc.com:7077")

val sparkSession = SparkSession.builder()
  .appName(AppConstants.AppName)
  .config(sparkConf)
  .enableHiveSupport()
  .getOrCreate()
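The console output below shows the driver registering its BlockManager as 192.168.0.38, while the master hostname resolves to 127.0.0.1. A variant of the above that pins the driver address is sketched here (an assumption on my part; spark.driver.host is not set in my original code):

val sparkConf = new SparkConf()
  .setAppName(AppConstants.AppName)
  .setMaster("spark://186590dbe5bd.ant.abc.com:7077")
  // Assumption: make the driver advertise the loopback address instead of
  // the LAN IP (192.168.0.38) so the master on 127.0.0.1 can reach it back.
  .set("spark.driver.host", "127.0.0.1")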
While executing the code from the IDE, I get the following exception in the console:
2018-10-04 18:58:38,488 INFO [main] storage.BlockManagerMaster (Logging.scala:logInfo(54)) - Registering BlockManager BlockManagerId(driver, 192.168.0.38, 56083, None)
2018-10-04 18:58:38,491 ERROR [main] spark.SparkContext (Logging.scala:logError(91)) - Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
………
2018-10-04 18:58:38,496 INFO [main] spark.SparkContext (Logging.scala:logInfo(54)) - SparkContext already stopped.
2018-10-04 18:58:38,492 INFO [dispatcher-event-loop-3] scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
Exception in thread "main" java.lang.NullPointerException
at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
………
Logs from /logs/master show the following error:
18/10/04 18:58:18 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException: org.apache.spark.rpc.RpcEndpointRef; local class incompatible: stream classdesc serialVersionUID = 1835832137613908542, local class serialVersionUID = -1329125091869941550
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
…………
Spark Versions:
Spark: spark-2.3.1-bin-hadoop2.7
Build dependencies:
Scala: 2.11
Spark-hive: 2.2.2
Maven: org.spark-project.hive:hive-metastore 1.x
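The master-log error above is a serialVersionUID mismatch on org.apache.spark.rpc.RpcEndpointRef, which usually means the classes on the driver classpath come from a different Spark release than the one the cluster runs. A minimal sketch of the build dependencies with every Spark artifact pinned to the cluster version (assuming sbt; the spark-hive 2.2.2 entry above differs from the cluster's 2.3.1):

scalaVersion := "2.11.12"

// Sketch: keep all Spark modules at the same version as the standalone
// cluster (spark-2.3.1-bin-hadoop2.7); mixing 2.2.x and 2.3.x artifacts
// can produce exactly this kind of InvalidClassException.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.3.1",
  "org.apache.spark" %% "spark-sql"  % "2.3.1",
  "org.apache.spark" %% "spark-hive" % "2.3.1"
)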
What changes should I make to connect to Spark remotely? Thanks.
Complete logs: Console.log, Spark Master-Node.log