I have a cluster that I can launch successfully, or at least that is what the web UI shows, where I see this information:
URL: spark://Name25:7077
REST URL: spark://Name25:6066 (cluster mode)
Alive Workers: 10
Cores in use: 192 Total, 0 Used
Memory in use: 364.0 GB Total, 0.0 B Used
Applications: 0 Running, 5 Completed
Drivers: 0 Running, 5 Completed
Status: ALIVE
I use spark-submit to run my application. If I submit it like this:
./bin/spark-submit --class myapp.Main --master spark://Name25:7077 --deploy-mode cluster /home/lookupjar/myapp-0.0.1-SNAPSHOT.jar /home/etud500.csv /home/
I get this message:
Running Spark using the REST application submission protocol.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/08/31 15:55:16 INFO RestSubmissionClient: Submitting a request to launch an application in spark://Name25:7077.
16/08/31 15:55:27 WARN RestSubmissionClient: Unable to connect to server spark://Name25:7077.
Warning: Master endpoint spark://Name25:7077 was not a REST server. Falling back to legacy submission gateway instead.
16/08/31 15:55:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
and if I submit it like this:
./bin/spark-submit --class myapp.Main --master spark://Name25:6066 --deploy-mode cluster /home/lookupjar/myapp-0.0.1-SNAPSHOT.jar /home//etud500.csv /home/result
I get this message:
Running Spark using the REST application submission protocol.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/08/31 16:59:06 INFO RestSubmissionClient: Submitting a request to launch an application in spark://Name25:6066.
16/08/31 16:59:06 INFO RestSubmissionClient: Submission successfully created as driver-20160831165906-0004. Polling submission state...
16/08/31 16:59:06 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20160831165906-0004 in spark://Name25:6066.
16/08/31 16:59:06 INFO RestSubmissionClient: State of driver driver-20160831165906-0004 is now RUNNING.
16/08/31 16:59:06 INFO RestSubmissionClient: Driver is running on worker worker-20160831143117-10.0.10.48-38917 at 10.0.10.48:38917.
16/08/31 16:59:06 INFO RestSubmissionClient: Server responded with CreateSubmissionResponse: {
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20160831165906-0004",
  "serverSparkVersion" : "2.0.0",
  "submissionId" : "driver-20160831165906-0004",
  "success" : true
}
I think the submission succeeded, but my application should write 3 outputs under the given path (/home/result), because my code does:
String path = args[1];
rdd1.saveAsTextFile(path + "/rdd1");
rdd2.saveAsTextFile(path + "/rdd2");
rdd3.saveAsTextFile(path + "/rdd3");
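For context, here is a minimal, self-contained sketch of what my Main class does (the transformations producing rdd1/rdd2/rdd3 are simplified placeholders, not my real logic; only the argument handling and the saveAsTextFile calls match the code above):

package myapp;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class Main {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("myapp");
        JavaSparkContext sc = new JavaSparkContext(conf);

        String inputPath = args[0]; // e.g. /home/etud500.csv
        String path = args[1];      // e.g. /home/result

        JavaRDD<String> lines = sc.textFile(inputPath);

        // placeholder transformations; the real job computes something else
        JavaRDD<String> rdd1 = lines.filter(line -> !line.isEmpty());
        JavaRDD<String> rdd2 = rdd1.map(String::toUpperCase);
        JavaRDD<String> rdd3 = rdd2.distinct();

        // each call should create one output directory under the result path
        rdd1.saveAsTextFile(path + "/rdd1");
        rdd2.saveAsTextFile(path + "/rdd2");
        rdd3.saveAsTextFile(path + "/rdd3");

        sc.stop();
    }
}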
Question 1: Why does it ask me to use "spark://Name25:6066" rather than "spark://Name25:7077"? According to the Spark website, we are supposed to use :7077.
Question 2: If the submission is reported as successful and the application shows as completed, why don't I find the 3 output folders?