I'm using DataStax Enterprise 4.8.3 and trying to implement a Quartz-based application that submits Spark jobs remotely. During my research I came across the following links:
- Apache Spark Hidden REST API
- Spark feature - Provide a stable application submission gateway in standalone cluster mode
To test this approach, I ran the following command, taken from the first link above, directly in a shell on the master node (IP: "spark-master-ip") of my 2-node cluster:
curl -X POST http://spark-master-ip:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "myAppArgument1" ],
  "appResource" : "file:/home/local/sparkjob.jar",
  "clientSparkVersion" : "1.4.2",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.spark.job.Launcher",
  "sparkProperties" : {
    "spark.jars" : "file:/home/local/sparkjob.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled": "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://spark-master-ip:6066"
  }
}'
However, running this command returns an HTML response with the following text:
This Page Cannot Be Displayed
The system cannot communicate with the external server (spark-master-ip).
The Internet server may be busy, may be permanently down, or may be unreachable because of network problems.
Please check the spelling of the Internet address entered.
If it is correct, try this request later.
If you have questions, please contact your organization's network administrator and provide the codes shown below.
Date: Fri, 11 Dec 2015 13:19:15 GMT
Username:
Source IP: spark-master-ip
URL: POST http://spark-master-ip/v1/submissions/create
Category: Uncategorized URLs
Reason: UNKNOWN
Notification: GATEWAY_TIMEOUT
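One way to narrow this down is to repeat the request with verbose output and see which host actually answers. The sketch below assumes, without confirmation, that an HTTP proxy configured in the shell environment (for example via `http_proxy`) may be intercepting the call, since the page above does not look like a response from Spark itself. `build_payload`, `submit_job` and `print_proxy_env` are hypothetical helper names; the host, jar path and class name are the ones from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for re-running the submission with diagnostics.
# Host, port, jar path and main class are copied from the original command.
MASTER_HOST="spark-master-ip"
MASTER_REST_PORT=6066

# Print the JSON body of the CreateSubmissionRequest on stdout.
build_payload() {
  cat <<EOF
{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "myAppArgument1" ],
  "appResource" : "file:/home/local/sparkjob.jar",
  "clientSparkVersion" : "1.4.2",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.spark.job.Launcher",
  "sparkProperties" : {
    "spark.jars" : "file:/home/local/sparkjob.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled": "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://${MASTER_HOST}:${MASTER_REST_PORT}"
  }
}
EOF
}

# List proxy-related environment variables that curl would honour.
print_proxy_env() {
  env | grep -i '_proxy=' || echo "no proxy variables set"
}

# POST the payload with a verbose trace, bypassing any proxy for the
# master host (assumption: a proxy is what produced the HTML page).
submit_job() {
  build_payload | curl -v --noproxy "${MASTER_HOST}" \
    -X POST "http://${MASTER_HOST}:${MASTER_REST_PORT}/v1/submissions/create" \
    --header "Content-Type:application/json;charset=UTF-8" \
    --data @- \
    -w '\nHTTP status: %{http_code}\n'
}
```

After sourcing the script on the master node, running `print_proxy_env` followed by `submit_job` shows in the `-v` trace whether curl connects to port 6066 directly. If the HTML page no longer appears, a proxy setting was probably the cause; if it still does, the problem lies elsewhere.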