I need to deploy a Spark Streaming application on a Linux server.
Can anyone provide the steps to deploy it, and explain what code modifications are required before deploying?
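import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;

class JavaKafkaWordCount11 {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf();
        sparkConf.set("spark.streaming.concurrentJobs", "20");
        // Batch interval of 1500 ms
        JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(1500));
        // Topic name -> number of receiver threads
        Map<String, Integer> topicMap = new HashMap<>();
        topicMap.put("TopicQueue", 20);
        JavaPairReceiverInputDStream<String, String> messages =
            KafkaUtils.createStream(jssc, "x.xx.xxx.xxx:2181", "1", topicMap);
        // Keep only the message value of each (key, value) pair
        JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
            public String call(Tuple2<String, String> tuple2) {
                return tuple2._2();
            }
        });
        lines.foreachRDD(rdd -> {
            if (rdd.count() > 0) {
                List<String> strArray = rdd.collect();
                // ...
            }
        });
    }
}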
Here are the steps:
Yes, there's only one step required before deploying, and it boils down to:
sbt package
which assumes you use sbt; for Java the build tool might be gradle, though. That just says you have to package your Spark application so it's ready for deployment.
Then spark-submit your packaged Spark application.
You may optionally start your cluster (e.g. Spark Standalone, Apache Mesos or Hadoop YARN), but it's not really needed since spark-submit uses local[*] as the master by default.
p.s. You're using Apache Kafka, so you have to have it up and running too (at the address your code connects to).

You can submit your job through spark-submit, like this:
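A minimal sketch of such a command, assuming the application has already been packaged into a jar. The jar path is a hypothetical placeholder, the main class name comes from the question, and the master is set to local[*] only to make the default explicit. The script prints the assembled command and runs it only if spark-submit is on the PATH.

```shell
# Placeholders: adjust to your own build output and cluster.
APP_JAR="target/my-streaming-app.jar"   # hypothetical path to the packaged jar
MAIN_CLASS="JavaKafkaWordCount11"       # main class from the question
MASTER="local[*]"                       # replace with your cluster's master URL if you have one

# Store the command in the positional parameters so each argument stays quoted.
set -- spark-submit --class "$MAIN_CLASS" --master "$MASTER" "$APP_JAR"

# Show the command for review, then run it only when spark-submit is installed.
printf '%s\n' "$*"
if command -v spark-submit >/dev/null 2>&1; then
  "$@"
fi
```

Note that the Kafka integration classes (KafkaUtils) are not part of the Spark distribution itself, so they probably need to be either bundled into the application jar or supplied with spark-submit's --packages option, using coordinates that match your Spark and Scala versions.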
For reference, follow this link: