Difference between extends from App and object cont

Posted 2020-04-17 06:00

Question:

I wrote a sample Spark/Scala program that builds a list of JSON elements from a DataFrame. When I run it from a main method, it prints an empty list, but when I run it from an object that extends App, the list contains the records. What is the difference between extending App and defining a main method in a Scala object?

import java.util
import org.apache.spark.sql.SparkSession

object DfToMap {
  def main(args: Array[String]): Unit = {
    val spark: SparkSession = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (8, "bat"),
      (64, "mouse"),
      (27, "horse")
    ).toDF("number", "word")

    val json = df.toJSON
    // Attempt to gather the JSON rows into a list declared inside main
    val jsonArray = new util.ArrayList[String]()
    json.foreach(f => jsonArray.add(f))
    print(jsonArray)
  }
}

This prints an empty list, but the following program gives me a list with the records:

import java.util
import org.apache.spark.sql.SparkSession

object DfToMap extends App {
  val spark: SparkSession = SparkSession.builder()
    .appName("Rnd")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  val df = Seq(
    (8, "bat"),
    (64, "mouse"),
    (27, "horse")
  ).toDF("number", "word")

  val json = df.toJSON
  // Same attempt, but here the list is a member of the DfToMap object
  val jsonArray = new util.ArrayList[String]()
  json.foreach(f => jsonArray.add(f))
  print(jsonArray)
}

Answer 1:

TL;DR Neither snippet is a correct Spark program; one is just more incorrect than the other.

You've made two mistakes, both explained in the introductory Spark materials.

  • Due to its nature, Spark doesn't support applications extending App - Quick Start - Self-Contained Applications

    Note that applications should define a main() method instead of extending scala.App. Subclasses of scala.App may not work correctly.

  • Spark doesn't provide global shared memory, therefore modifying a global object in a closure is not supported - Spark Programming Guide - Understanding Closures
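
Putting the two points together, a minimal sketch of one way to get the list reliably: keep the main method, and instead of mutating a list from inside foreach, ask Spark to bring the rows back to the driver with collectAsList(). The object name DfToJsonList and the helper toJsonList are made up for this illustration and are not from the question. It also assumes the result is small enough to fit in driver memory.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DfToJsonList {
  // Hypothetical helper: returns every row of df as a JSON string.
  // collectAsList() moves the data to the driver, so nothing is mutated inside a closure.
  def toJsonList(df: DataFrame): java.util.List[String] =
    df.toJSON.collectAsList()

  // A plain main method, as the Quick Start recommends, rather than extending App.
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Rnd")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (8, "bat"),
      (64, "mouse"),
      (27, "horse")
    ).toDF("number", "word")

    val jsonList = toJsonList(df)
    println(jsonList)

    spark.stop()
  }
}
```

If the data is too large to collect, the usual alternatives are to write it out (for example with df.write.json) or to aggregate it first, so that only a small result reaches the driver.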