How can I get the current SparkSession in any place?

Posted 2020-02-22 04:31

Question:

I have created a session in the main() function, like this:

val sparkSession = SparkSession.builder.master("local[*]").appName("Simple Application").getOrCreate()

Now if I want to configure the application or access the properties, I can use the local variable sparkSession in the same function.

What if I want to access this sparkSession elsewhere in the same project, for example in project/module/.../.../xxx.scala? What should I do?

Answer 1:

Once a session has been created (anywhere), you can safely call:

SparkSession.builder().getOrCreate()

to get the same session anywhere in the code, as long as that session is still alive. getOrCreate() returns an existing session (the current thread's active session if there is one, otherwise the global default) instead of building a new one, so unless the session was stopped or crashed, you'll get the same one.
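As a minimal sketch of this, assuming a hypothetical helper object ReportJob that lives in another file of the same project, the helper can look the session up itself instead of receiving it from main(). The names ReportJob, rowCount and Main are illustrative only and do not come from the question.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper, e.g. in project/module/ReportJob.scala.
object ReportJob {
  def rowCount(): Long = {
    // Picks up the session created in main() if it is still alive.
    // If no session exists and no master is configured, this call fails.
    val spark = SparkSession.builder().getOrCreate()
    spark.range(5).count()
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("Simple Application")
      .getOrCreate()

    // A second getOrCreate() hands back the very same object.
    assert(SparkSession.builder().getOrCreate() eq sparkSession)

    println(ReportJob.rowCount())
    sparkSession.stop()
  }
}
```

Note that the helper only works reliably when main() (or something else) has already created the session, because the helper itself supplies no master or application name.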



Answer 2:

Since Spark 2.2.0 you can access the active SparkSession through a method on the SparkSession companion object:

/**
 * Returns the active SparkSession for the current thread, returned by the builder.
 *
 * @since 2.2.0
 */
def getActiveSession: Option[SparkSession] = Option(activeThreadSession.get)

or the default SparkSession through:

/**
 * Returns the default SparkSession that is returned by the builder.
 *
 * @since 2.2.0
 */
def getDefaultSession: Option[SparkSession] = Option(defaultSession.get)
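Because both methods return an Option, the caller has to decide what happens when no session exists. The sketch below shows one way to combine them; the wrapper object SessionLookup and its method current are hypothetical names introduced for illustration, and the default-session accessor is assumed to be SparkSession.getDefaultSession as in the Spark 2.2+ API.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical convenience wrapper, not part of Spark.
object SessionLookup {
  // Prefer the current thread's active session, fall back to the
  // global default, and fail loudly if neither has been set.
  def current: SparkSession =
    SparkSession.getActiveSession
      .orElse(SparkSession.getDefaultSession)
      .getOrElse(throw new IllegalStateException("No SparkSession has been created yet"))
}
```

Any class in the project could then write `val spark = SessionLookup.current`. Unlike getOrCreate(), this never creates a session silently, which can make a missing initialization easier to spot. Spark 2.4 and later also offer SparkSession.active, which, as far as I know, performs much the same lookup.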


Answer 3:

Suppose the SparkSession variable has been defined as:

val sparkSession = SparkSession.builder.master("local[*]").appName("Simple Application").getOrCreate()

This variable will only ever refer to one SparkSession, because it is a val. You can always pass it to other classes so that they can use it as well, for example:

val newClassCall = new NewClass(sparkSession)

Now you can use the same sparkSession in that new class as well.
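For completeness, here is a sketch of what the receiving side might look like. The answer does not show the body of NewClass, so the method numbers and the object MainWithInjection are assumptions made up for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical class that receives the session through its constructor.
class NewClass(sparkSession: SparkSession) {
  // Uses the injected session to build a small single-column DataFrame.
  def numbers(n: Long): DataFrame = sparkSession.range(n).toDF("value")
}

object MainWithInjection {
  def main(args: Array[String]): Unit = {
    val sparkSession = SparkSession.builder()
      .master("local[*]")
      .appName("Simple Application")
      .getOrCreate()

    // Hand the one session to the class that needs it.
    val newClassCall = new NewClass(sparkSession)
    newClassCall.numbers(3).show()

    sparkSession.stop()
  }
}
```

Passing the session explicitly keeps the dependency visible in the constructor signature, which tends to make such classes easier to test with a local session.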