Number of Executors in Spark Local Mode

Published 2020-03-02 03:34

Question:

So I am running a Spark job in local mode, using the following command:

spark-submit --master local[*] --driver-memory 256g --class main.scala.mainClass target/scala-2.10/spark_proj-assembly-1.0.jar 0 large.csv 100 outputFolder2 10

I am running this on a machine with 32 cores and 256 GB of RAM. When creating the conf, I use the following code:

val conf = new SparkConf().setMaster("local[*]").setAppName("My App")

Now I know that in local mode Spark runs everything inside a single JVM, but does that mean it launches only one driver and uses it as the executor as well? My timeline shows one executor driver added, and when I go to the Executors page, there is just one executor with all 32 cores assigned to it.

Is this the default behavior? I was expecting Spark to launch one executor per core instead of a single executor that gets all the cores. If someone can explain this behavior, that would be great.

Answer 1:

Is this the default behavior?

In local mode, the driver and executor are, as you've said, created inside a single JVM process. What you see isn't a separate executor; it is a view of how many cores your job has at its disposal. When running under local mode, you should normally see only the driver in the Executors view.
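To make the core count concrete, here is a small illustrative sketch (not Spark's actual source, though Spark parses the master URL in a similar spirit) of how a master string like `local`, `local[4]`, or `local[*]` maps to the number of cores that single in-process executor gets:

```scala
// Illustrative sketch: mapping a "local" master URL to a core count.
// "local"     -> 1 core
// "local[N]"  -> N cores
// "local[*]"  -> all cores the JVM can see
object LocalMasterSketch {
  private val LocalN = """local\[(\*|\d+)\]""".r

  def coresFor(master: String): Int = master match {
    case "local"     => 1
    case LocalN("*") => Runtime.getRuntime.availableProcessors
    case LocalN(n)   => n.toInt
    case other       => sys.error(s"not a local master URL: $other")
  }
}
```

So with `local[*]` on a 32-core machine, the one executor is assigned all 32 cores, which matches what the Executors page shows.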

If you look at the code for LocalSchedulerBackend, you'll see the following comment:

/**
 * Used when running a local version of Spark where the executor, backend, and master all run in
 * the same JVM. It sits behind a [[TaskSchedulerImpl]] and handles launching tasks on a single
 * Executor (created by the [[LocalSchedulerBackend]]) running locally.
 */
So there is a single executor, living in the same JVM as the driver, which handles all tasks.