Why is the number of cores for the driver and executors not reflected in the RM and Spark UIs?

Posted 2020-07-22 17:58

Question:

I deploy a Spark job in cluster mode with the following settings (a sketch of the corresponding configuration properties follows the list):

Driver cores - 1
Executor cores - 2
Number of executors - 2
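
For illustration, here is a minimal Scala sketch of which Spark properties those three numbers map to. The application name is hypothetical, and in cluster mode these resource settings are normally passed at submit time (for example via the --driver-cores, --executor-cores and --num-executors flags of spark-submit), so setting them in code as below is only to show the property names.

    import org.apache.spark.sql.SparkSession

    // Hypothetical sketch of the resource request described above.
    val spark = SparkSession.builder()
      .appName("core-count-demo")               // hypothetical application name
      .config("spark.driver.cores", "1")        // Driver cores - 1
      .config("spark.executor.cores", "2")      // Executor cores - 2
      .config("spark.executor.instances", "2")  // Number of executors - 2
      .getOrCreate()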

My understanding is that this application should occupy 5 cores in the cluster (4 executor cores and 1 driver core), but I don't observe this in the RM and Spark UIs.

  1. On the Resource Manager UI, I see only 4 cores used for this application.
  2. Even in the Spark UI (reached by clicking the ApplicationMaster URL from the RM), under the Executors tab, the driver's cores are shown as zero.

Am I missing something?

The cluster manager is YARN.

Answer 1:

My understanding is that this application should occupy 5 cores in the cluster (4 executor cores and 1 driver core)

That's the ideal situation, in which YARN could give you 5 cores out of the CPUs it manages.

but I don't observe this in the RM and Spark UIs.

Since that ideal situation does not always occur, it's often better to take as many cores as YARN can currently give so that the Spark application can start at all.

Spark could simply wait indefinitely for the requested cores, but that would not always be to your liking, would it?

That's why Spark on YARN has an extra check (aka minRegisteredRatio): by default, at least 80% of the requested resources must have registered before the application starts executing tasks. You can use the spark.scheduler.minRegisteredResourcesRatio Spark property to control that ratio. That would explain why you see fewer cores in use than requested.
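
If you wanted the application to wait for everything it asked for, a minimal sketch would be to raise the ratio to 1.0. The application name below is hypothetical, and like the other resource settings this property is usually passed at submit time (e.g. --conf spark.scheduler.minRegisteredResourcesRatio=1.0).

    import org.apache.spark.sql.SparkSession

    // Hypothetical sketch: require 100% of the requested resources to register
    // before task scheduling begins, instead of the 80% YARN default.
    val spark = SparkSession.builder()
      .appName("wait-for-all-resources")  // hypothetical application name
      .config("spark.scheduler.minRegisteredResourcesRatio", "1.0")
      .getOrCreate()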

Quoting the official Spark documentation:

spark.scheduler.minRegisteredResourcesRatio

Default: 0.8 for YARN mode

The minimum ratio of registered resources (registered resources / total expected resources) (resources are executors in yarn mode, CPU cores in standalone mode and Mesos coarse-grained mode ['spark.cores.max' value is total expected resources for Mesos coarse-grained mode]) to wait for before scheduling begins. Specified as a double between 0.0 and 1.0. Regardless of whether the minimum ratio of resources has been reached, the maximum amount of time it will wait before scheduling begins is controlled by config spark.scheduler.maxRegisteredResourcesWaitingTime.
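
To make the quoted condition concrete, here is a small, simplified model of it in Scala. This is not Spark's internal code; it just reuses the YARN default ratio, the 30 s default of spark.scheduler.maxRegisteredResourcesWaitingTime, and the two-executor request from the question: scheduling begins once the registered ratio reaches the minimum, or once the maximum waiting time has elapsed, whichever comes first.

    // Simplified model of the quoted rule (not Spark's actual implementation).
    val minRegisteredRatio = 0.8    // spark.scheduler.minRegisteredResourcesRatio (YARN default)
    val maxWaitingTimeMs   = 30000L // spark.scheduler.maxRegisteredResourcesWaitingTime (30s default)

    def schedulingCanStart(registered: Int, expected: Int, elapsedMs: Long): Boolean = {
      val ratio = registered.toDouble / expected
      ratio >= minRegisteredRatio || elapsedMs >= maxWaitingTimeMs
    }

    // With 2 executors expected and only 1 registered after 5 seconds:
    // 1.0 / 2 = 0.5 < 0.8 and 5000 ms < 30000 ms, so Spark keeps waiting.
    println(schedulingCanStart(registered = 1, expected = 2, elapsedMs = 5000L))  // false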