I did my homework and read the documentation at https://spark.apache.org/docs/latest/configuration.html.
In spark-folder/conf/spark-env.sh:
- SPARK_DRIVER_MEMORY, Memory for Master (e.g. 1000M, 2G) (Default: 512 MB)
- SPARK_EXECUTOR_MEMORY, Memory per Worker (e.g. 1000M, 2G) (Default: 1G)
- SPARK_WORKER_MEMORY, to set how much total memory workers have to give executors (e.g. 1000m, 2g)
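For example, I could set all three in spark-env.sh like this (values picked only for illustration):

    # spark-folder/conf/spark-env.sh
    export SPARK_DRIVER_MEMORY=2g     # memory for the driver process
    export SPARK_EXECUTOR_MEMORY=2g   # memory per executor
    export SPARK_WORKER_MEMORY=8g     # total memory a worker can give to executors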
What is the relationship between the above 3 parameters?
As I understand it, SPARK_DRIVER_MEMORY is the maximum memory the master node/process can request. But what about the driver in a multi-machine setup, e.g. 1 master machine and 2 worker machines: should the worker machines also have some memory available for the Spark driver?
SPARK_EXECUTOR_MEMORY and SPARK_WORKER_MEMORY look the same to me, just with different names. Could this also be explained?
Thank you very much.
First, you should know that 1 Worker (you can say 1 machine or 1 Worker node) can launch multiple Executors (or multiple Worker Instances, the term used in the docs).
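For example, in standalone mode you can run more than one worker process per machine via SPARK_WORKER_INSTANCES in spark-env.sh (a minimal sketch; the values are only illustrative):

    # spark-folder/conf/spark-env.sh (standalone mode)
    export SPARK_WORKER_INSTANCES=2   # launch two worker processes on this machine
    export SPARK_WORKER_CORES=4       # cores each worker can hand out to executors
    export SPARK_WORKER_MEMORY=8g     # memory each worker can hand out to executors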
SPARK_WORKER_MEMORY is only used in standalone deploy mode.

SPARK_EXECUTOR_MEMORY is used in YARN deploy mode.

In Standalone mode, you set SPARK_WORKER_MEMORY to the total amount of memory that can be used on one machine (all Executors on this machine) to run your Spark applications.
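As a concrete sketch (illustrative values only): with the settings below, one machine offers 8g in total, so it could host up to four 2g Executors.

    # spark-env.sh on a standalone worker machine
    export SPARK_WORKER_MEMORY=8g     # total memory this worker can give to executors
    export SPARK_EXECUTOR_MEMORY=2g   # each executor gets 2g, so at most 4 executors fit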
In contrast, in YARN mode, you set SPARK_EXECUTOR_MEMORY to the memory of one Executor.

SPARK_DRIVER_MEMORY is used in YARN deploy mode, specifying the memory for the Driver that runs your application and communicates with the Cluster Manager.
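For example, on YARN these are usually passed per application through spark-submit rather than spark-env.sh (a sketch; the application jar and values are only illustrative):

    # YARN deploy mode
    # --driver-memory   corresponds to SPARK_DRIVER_MEMORY   (memory for the Driver)
    # --executor-memory corresponds to SPARK_EXECUTOR_MEMORY (memory of one Executor)
    spark-submit \
      --master yarn \
      --driver-memory 2g \
      --executor-memory 4g \
      --num-executors 10 \
      your-app.jar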