I have a question about Apache Spark running on YARN in cluster mode. According to this thread, Spark itself does not have to be installed on every (worker) node in the cluster. My problem is with the Spark Executors: in general, YARN, or rather the Resource Manager, decides about resource allocation, so Spark Executors may be launched on any (worker) node in the cluster. But then, how can YARN launch the Spark Executors if Spark is not installed on the (worker) nodes?
At a high level, when a Spark application is launched on YARN, the Spark driver passes serialized actions (code) to the executors, which run them against their partitions of the data.
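To make that flow concrete, here is a minimal sketch of a word-count job submitted in YARN cluster mode. The class name, jar name, and HDFS paths are assumptions for illustration, not taken from the question:

    // A minimal sketch of the mechanism described above.
    //
    // The application would be submitted from a gateway machine, e.g.:
    //   spark-submit --master yarn --deploy-mode cluster \
    //     --class example.WordCount wordcount.jar
    // YARN then allocates containers on arbitrary NodeManagers; the Spark jars
    // reach those containers through YARN's distributed cache (for example via
    // spark.yarn.archive or spark.yarn.jars), so Spark does not have to be
    // pre-installed on the worker nodes.

    package example

    import org.apache.spark.sql.SparkSession

    object WordCount {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("WordCount").getOrCreate()
        val sc = spark.sparkContext

        // The functions passed to flatMap/map/reduceByKey below are closures.
        // The driver serializes them and ships them to the executors as part
        // of each task; the executors deserialize and run them against the
        // partitions they hold.
        val counts = sc.textFile("hdfs:///tmp/input.txt")   // assumed input path
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.saveAsTextFile("hdfs:///tmp/output")          // assumed output path
        spark.stop()
      }
    }

In other words, the only thing the worker nodes need at runtime is a JVM and the jars that YARN localizes into each container; the application code itself arrives as serialized tasks from the driver.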
Edit: (2017-01-04)