How can I run Pyspark interactively in Jupyter usi

Posted 2019-09-20 10:46

Question:

I have succeeded in running PySpark in Jupyter in local mode, using the second method mentioned in this blog. Here is the code:

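import findspark
# Locate the Spark installation and add pyspark to sys.path
findspark.init()
from pyspark import SparkContext
# Master "local" runs Spark in this process with a single worker thread
sc = SparkContext("local", "First App")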

I want to run it interactively in YARN-client mode. How can I do that? Going further, how can I run it in other modes, e.g. standalone mode and YARN-cluster mode?

Answer 1:

According to the docs:

The master URL accepts the value yarn; the cluster location is then found based on the HADOOP_CONF_DIR or YARN_CONF_DIR variable.

So I can simply use: sc = SparkContext("yarn-client", "First App")
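
For the other modes asked about, here is a minimal sketch of how the master setting might be chosen per mode. The helper names `master_settings` and `make_context`, and the host `master-node`, are hypothetical and only for illustration. It assumes HADOOP_CONF_DIR or YARN_CONF_DIR is already set in the environment the Jupyter kernel starts from. Note that from Spark 2.0 on, the "yarn-client" master string is deprecated in favor of master "yarn" with the deploy mode set separately, which is what the sketch uses. Cluster deploy mode puts the driver inside the cluster, so it does not fit an interactive notebook session.

```python
def master_settings(mode, host=None, port=7077):
    """Return Spark config entries for a run mode (hypothetical helper)."""
    if mode == "local":
        # Same master string as in the question's local example
        return {"spark.master": "local"}
    if mode == "standalone":
        if host is None:
            raise ValueError("standalone mode needs the master host")
        # 7077 is the default port of a standalone Spark master
        return {"spark.master": f"spark://{host}:{port}"}
    if mode == "yarn-client":
        # The driver stays in the notebook process; executors run on YARN
        return {"spark.master": "yarn", "spark.submit.deployMode": "client"}
    if mode == "yarn-cluster":
        # The driver would run inside the cluster, not in the notebook
        raise ValueError("cluster deploy mode cannot host an interactive driver")
    raise ValueError(f"unknown mode: {mode!r}")


def make_context(mode, app_name="First App", **kwargs):
    """Build a SparkContext for the given mode (hypothetical helper)."""
    # Imported lazily so master_settings works without pyspark installed
    import findspark
    findspark.init()
    from pyspark import SparkConf, SparkContext

    settings = master_settings(mode, **kwargs)
    conf = SparkConf().setAppName(app_name).setAll(list(settings.items()))
    return SparkContext(conf=conf)
```

In a notebook cell this could be used as sc = make_context("yarn-client"), or sc = make_context("standalone", host="master-node") for a standalone cluster. Only one SparkContext can be active per kernel, so call sc.stop() before switching modes.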