Getting application ID from SparkR to create Spark

Posted 2019-07-13 05:14

From the SparkR shell, I'd like to generate a link to the Spark UI while running in YARN mode. Normally the Spark UI is on port 4040, but in YARN mode it is apparently served at something like [host]:9046/proxy/application_1234567890123_0001/, where the last part of the path is the unique applicationId.

Other SO answers show how to get the applicationID for the Scala and Python shells. How do we get the applicationID from SparkR?

As a stab in the dark I tried SparkR:::callJMethod(sc, "applicationId"), but it didn't work.

I also tried something along the lines of system("yarn application -list"), but that doesn't seem to work from RStudio and has other limitations.

2 answers
做个烂人
#2 · 2019-07-13 05:48

After creating the Spark session, you can get the Spark application ID like this:

print(sparkR.conf("spark.app.id"))
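
To turn that ID into the link the question asks for, a minimal sketch could look like the following. The helper name spark_ui_url, the http:// scheme, and the host name in the usage comment are assumptions for illustration; the default port 9046 and the /proxy/<applicationId>/ path are taken from the question and may differ on your cluster.

```r
# Hypothetical helper: build a YARN proxy URL for the Spark UI.
# proxy_host and proxy_port are placeholders supplied by the caller;
# they are not discovered from the cluster.
spark_ui_url <- function(app_id, proxy_host, proxy_port = 9046) {
  # Expect a single, non-empty application id string.
  stopifnot(is.character(app_id), length(app_id) == 1, nzchar(app_id))
  sprintf("http://%s:%d/proxy/%s/", proxy_host, as.integer(proxy_port), app_id)
}

# Usage with an active SparkR session (Spark 2.x or later).
# sparkR.conf() returns a list, so unlist() extracts the plain string:
# app_id <- unlist(sparkR.conf("spark.app.id"), use.names = FALSE)
# spark_ui_url(app_id, "resourcemanager.example.com")
```

The helper is kept separate from the Spark call so the URL format can be adjusted without a running session.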
骚年 ilove
#3 · 2019-07-13 05:51

You can directly follow the link from the YARN web UI to get to the Spark UI. From the YARN web UI at port 8088 you can click on 'Running Applications' and that should show you a link to the Application status page.

If you want to use callJMethod to get the application id you can use something like SparkR:::callJMethod(SparkR:::callJMethod(sc, "sc"), "applicationId").

The nested call to sc is needed because sc is a JavaSparkContext handle, and applicationId is only available on the underlying Scala SparkContext.
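
The nested call can be wrapped in a small function, as a sketch. The name get_application_id is hypothetical, and SparkR:::callJMethod is an internal, unexported function, so this may break between SparkR versions. It assumes jsc is a JavaSparkContext handle such as the sc from the question.

```r
# Hypothetical wrapper around the nested callJMethod call.
# jsc: a JavaSparkContext handle (assumed; e.g. the shell's `sc`).
get_application_id <- function(jsc) {
  # Step 1: unwrap the JavaSparkContext to reach the Scala SparkContext.
  scala_sc <- SparkR:::callJMethod(jsc, "sc")
  # Step 2: applicationId is defined on the Scala SparkContext.
  SparkR:::callJMethod(scala_sc, "applicationId")
}

# Usage from a SparkR shell where `sc` is defined:
# get_application_id(sc)
```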
