A previous question recommends sc.applicationId, but it is not present in PySpark, only in Scala.
So, how do I figure out the application ID (for YARN) of my PySpark process?
You could use the Java SparkContext object through the Py4J RPC gateway:
>>> sc._jsc.sc().applicationId()
u'application_1433865536131_34483'
Please note that sc._jsc is an internal variable and not part of the public API, so there is a (rather small) chance that it may change in the future.
I'll submit a pull request to add a public API call for this.
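For convenience, you can wrap this in a small helper (a minimal sketch; the function name get_application_id is my own, and it still relies on the internal sc._jsc handle, so the same caveat applies):

from pyspark import SparkContext

def get_application_id(sc):
    # Reach through the Py4J gateway to the underlying Java/Scala
    # SparkContext and ask it for the application ID. sc._jsc is an
    # internal attribute and may change in future Spark versions.
    return sc._jsc.sc().applicationId()

sc = SparkContext(appName="app-id-demo")
print(get_application_id(sc))  # e.g. u'application_1433865536131_34483'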
In Spark 1.6 (probably 1.5, according to @wladymyrov's comment on the other answer), sc.applicationId is available directly in PySpark:
In [1]: sc.applicationId
Out[1]: u'local-1455827907865'
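Once you have the ID, you can feed it to the YARN CLI, for example to pull the aggregated logs of your application (a sketch; it assumes you are running on YARN and that log aggregation is enabled on the cluster):

import subprocess

app_id = sc.applicationId  # e.g. u'application_1433865536131_34483'

# Fetch the aggregated container logs for this application.
# This only returns logs once YARN has finished aggregating them.
subprocess.call(["yarn", "logs", "-applicationId", app_id])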