Specifying custom profilers for PySpark running Spark 2.0

Posted 2019-09-15 15:01

Question:

I would like to know how to specify a custom profiler class in PySpark for Spark version 2+. Under 1.6, I know I can do so like this:

# MyProfiler here is my own subclass of pyspark.profiler.BasicProfiler
sc = SparkContext('local', 'test', profiler_cls=MyProfiler)

but when I create the SparkSession in 2.0 I don't explicitly have access to the SparkContext. Can someone please advise how to do this for Spark 2.0+?

Answer 1:

SparkSession can be initialized with an existing SparkContext, for example:

from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.profiler import BasicProfiler

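# Build the SparkContext with the desired profiler class first,
# then wrap it in a SparkSession.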
spark = SparkSession(SparkContext('local', 'test', profiler_cls=BasicProfiler))
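For completeness, here is a minimal sketch of wiring a custom profiler into Spark 2.x. The MyCustomProfiler class is a hypothetical example, not something from the question; it just subclasses BasicProfiler. Two assumptions worth stating: the profiler is only invoked when spark.python.profile is set to true in the SparkConf, and profiler_cls must be the class object itself, not a string.

from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession
from pyspark.profiler import BasicProfiler

class MyCustomProfiler(BasicProfiler):
    """Hypothetical custom profiler: prints a header before the standard stats."""
    def show(self, id):
        print("Custom profile for RDD %s:" % id)
        super(MyCustomProfiler, self).show(id)

# Profiling has to be switched on explicitly, otherwise profiler_cls is ignored.
conf = SparkConf().set("spark.python.profile", "true")
sc = SparkContext('local', 'test', conf=conf, profiler_cls=MyCustomProfiler)
spark = SparkSession(sc)

# Run something so the workers collect profile data, then dump it.
spark.sparkContext.parallelize(range(1000)).map(lambda x: x * 2).count()
sc.show_profiles()

Note that this only works when you construct the SparkContext yourself: profiler_cls is a constructor argument, so it cannot be changed on a SparkContext that has already been created (for example by a notebook or by SparkSession.builder.getOrCreate()).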