I am creating an HDInsight cluster on Azure according to this description.
Now I would like to set a custom Spark parameter, for example
spark.yarn.appMasterEnv.PYSPARK3_PYTHON or spark_daemon_memory, at the time of cluster provisioning.
Is it possible to set this up using Data Factory or an Automation Account? I cannot find any example of doing this.
Thanks
You can use sparkConfig in Data Factory to pass these configurations to Spark.
For example:
"typeProperties": {
...
"sparkConfig": {
"spark.submit.pyFiles": "/dist/package_name-1.0.0-py3.5.egg",
"spark.yarn.appMasterEnv.PYSPARK_PYTHON": "/usr/bin/anaconda/envs/py35/bin/python3"
}
}
This way you can specify any of the Spark configuration properties that are listed in the docs here.
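For context, here is a minimal sketch of how sparkConfig might sit inside a full HDInsightSpark activity definition, assuming the Data Factory v2 Spark activity schema; the linked service names, rootPath, and entryFilePath are placeholders, not values from your setup:

{
    "name": "SparkActivity",
    "type": "HDInsightSpark",
    "linkedServiceName": {
        "referenceName": "MyHDInsightLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "sparkJobLinkedService": {
            "referenceName": "MyStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "rootPath": "adfspark",
        "entryFilePath": "main.py",
        "sparkConfig": {
            "spark.yarn.appMasterEnv.PYSPARK3_PYTHON": "/usr/bin/anaconda/envs/py35/bin/python3"
        },
        "getDebugInfo": "Failure"
    }
}

The sparkConfig entries are applied to the Spark job that this activity submits on the cluster.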