Is it possible to append a value to the PYTHONPATH of a worker in Spark?
I know it is possible to go to each worker node, configure the spark-env.sh file and do it there, but I want a more flexible approach.
I am trying to use the setExecutorEnv method, but with no success:
from pyspark import SparkConf

conf = SparkConf().setMaster("spark://192.168.10.11:7077") \
    .setAppName('myname') \
    .set("spark.cassandra.connection.host", "192.168.10.11") \
    .setExecutorEnv('PYTHONPATH', '$PYTHONPATH:/custom_dir_that_I_want_to_append/')
It creates a pythonpath environment variable on each executor, forces the name to lowercase, and does not interpret $PYTHONPATH in order to append to the existing value.
I end up with two different environment variables:

pythonpath : $PYTHONPATH:/custom_dir_that_I_want_to_append
PYTHONPATH : /old/path/to_python

The first one is dynamically created and the second one already existed before.
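One way to see this (a minimal sketch, assuming the conf above and a reachable cluster) is to read the executor's environment from inside a task:

import os
from pyspark import SparkContext

sc = SparkContext(conf=conf)
# Collect every variable whose name is 'pythonpath' in any casing,
# as seen by the Python worker process that runs the task.
print(sc.parallelize([0]).map(
    lambda _: {k: v for k, v in os.environ.items() if k.lower() == 'pythonpath'}
).collect())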
Does anyone know how to do it?
I figured it out myself...

The problem is not with Spark, but with ConfigParser.

Based on this answer, I fixed the ConfigParser to always preserve case.
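For reference, the usual fix (a sketch of what that answer suggests, using the Python 3 configparser module) is to override optionxform, which lowercases option names by default:

import configparser

config = configparser.ConfigParser()
# optionxform defaults to str.lower; replacing it with str keeps
# option names such as PYTHONPATH exactly as written in the file.
config.optionxform = str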
After this, I found out that the default Spark behavior is to append the value to an existing worker environment variable when a variable with the same name already exists.

So it is not necessary to reference $PYTHONPATH with the dollar sign at all.
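Given that behavior, the configuration reduces to passing only the directory to append (a sketch, reusing the values from the question):

from pyspark import SparkConf

conf = SparkConf().setMaster("spark://192.168.10.11:7077") \
    .setAppName('myname') \
    .set("spark.cassandra.connection.host", "192.168.10.11") \
    .setExecutorEnv('PYTHONPATH', '/custom_dir_that_I_want_to_append/')

# Spark appends this to the PYTHONPATH the worker already has,
# so no $PYTHONPATH prefix is needed.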