I want to run a Spark script and then drop into an IPython shell to interactively examine the data.
Running either:
$ IPYTHON=1 pyspark --master local[2] myscript.py
or
$ IPYTHON=1 spark-submit --master local[2] myscript.py
exits IPython once the script finishes.
This seems really simple, but I can't find how to do it anywhere.
If you launch the IPython shell with:
$ IPYTHON=1 pyspark --master local[2]
you can do:
>>> %run myscript.py
and all variables will stay in the workspace. You can also debug step by step with:
>>> %run -d myscript.py
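For illustration, here is roughly what %run's variable-keeping behavior looks like outside IPython, sketched with the standard-library runpy module (the script contents below are hypothetical):

```python
import os
import runpy
import tempfile

# Write a hypothetical myscript.py whose variables we want to keep.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("data = [1, 2, 3]\ntotal = sum(data)\n")
    path = f.name

# runpy.run_path executes the file and returns its final namespace.
# IPython's %run similarly merges the script's namespace into the
# interactive workspace, so `data` and `total` stay available.
ns = runpy.run_path(path)
print(ns["total"])  # -> 6
os.unlink(path)
```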
Launch the IPython shell using IPYTHON=1 pyspark, then run execfile('/path/to/myscript.py'); that should run your script inside the shell and return you to it.
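Note that execfile() only exists in Python 2. If your pyspark shell runs Python 3, a rough equivalent that keeps the script's variables in the current session is exec() over the file contents (the script below is a hypothetical stand-in so the snippet is self-contained):

```python
import os

# Hypothetical stand-in for /path/to/myscript.py; in practice you
# would already have the script you want to run.
script_path = "myscript_demo.py"
with open(script_path, "w") as f:
    f.write("answer = 6 * 7\n")

# Python 3 replacement for execfile(script_path): executing the file's
# text against globals() keeps its variables in the current namespace.
with open(script_path) as f:
    exec(f.read(), globals())

print(answer)  # -> 42
os.remove(script_path)
```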