I'm trying to configure Apache Spark on macOS. All the online guides say to either download the Spark tarball and set up some environment variables, or to use brew install apache-spark and then set up some environment variables.
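For context, the tarball-based setup those guides describe boils down to something like this (illustrative paths only, not what I'm actually using):

export SPARK_HOME=~/spark-2.1.0-bin-hadoop2.7   # wherever the tarball was extracted
export PATH="$SPARK_HOME/bin:$PATH"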
I installed apache-spark using brew install apache-spark.
Running pyspark in a terminal gives me a Python prompt, which suggests that the installation was successful.
However, when I try to import pyspark in a Python file, I get an error saying ImportError: No module named pyspark.
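The same failure shows up with a plain one-liner, for example:

python -c "import pyspark"
# prints: ImportError: No module named pyspark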
The strangest thing, which I'm not able to understand, is how it is able to start a pyspark REPL but not able to import the module into Python code.
I also tried pip install pyspark, but the module is still not recognized.
In addition to installing apache-spark with Homebrew, I've set up the following environment variables:
# Set JAVA_HOME from the JDK reported by java_home
if which java > /dev/null; then export JAVA_HOME=$(/usr/libexec/java_home); fi

# Point SPARK_HOME at the brew-installed Spark and run locally on 2 cores
if which pyspark > /dev/null; then
  export SPARK_HOME="/usr/local/Cellar/apache-spark/2.1.0/libexec/"
  export PYSPARK_SUBMIT_ARGS="--master local[2]"
fi
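One thing I have not added is the PYTHONPATH setup that some of the tarball-based guides mention; if that turns out to be the missing piece, it would look roughly like this (the py4j filename is my guess for Spark 2.1.0 and should be checked against what's actually under $SPARK_HOME/python/lib):

# Assumed filename: Spark 2.1.0 ships py4j-0.10.4-src.zip; verify before using
export PYTHONPATH="$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH"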
Please suggest what exactly is missing in my setup to be able to run pyspark code on my local machine.