I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error:
Exception: Java gateway process exited before sending the driver its port number
when sc = SparkContext() is called at startup. I have tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked at this question:
Spark + Python - Java gateway process exited before sending the driver its port number?
but it was never answered. Please help! Thanks.
I got this error fixed by using the code below. (I had already set up SPARK_HOME.) You may follow these simple steps from the eproblems website.
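The original snippet didn't survive here; a minimal sketch of a fix with that shape, assuming the missing piece is JAVA_HOME (the /usr/libexec/java_home lookup is macOS-specific), would be:

# point JAVA_HOME at an installed JDK; macOS resolves the path for you
export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=$JAVA_HOME/bin:$PATH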
For me, the answer was to add two 'Content Roots' in 'File' -> 'Project Structure' -> 'Modules' (in IntelliJ): the Spark Python sources and the bundled Py4J archive, typically $SPARK_HOME/python and $SPARK_HOME/python/lib/py4j-*-src.zip.
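Outside the IDE, a rough shell equivalent of those two roots, assuming the same paths, is to put them on PYTHONPATH:

# the py4j version here is a placeholder; check $SPARK_HOME/python/lib for the actual file name
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.9-src.zip:$PYTHONPATH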
If you are trying to run Spark without the Hadoop binaries, you might encounter the above-mentioned error. One solution is to:
1) download Hadoop separately,
2) add Hadoop to your PATH, and
3) add the Hadoop classpath to your Spark install.
The first two steps are trivial; the last is best done by adding the following line to $SPARK_HOME/conf/spark-env.sh on each Spark node (master and workers):
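Per the "Hadoop provided" documentation linked below, that line is:

export SPARK_DIST_CLASSPATH=$(hadoop classpath)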
For more info, also check: https://spark.apache.org/docs/latest/hadoop-provided.html
I use macOS, and I fixed the problem! Below is how I fixed it.
JDK 8 seems to work fine (https://github.com/jupyter/jupyter/issues/248).
So I checked my JDKs under /Library/Java/JavaVirtualMachines; I only had jdk-11.jdk in that path.
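A quick way to check yours on macOS:

ls /Library/Java/JavaVirtualMachines
/usr/libexec/java_home -V   # also lists the installed JDKs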
I downloaded JDK 8 (I followed the link above).
After this, I added a JAVA_HOME export (sketched below) to my ~/.bash_profile file. (You should check your jdk1.8 directory name.)
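The exact line wasn't preserved; assuming a typical install directory name (the version suffix here is hypothetical), it looks like:

# jdk1.8.0_202 is a placeholder; use the directory name you actually see
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home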
It works now! Hope this helps :)
After spending hours and hours trying many different solutions, I can confirm that the Java 10 SDK causes this error. On Mac, navigate to /Library/Java/JavaVirtualMachines, then run this command to uninstall Java JDK 10 completely:
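The command itself was dropped here; assuming the folder is named jdk-10.0.1.jdk (verify with ls first, since the exact name depends on the minor version), it would be:

cd /Library/Java/JavaVirtualMachines
sudo rm -rf jdk-10.0.1.jdk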
After that, download JDK 8 and the problem will be solved.
One solution is adding pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:
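For example (this is the same command the question already tried):

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"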
There is a change in python/pyspark/java_gateway.py that requires PYSPARK_SUBMIT_ARGS to include pyspark-shell if a PYSPARK_SUBMIT_ARGS variable is set by a user.