I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error:
Exception: Java gateway process exited before sending the driver its port number
when sc = SparkContext() is called at startup. I have tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked here:
Spark + Python - Java gateway process exited before sending the driver its port number?
but the question has never been answered. Please help! Thanks.
Spark is very picky about the Java version you use. It is highly recommended that you use Java 1.8 (the open-source AdoptOpenJDK 8 works well too). After installing it, set JAVA_HOME in your shell environment. If you use Mac/Linux:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
export PATH=$JAVA_HOME/bin:$PATH
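You can then verify which Java the shell picks up (standard commands, nothing Spark-specific):

java -version        # should report 1.8.x
echo $JAVA_HOME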
I had the same error. My troubleshooting procedure was to follow the traceback, which pointed at pyspark/java_gateway.py, line 93, in launch_gateway. In my case the issue was that PySpark had no permission to create some temporary directory, so I just ran my IDE with sudo.
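As an alternative to sudo, here is a minimal sketch that points Spark's scratch space at a directory you own, assuming the temporary-directory permissions are the culprit (/path/you/own is a placeholder):

from pyspark import SparkConf, SparkContext

# Direct Spark's temporary files to a writable location instead of the default.
conf = SparkConf().set("spark.local.dir", "/path/you/own")
sc = SparkContext(conf=conf)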
I got the same "Java gateway process exited ... port number" exception even though I had set PYSPARK_SUBMIT_ARGS properly. I'm running Spark 1.6 and trying to get PySpark to work with IPython4/Jupyter (OS: Ubuntu as a VM guest). While I was getting this exception, I noticed that an hs_err_*.log was generated, and it started with:
There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 715849728 bytes for committing reserved memory.
So I increased the memory allocated to my Ubuntu guest via the VirtualBox settings and restarted it. The Java gateway exception then went away and everything worked out fine.
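If you cannot give the VM more memory, a possible workaround sketch is to request a smaller driver heap through the same variable the question uses; --driver-memory is the standard spark-submit flag, and 512m is only an illustrative value:

export PYSPARK_SUBMIT_ARGS="--master local[2] --driver-memory 512m pyspark-shell"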
I had the same exception, and I tried everything, setting and resetting all environment variables. In the end the issue drilled down to a space in the appName of the Spark session, that is, in the string passed to SparkSession.builder.appName("StreamingDemo").getOrCreate(). Immediately after removing the space from the string given to the appName property, it was resolved. I was using pyspark 2.7 with Eclipse on Windows 10.
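A minimal before/after sketch of that fix ("StreamingDemo" is the answer's own name; the variant with a space is a hypothetical illustration of the failing case):

from pyspark.sql import SparkSession

# spark = SparkSession.builder.appName("Streaming Demo").getOrCreate()  # space in the name: triggered the error
spark = SparkSession.builder.appName("StreamingDemo").getOrCreate()  # no space: works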
I figured out the problem on a Windows system. The Java installation directory must not have blanks in its path, as C:\Program Files does. I re-installed Java in C:\Java, set JAVA_HOME to C:\Java, and the problem went away.
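The same fix can also be applied from Python itself before the context starts, since PySpark reads JAVA_HOME from the environment when launching the gateway (a sketch; C:\Java is the path chosen above):

import os
os.environ["JAVA_HOME"] = "C:\\Java"  # a path with no spaces, per this answer

from pyspark import SparkContext
sc = SparkContext()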
I had this error message running PySpark on Ubuntu and got rid of it by installing the openjdk-8-jdk package. Install OpenJDK 8 (on Ubuntu/Debian via apt):
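sudo apt-get update
sudo apt-get install openjdk-8-jdk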