Pyspark: Exception: Java gateway process exited before sending the driver its port number

Published 2020-01-24 22:59

I'm trying to run pyspark on my MacBook Air. When I try to start it up I get the error:

Exception: Java gateway process exited before sending the driver its port number

when sc = SparkContext() is being called upon startup. I have tried running the following commands:

./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

to no avail. I have also looked here:

Spark + Python - Java gateway process exited before sending the driver its port number?

but the question has never been answered. Please help! Thanks.

25 answers
▲ chillily
#2 · 2020-01-24 23:18

Spark is very picky about the Java version you use. It is highly recommended that you use Java 1.8 (the open-source AdoptOpenJDK 8 works well too). After installing it, set JAVA_HOME in your shell profile if you use Mac/Linux:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)

export PATH=$JAVA_HOME/bin:$PATH
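If your IDE doesn't pick up the shell profile, you can pin the Java version from inside the Python script itself before the SparkContext is created. A minimal sketch follows; the JDK path is an assumption — substitute whatever `/usr/libexec/java_home -v 1.8` prints on your machine:

```python
import os

# Hypothetical Java 8 location -- replace with the output of
# `/usr/libexec/java_home -v 1.8` (macOS) or your distro's JDK path (Linux).
java_home = "/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home"

# JAVA_HOME and PATH must be set *before* SparkContext() is called, because
# that is the moment PySpark launches the Java gateway as a subprocess.
os.environ["JAVA_HOME"] = java_home
os.environ["PATH"] = os.path.join(java_home, "bin") + os.pathsep + os.environ["PATH"]
```

Setting these after the context exists has no effect, which is why export-in-profile is the more common fix.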

Animai°情兽
#3 · 2020-01-24 23:20

I have the same error.

My troubleshooting procedure was:

  1. Check out Spark source code.
  2. Follow the error message. In my case: pyspark/java_gateway.py, line 93, in launch_gateway.
  3. Check the code logic to find the root cause then you will resolve it.

In my case the issue was that PySpark had no permission to create a temporary directory, so I just ran my IDE with sudo.
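Running the whole IDE as root is a heavy hammer. An alternative sketch (`spark.local.dir` is a standard Spark configuration key; the directory itself is one we create here) is to point Spark's scratch space at a directory the current user can already write to:

```python
import os
import tempfile

# Create a scratch directory that is guaranteed writable by the current user.
scratch = tempfile.mkdtemp(prefix="spark-local-")

# spark.local.dir tells Spark where to put its temporary files; passing it
# through PYSPARK_SUBMIT_ARGS means it is in effect before the gateway starts.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--conf spark.local.dir={} pyspark-shell".format(scratch)
)
```

With this in place the SparkContext can be created as usual, without elevated privileges.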

叛逆
#4 · 2020-01-24 23:21

I got the same Java gateway process exited......port number exception even though I set PYSPARK_SUBMIT_ARGS properly. I'm running Spark 1.6 and trying to get pyspark to work with IPython4/Jupyter (OS: Ubuntu as a VM guest).

When I got this exception, I noticed that an hs_err_*.log had been generated, and it started with:

There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 715849728 bytes for committing reserved memory.

So I increased the memory allocated to my Ubuntu guest via the VirtualBox settings and restarted it. The Java gateway exception then went away and everything worked fine.
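If you can't give the VM more RAM, the other direction is to shrink the JVM's appetite. A hedged sketch: cap the driver heap through PYSPARK_SUBMIT_ARGS before creating the SparkContext (512m is an illustrative figure for a small VM, not a recommendation):

```python
import os

# --driver-memory limits the heap the driver JVM tries to reserve at startup.
# If the malloc failure in hs_err_*.log mentions ~700 MB, a smaller cap such
# as 512m may allow the gateway to start inside a memory-constrained guest.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 512m pyspark-shell"
```

As with the other environment-variable fixes, this only takes effect if set before the SparkContext is created.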

倾城 Initia
#5 · 2020-01-24 23:22

I had the same exception, and I tried everything by setting and resetting all the environment variables. In the end the issue drilled down to a space in the appName property of the Spark session, that is, SparkSession.builder.appName("StreamingDemo").getOrCreate(). Immediately after removing the space from the string given to appName, it was resolved. I was using pyspark 2.7 with Eclipse on Windows 10. (The original answer attached two screenshots: the error with the space, and no error without it.)
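If this turns out to be the cause in your setup, a tiny guard (the helper name is ours, not part of PySpark) makes the failure explicit instead of letting it surface as an opaque gateway error:

```python
def safe_app_name(name):
    """Reject app names containing spaces before they are passed to
    SparkSession.builder.appName(...)."""
    cleaned = name.strip()
    if " " in cleaned:
        raise ValueError("app name contains a space: %r" % cleaned)
    return cleaned
```

For example, safe_app_name("StreamingDemo") passes, while "Streaming Demo" raises a ValueError immediately.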

干净又极端
#6 · 2020-01-24 23:23

I figured out the problem on Windows. The Java installation directory must not have spaces in its path, as in C:\Program Files. I re-installed Java in C:\Java, set JAVA_HOME to C:\Java, and the problem went away.
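A quick way to check whether you're affected is sketched below (pure stdlib; the helper name and warning text are ours):

```python
import os

def java_home_has_spaces(env=os.environ):
    """Return True if JAVA_HOME is set and its path contains a space --
    the situation that can break Spark's launcher scripts on Windows."""
    return " " in env.get("JAVA_HOME", "")

if java_home_has_spaces({"JAVA_HOME": r"C:\Program Files\Java\jdk1.8.0"}):
    print(r"JAVA_HOME contains spaces; consider reinstalling to C:\Java")
```

Run it against your real environment by calling java_home_has_spaces() with no arguments.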

够拽才男人
#7 · 2020-01-24 23:24

I had this error message running pyspark on Ubuntu and got rid of it by installing the openjdk-8-jdk package.

from pyspark import SparkConf, SparkContext

# This line raised "Java gateway process exited before sending the driver
# its port number" until the JDK was installed:
sc = SparkContext(conf=SparkConf().setAppName("MyApp").setMaster("local"))

Install Open JDK 8:

apt-get install openjdk-8-jdk-headless -qq