Pyspark error - Unsupported class file major versi

Published 2020-01-23 07:07



To fix this issue, I edited my .bash_profile so that Java 1.8 is used as the global default. I opened the file with:

touch ~/.bash_profile; open ~/.bash_profile


added the following line:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)

and saved the file in TextEdit.


Due to license changes from Oracle, the above fix might not work, and you may encounter issues installing via brew. To install Java 8 you may need to follow this guide.


I'm trying to install Spark on my Mac. I've used Homebrew to install Spark 2.4.0 and Scala. I've installed PySpark in my Anaconda environment and am using PyCharm for development. I've exported the following in my bash profile:

export SPARK_VERSION=`ls /usr/local/Cellar/apache-spark/ | sort | tail -1`
export SPARK_HOME="/usr/local/Cellar/apache-spark/$SPARK_VERSION/libexec"

However, I'm unable to get it to work.

From reading the traceback, I suspect this is due to the Java version. I would really appreciate some help fixing the issue. Please comment if there is any information beyond the traceback that I could provide.

I am getting the following error:

Traceback (most recent call last):
  File "<input>", line 4, in <module>
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/pyspark/", line 816, in collect
    sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/", line 1257, in __call__
    answer, self.gateway_client, self.target_id,
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException: Unsupported class file major version 55
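For reference, the number in the message is a Java class file major version. Major version 52 corresponds to Java 8 and 55 to Java 11, which is consistent with the suspicion that a Java 11 runtime is involved. A minimal sketch of the mapping (the function name is only illustrative, not part of Spark or py4j):

```python
def java_release_for_major(major: int) -> int:
    """Map a class file major version to the Java release that introduced it.

    Covers Java 5 (major version 49) and later, where release = major - 44.
    """
    if major < 49:
        raise ValueError("mapping only covers Java 5 (major version 49) and later")
    return major - 44


print(java_release_for_major(55))  # 11
print(java_release_for_major(52))  # 8
```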


Until Spark supports Java 11 (which will hopefully be mentioned in the latest documentation once it does), you have to add a flag to set your Java version to Java 8.

As of Spark 2.4.x

Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.4 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)

On a Mac, I am able to do this in my .bashrc:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)

You can also set this in rather than set the variable for your whole profile.

You'll also need to install Java 8 in addition to your existing Java 11.


I ran into this issue when running Jupyter Notebook and Spark using Java 11. I installed Java 8 and configured my system to use it with the following steps.

Install Java 8:

$ sudo apt install openjdk-8-jdk

Since I had already installed Java 11, I then set my default Java to version 8 using:

$ sudo update-alternatives --config java

Select Java 8 and then confirm your changes:

$ java -version

Output should be similar to:

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
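The same check can be done from Python, for example at the top of a notebook before starting Spark. This is only a sketch: the function names are made up for illustration, and it assumes the version line looks like the output shown above (Java 8 and earlier report themselves as 1.x, later releases as 9, 11.0.1, and so on).

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java release from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in java -version output")
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier use the 1.x scheme, e.g. "1.8.0_191".
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major() -> int:
    """Run `java -version` (which prints to stderr) and parse the result."""
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_java_major(result.stderr)
```

Calling installed_java_major() should return 8 once the default has been switched; with the Java 11 default it would return 11.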

I'm now able to run Spark successfully in Jupyter Notebook. The steps above were based on the following guide:


With PyCharm, I found the easiest solution was to add the Spark location through findspark and set Java 8 with os at the beginning of the script:

import os
import findspark

spark_location = '/opt/spark-2.4.3/'  # Set your own
java8_location = '/usr/lib/jvm/java-8-openjdk-amd64'  # Set your own

# Point this process at Java 8 before any JVM is started.
os.environ['JAVA_HOME'] = java8_location
# Make the Spark installation importable as pyspark.
findspark.init(spark_home=spark_location)


On Windows (Windows 10), you can solve the issue by installing jdk-8u201-windows-x64.exe and resetting the JAVA_HOME system environment variable to the correct version of the Java JDK:

JAVA_HOME -> C:\Program Files\Java\jdk1.8.0_201

Don't forget to restart the terminal; otherwise the change to the environment variable does not take effect.


The problem here is that PySpark requires Java 8 for some functions. Spark 2.2.1 had problems with Java 9 and beyond. The recommended solution was to install Java 8.

You can install Java 8 specifically, set it as your default Java, and try again.

To install Java 8:

sudo apt install openjdk-8-jdk

To change the default Java version, follow this. You can use the command

update-java-alternatives --list

to list all available Java versions.

Set a default one by running the command:

sudo update-alternatives --config java

Select the Java version you want by entering its number from the list shown. Then check your Java version with java -version; it should be updated. Set the JAVA_HOME variable as well.

To set JAVA_HOME, you must find the specific Java version and its folder. Follow this SO discussion to get a full idea of how to set the JAVA_HOME variable. Since we are going to use Java 8, our folder path is /usr/lib/jvm/java-8-openjdk-amd64/. Just go to the /usr/lib/jvm folder and check which folders are available. Use ls -l to see the folders and their symlinks, since some of these folders can be shortcuts to other Java versions. Then go to your home directory and edit the .bashrc file:

cd ~
gedit .bashrc

Then add the lines below to the file, save, and exit.

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
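If you would rather inspect /usr/lib/jvm from Python than with ls -l, something like the following sketch lists each entry together with the real directory it resolves to, so symlinked names become visible. The function name is hypothetical, and /usr/lib/jvm is assumed to be the JVM directory, as on Debian/Ubuntu.

```python
import os


def list_jvm_dirs(jvm_root="/usr/lib/jvm"):
    """Return {entry name: resolved real path} for each directory under jvm_root."""
    result = {}
    for name in sorted(os.listdir(jvm_root)):
        path = os.path.join(jvm_root, name)
        # os.path.isdir follows symlinks, so linked JVM folders are included.
        if os.path.isdir(path):
            result[name] = os.path.realpath(path)
    return result
```

Entries whose resolved path differs from their own location are symlinks; the resolved path is the one to use for JAVA_HOME.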


For Debian 10 'buster' users, Java 8 JRE is available in the nvidia-openjdk-8-jre package.

Install it with

sudo apt install nvidia-openjdk-8-jre

Then set JAVA_HOME when running pyspark, e.g.:

JAVA_HOME=/usr/lib/jvm/nvidia-java-8-openjdk-amd64/ pyspark
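The same per-command override can be sketched from Python when pyspark is launched as a subprocess. The helper name below is made up for illustration; it only builds the environment and changes nothing globally.

```python
import os


def env_with_java_home(java_home, base_env=None):
    """Return a copy of the environment with JAVA_HOME set and its bin first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = java_home
    java_bin = os.path.join(java_home, "bin")
    existing_path = env.get("PATH", "")
    env["PATH"] = java_bin + os.pathsep + existing_path if existing_path else java_bin
    return env


# Example (not run here): start pyspark with Java 8 for this one process only.
# import subprocess
# subprocess.run(["pyspark"],
#                env=env_with_java_home("/usr/lib/jvm/nvidia-java-8-openjdk-amd64/"))
```

Because the environment is copied, the JAVA_HOME of the calling shell and of other programs stays untouched.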


I have the same issue on Windows, and I have added JAVA_HOME to the environment variables:

JAVA_HOME: C:\Program Files\Java\jdk-11.0.1