QUESTION:
FIX:
To fix this issue I edited my ~/.bash_profile to make Java 1.8 the global default, as follows:
touch ~/.bash_profile; open ~/.bash_profile
Then I added
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
and saved the file in TextEdit.
UPDATE
Due to license changes at Oracle, the fix above might not work and you may run into issues installing via brew. To install Java 8 you may need to follow this guide.
QUESTION:
I'm trying to install Spark on my Mac. I've used home-brew to install spark 2.4.0 and Scala. I've installed PySpark in my anaconda environment and am using PyCharm for development. I've exported to my bash profile:
export SPARK_VERSION=`ls /usr/local/Cellar/apache-spark/ | sort | tail -1`
export SPARK_HOME="/usr/local/Cellar/apache-spark/$SPARK_VERSION/libexec"
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
However, I'm unable to get it to work.
From reading the traceback, I suspect the problem is the Java version. I would really appreciate some help fixing the issue. Please comment if there is any information I can provide beyond the traceback.
I am getting the following error:
Traceback (most recent call last):
File "<input>", line 4, in <module>
File "/anaconda3/envs/coda/lib/python3.6/site-packages/pyspark/rdd.py", line 816, in collect
sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException: Unsupported class file major version 55
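The last line of the traceback is the key: a class-file major version maps directly to the Java release that compiled it (major = Java version + 44), so 55 means some class was built for Java 11, while Spark 2.4 expects Java 8 class files (major 52). A small illustrative sketch of that mapping (the helper name is mine, not part of Spark or py4j):

```python
def classfile_major_to_java(major: int) -> int:
    """Map a JVM class-file major version to the Java release that produced it.

    The class-file format fixes major = Java version + 44 (e.g. Java 8 -> 52).
    """
    return major - 44

print(classfile_major_to_java(55))  # the version from the traceback: Java 11
print(classfile_major_to_java(52))  # what Spark 2.4 expects: Java 8
```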
ANSWER 1:
Until Spark supports Java 11 (which will hopefully be noted in the latest documentation when it does), you have to add a flag to set your Java version to Java 8.
As of Spark 2.4.x
Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.4 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)
On a Mac, I am able to do this in my .bashrc:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
You can also set this in spark-env.sh rather than setting the variable for your whole profile.
And you'll need to install Java 8 in addition to your existing Java 11.
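Before launching Spark you can sanity-check JAVA_HOME from Python. This is only a rough sketch: the substring patterns are my assumptions about common Java 8 install paths, not an official check:

```python
import os

def looks_like_java8(java_home: str) -> bool:
    """Heuristic: Java 8 install paths usually contain '1.8', '-8-', or 'jdk8'."""
    return any(tag in java_home for tag in ("1.8", "-8-", "jdk8"))

java_home = os.environ.get("JAVA_HOME", "")
if not looks_like_java8(java_home):
    print(f"Warning: JAVA_HOME={java_home!r} does not look like a Java 8 install")
```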
ANSWER 2:
I ran into this issue when running Jupyter Notebook and Spark using Java 11. I installed and configured for Java 8 using the following steps.
Install Java 8:
$ sudo apt install openjdk-8-jdk
Since I had already installed Java 11, I then set my default Java to version 8 using:
$ sudo update-alternatives --config java
Select Java 8 and then confirm your changes:
$ java -version
Output should be similar to:
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
I'm now able to run Spark successfully in Jupyter Notebook. The steps above were based on the following guide: https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-on-ubuntu-18-04
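The version check above can also be scripted. A sketch that parses the first line of `java -version` output (the regex and sample strings are mine; real output formats vary slightly across vendors):

```python
import re

def is_java8(version_output: str) -> bool:
    """Return True if `java -version` output reports a 1.8.x runtime."""
    match = re.search(r'version "(\d+)\.(\d+)', version_output)
    return bool(match) and match.group(1, 2) == ("1", "8")

print(is_java8('openjdk version "1.8.0_191"'))          # True
print(is_java8('openjdk version "11.0.1" 2018-10-16'))  # False
```

In real use you would feed this the stderr of `java -version` (the JVM prints its version banner to stderr, not stdout).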
ANSWER 3:
With PyCharm, I found the easiest solution was to set the Spark location via findspark and Java 8 via os.environ at the beginning of the script:
import findspark
import os
spark_location='/opt/spark-2.4.3/' # Set your own
java8_location= '/usr/lib/jvm/java-8-openjdk-amd64' # Set your own
os.environ['JAVA_HOME'] = java8_location
findspark.init(spark_home=spark_location)
ANSWER 4:
On Windows (Windows 10) you can solve the issue by installing jdk-8u201-windows-x64.exe and pointing the JAVA_HOME system environment variable at the correct JDK:
JAVA_HOME -> C:\Program Files\Java\jdk1.8.0_201.
Don't forget to restart the terminal, otherwise the change to the environment variable does not take effect.
ANSWER 5:
The problem here is that PySpark requires Java 8 for some functionality. Spark 2.2.1 had problems with Java 9 and beyond. The recommended solution is to install Java 8, set it as your default Java, and try again.
To install Java 8:
sudo apt install openjdk-8-jdk
To change the default Java version, use the command
update-java-alternatives --list
to list all available Java versions.
Set a default one by running:
sudo update-alternatives --config java
and entering the number of the Java version you want from the list.
Then check your Java version with java -version
and it should be updated. Set the JAVA_HOME variable as well.
To set JAVA_HOME, you must find the specific Java version and folder. Follow this SO discussion to get a full idea of setting the JAVA_HOME variable. Since we are going to use Java 8, our folder path is /usr/lib/jvm/java-8-openjdk-amd64/. Just go to the /usr/lib/jvm folder and check which folders are available; use ls -l to see the folders and their symlinks, since these folders can be shortcuts for some Java versions. Then go to your home directory and edit the .bashrc file:
cd ~
gedit .bashrc
Then add the lines below to the file, save, and exit.
## SETTING JAVA HOME
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:$JAVA_HOME/bin
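The selection logic above (list the alternatives, pick the Java 8 one, point JAVA_HOME at it) can be sketched in Python; the sample paths are assumptions modelled on Ubuntu's layout:

```python
def pick_java8(alternative_paths):
    """Return the first path that looks like a Java 8 install, or None."""
    for path in alternative_paths:
        if "java-8" in path or "1.8" in path:
            return path
    return None

# Paths as `update-java-alternatives --list` might report them (illustrative):
paths = [
    "/usr/lib/jvm/java-11-openjdk-amd64",
    "/usr/lib/jvm/java-8-openjdk-amd64",
]
print(pick_java8(paths))  # /usr/lib/jvm/java-8-openjdk-amd64
```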
ANSWER 6:
For Debian 10 'buster' users, the Java 8 JRE is available in the nvidia-openjdk-8-jre package.
Install it with
sudo apt install nvidia-openjdk-8-jre
Then set JAVA_HOME when running pyspark, e.g.:
JAVA_HOME=/usr/lib/jvm/nvidia-java-8-openjdk-amd64/ pyspark
ANSWER 7:
I had the same issue on Windows, and I added JAVA_HOME to the environment variable path:
JAVA_HOME: C:\Program Files\Java\jdk-11.0.1