PySpark import error in Python 2.7.x

Posted 2019-09-15 02:21

Question:

I am trying to use PySpark 2.0.2 (the hadoop2.7 build) with Python 2.7.x, but importing it fails.

Code:

import os
import sys

# Point Spark at the local installation and expose its Python sources.
os.environ['SPARK_HOME'] = "C:/Apache/spark-2.0.2-bin-hadoop2.7"
sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    print("Successful")
except ImportError as e:
    # Print the underlying error so the missing module is visible.
    print("Cannot import PYspark module", e)
    sys.exit(1)

When I run this code, it prints the "Cannot import PYspark module" message.

Thanks

Answer 1:

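Extend the Python path with both pyspark and py4j. pyspark imports py4j, which ships as a zip under python/lib, so the import fails if that zip is not on the path. For Spark 2.0.2 the entries are: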

sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip")
sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip")
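
The py4j version in the file name changes between Spark releases, so the hardcoded path above only fits 2.0.2. A minimal sketch of looking the zip up by pattern instead is below. The helper names spark_python_paths and add_spark_to_path are made up for illustration and are not part of PySpark, and the sketch assumes the usual python/lib layout of a Spark binary distribution.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the sys.path entries PySpark needs (hypothetical helper)."""
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    # Match the py4j zip by pattern because its version varies per Spark release.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not py4j_zips:
        raise IOError("no py4j-*-src.zip found in " + lib_dir)
    return [python_dir, os.path.join(lib_dir, "pyspark.zip"), py4j_zips[-1]]


def add_spark_to_path(spark_home):
    """Set SPARK_HOME and append any missing PySpark entries to sys.path."""
    os.environ["SPARK_HOME"] = spark_home
    for entry in spark_python_paths(spark_home):
        if entry not in sys.path:
            sys.path.append(entry)
```

Calling add_spark_to_path("C:/Apache/spark-2.0.2-bin-hadoop2.7") before "from pyspark import SparkContext" should then have the same effect as the two sys.path.append lines above, assuming the installation lives at that path.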