Question:

This is the snippet:

from pyspark import SparkContext
from pyspark.sql.session import SparkSession

sc = SparkContext()
spark = SparkSession(sc)
d = spark.read.format("csv").option("header", True).option("inferSchema", True).load('file.csv')
d.show()

Running this fails with the following error:

An error occurred while calling o163.showString. Trace:
py4j.Py4JException: Method showString([class java.lang.Integer, class java.lang.Integer, class java.lang.Boolean]) does not exist

All the other methods work fine. I have researched this a lot, but in vain. Any lead will be highly appreciated.

Answer 1:

This indicates a Spark version mismatch. Before Spark 2.3, the show method took only two arguments:

def show(self, n=20, truncate=True):

Since 2.3 it takes three arguments:

def show(self, n=20, truncate=True, vertical=False):

In your case the Python client seems to invoke the latter, while the JVM backend runs the older version, which has no showString method accepting three arguments.
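The difference between the two signatures can be checked with the standard inspect module. This is a minimal sketch: show_before_2_3 and show_since_2_3 are hypothetical stand-ins that only mirror the two signatures quoted above, and accepts_parameter is an illustrative helper, not part of PySpark. With PySpark installed, the same helper could presumably be pointed at pyspark.sql.DataFrame.show.

```python
import inspect


def accepts_parameter(func, name):
    """Return True if func declares a parameter called name."""
    return name in inspect.signature(func).parameters


# Stand-ins that only mirror the two signatures quoted above (hypothetical names).
def show_before_2_3(self, n=20, truncate=True):
    pass


def show_since_2_3(self, n=20, truncate=True, vertical=False):
    pass


print(accepts_parameter(show_before_2_3, "vertical"))  # False
print(accepts_parameter(show_since_2_3, "vertical"))   # True
```

If the installed Python library accepts vertical, it is 2.3 or later and will pass three arguments to the JVM.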

SparkContext initialization underwent significant changes in 2.4, so a 2.4 Python library would already have failed in SparkContext.__init__. You are therefore likely using:

2.3.x Python library.
2.2.x JARs.

You can confirm that by checking versions directly from your session, Python:

sc.version

vs. JVM:

sc._jsc.version()
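The comparison can also be scripted. The following is a sketch under assumptions: major_minor, versions_match, and check_spark_versions are hypothetical helper names, and the Python side is read from pyspark.__version__ because, as far as I can tell, sc.version in PySpark is itself fetched from the JVM and may therefore report the same value as sc._jsc.version().

```python
def major_minor(version):
    """Return (major, minor) from a version string such as '2.3.1'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def versions_match(python_version, jvm_version):
    """True when the Python library and the JVM backend share major.minor."""
    return major_minor(python_version) == major_minor(jvm_version)


def check_spark_versions(sc):
    """Compare the installed PySpark package with the JVM behind sc."""
    import pyspark  # imported lazily so the helpers above work without Spark

    python_version = pyspark.__version__  # version of the Python package
    jvm_version = sc._jsc.version()       # version reported by the JVM backend
    print("Python library:", python_version)
    print("JVM backend:   ", jvm_version)
    return versions_match(python_version, jvm_version)
```

Calling check_spark_versions(sc) in the failing session should return False if the mismatch described above is the cause.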

Problems like this are usually the result of a misconfigured PYTHONPATH (set either directly or by installing PySpark with pip on top of pre-existing Spark binaries) or SPARK_HOME.
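To see which settings are in play, it may help to print where Python would load pyspark from next to the two environment variables. This is only a sketch; spark_environment is a hypothetical helper, and it uses nothing beyond the standard library.

```python
import importlib.util
import os


def spark_environment(env=None):
    """Collect the settings that decide which Spark installation is used."""
    env = os.environ if env is None else env
    # find_spec locates the package without importing it; None if not installed.
    spec = importlib.util.find_spec("pyspark")
    return {
        "pyspark_location": spec.origin if spec else None,
        "SPARK_HOME": env.get("SPARK_HOME"),
        "PYTHONPATH": [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p],
    }


if __name__ == "__main__":
    for key, value in spark_environment().items():
        print(key, "=", value)
```

If pyspark_location points into a pip site-packages directory while SPARK_HOME points at a separate Spark installation, the two may well be different versions.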

Answer 2:

In the spark-shell console, enter the variable name to see its data type. Alternatively, type the variable name followed by a dot and press Tab twice; the shell will list the methods that can be applied to it. Example output for a DataFrame object:

res23: org.apache.spark.sql.DataFrame = [order_id: string, book_name: string ... 1 more field]
