A PySpark DataFrame column whose name contains a dot (e.g. "id.orig_h") cannot be used in groupBy unless the column is first renamed with withColumnRenamed. Is there a workaround? Backticks ("`a.b`") don't seem to solve it.
Answer 1:
In my PySpark shell, the following snippets work:

from pyspark.sql.functions import col
myCol = col("`id.orig_h`")
result = df.groupBy(myCol).agg(...)

and

myCol = df["`id.orig_h`"]
result = df.groupBy(myCol).agg(...)
I hope it helps.