From this data frame:
|store|            values|
|    1|[1, 2, 3, 4, 5, 6]|
|    2|            [2, 3]|
I would like to apply Python's collections.Counter to each values array to get this:
|store|values                        |
|    1|{1:1, 2:1, 3:1, 4:1, 5:1, 6:1}|
|    2|{2:1, 3:1}                    |
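To make the target concrete, here is a minimal plain-Python sketch (no Spark involved) of the per-row transformation I am after. The names `rows` and `count_values` are made up for illustration and are not part of my real code; the list of tuples just stands in for the data frame. I assume a Spark UDF wrapping something like this would need a map-like return type rather than an array, but I have not confirmed that.

```python
from collections import Counter

# Hypothetical stand-in for the data frame rows: (store, values) pairs.
rows = [(1, [1, 2, 3, 4, 5, 6]), (2, [2, 3])]


def count_values(values):
    """Return a plain dict mapping each value to its number of occurrences."""
    # Counter is a dict subclass; converting to dict keeps the result simple.
    return dict(Counter(values))


# Apply the counting function to the values of every store.
result = [(store, count_values(values)) for store, values in rows]

for store, counts in result:
    print(store, counts)
```

Each store keeps its key and its list of values is replaced by a value-to-count mapping, which is the shape shown in the second table.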
I got this data frame using an answer to another question:
GroupBy and concat array columns pyspark
So I tried to modify the code from those answers like this:
Option 1:
def flatten_counter(val):
    return Counter(reduce(lambda x, y: x + y, val))

udf_flatten_counter = sf.udf(flatten_counter, ty.ArrayType(ty.IntegerType()))
df3 ="store", flatten_counter("values2").alias("values3"))
Option 2:
r: (, r.values)).reduceByKey(lambda x, y: x + y).map(lambda row: Counter(row[1])).toDF(['store', 'values']).show()
But neither option works.
Does anybody know how I can do this?
Thank you