How can I use collect_set or collect_list on a dataframe after groupby? For example: df.groupby('key').collect_set('values'). I get an error: AttributeError: 'GroupedData' object has no attribute 'collect_set'
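You need to use agg. collect_set and collect_list are aggregate functions in pyspark.sql.functions, not methods of the GroupedData object that groupby returns, so they have to be passed to agg. Example: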
Note that on Spark 1.x you have to create a HiveContext for collect_set and collect_list to be available. See https://stackoverflow.com/a/35529093/690430 for dealing with different Spark versions.