Is Hive's collect_list ordered?

Published 2019-04-11 21:11

Question:

This page says of collect_list:

Returns a list of objects with duplicates.

Is that list ordered, for example in the order of the query results?

Answer 1:

The built-in collect_list isn't guaranteed to return its elements in any particular order, even if you do an order by first (and even if that did preserve order, sorting the whole result set just for this would be a waste of time). Use Brickhouse's collect instead; it ensures the elements are ordered.



Answer 2:

It's correct that collect_list isn't guaranteed to be ordered. If ordering by the elements' own values is acceptable, the function sort_array will sort the collected array in ascending order:

   select a, b, sort_array(collect_list(c)) as sorted_c
   from the_table
   group by a, b
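
If the order you need comes from another column rather than from the values of c themselves, one possible sketch is to collect structs whose first field is the sort key and sort those. This assumes a hypothetical ordering column ts in the_table (it is not part of the original question) and a Hive version whose sort_array accepts arrays of structs; older releases only accept arrays of primitive types, so check yours before relying on it.

   -- ts is a hypothetical ordering column, assumed for illustration only
   -- structs should compare field by field, so ts (listed first) drives the order
   select a, b,
          sort_array(collect_list(named_struct('ts', ts, 'c', c))) as c_by_ts
   from the_table
   group by a, b

Each element of c_by_ts is a struct, not a bare c value, so the c field still has to be read out of each element afterwards.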


Tags: hive hiveql