I have two arrays of strings in Hive, like
{'value1','value2','value3'}
{'value1', 'value2'}
I want to merge the arrays without duplicates. Expected result:
{'value1','value2','value3'}
How can I do it in Hive?
A native solution could be the following:
First explode both arrays with LATERAL VIEW, then group by and remove duplicates with collect_set.
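A minimal sketch of that approach, assuming a hypothetical table my_table with a key column id and two array<string> columns arr1 and arr2 (none of these names come from the question):

```sql
-- Hypothetical schema: my_table(id, arr1 array<string>, arr2 array<string>)
SELECT id,
       collect_set(val) AS merged        -- collect_set drops duplicate values
FROM (
    -- one row per element of the first array
    SELECT id, val
    FROM my_table
    LATERAL VIEW explode(arr1) a AS val
    UNION ALL
    -- one row per element of the second array
    SELECT id, val
    FROM my_table
    LATERAL VIEW explode(arr2) b AS val
) u
GROUP BY id;
```

Two things to keep in mind: collect_set does not guarantee the order of elements in the result, and a row whose arrays are both empty or NULL produces no exploded rows, so its id will be missing from the output.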
You will need a UDF for this. Klout has a bunch of open-source Hive UDFs under the package brickhouse. Here is the github link. They have a bunch of UDFs that serve exactly your purpose. Download, build, and add the JAR. Here is an example
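A sketch of how this might look with Brickhouse. The JAR path and the table/column names (my_table, id, arr1, arr2) are hypothetical, and the class names are given as I recall them from the Brickhouse sources, so please check them against the JAR you actually build:

```sql
-- Hypothetical path: point this at the JAR you built
ADD JAR /path/to/brickhouse.jar;

-- Class names assumed from the Brickhouse project; verify against your build
CREATE TEMPORARY FUNCTION combine AS 'brickhouse.udf.collect.CombineUDF';
CREATE TEMPORARY FUNCTION combine_unique AS 'brickhouse.udf.collect.CombineUniqueUDAF';

-- combine concatenates the two arrays of a row;
-- combine_unique is an aggregate that merges arrays and drops duplicates
SELECT id,
       combine_unique(combine(arr1, arr2)) AS merged
FROM my_table
GROUP BY id;
```

Since combine_unique is an aggregate function, the GROUP BY on the key column keeps one merged array per original row.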