I'm Looking for a way to split the column based on comma delimited data. Below is my dataset
id col1 col2
1 5,6 7,8
I want to get the result
id col1 col2
1 5 7
1 6 8
The position of the index should match because I need to fetch results accordingly.
I tried the below query but it returns the cartesian product.
Query:
SELECT col3, col4
FROM test ext
lateral VIEW explode(split(col1,'\002')) col1 AS col3
lateral VIEW explode(split(col2,'\002')) col2 AS col4
Result:
id col1 col2
1 5 7
1 5 8
1 6 7
1 6 8
You can use
posexplode()
to create position index columns for your split arrays. Then, select only those rows where the position indices are equal.Output:
Reference: Hive language manual - posexplode()