In Apache Pig I want to serialise columns held in a variable into rows. More specifically:
The data, loaded into the variable, look (via DUMP) like
(val1a, val2a,.... )
(val1b, val2b,val3b,.... )
(val1c, val2c,.... )
.
.
.
and I want to transform this into
(val1a)
(val2a)
.
.
.
(val1b)
(val2b)
(val3b)
.
.
.
(val1c)
(val2c)
.
.
.
So each column value has to be "serialised" into its own row, and the resulting rows are appended one after another in order. Please note: I do not necessarily know how many columns each row has.
How can I do this in Pig Latin? It would be easy in, e.g., Python, but I don't know how to do it in Pig. I tried different foreach ... generate constructs, but could not make it work.
One way to unfold tuples and create multiple tuples, each containing one field:
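A minimal sketch of this idea, assuming the data sit in a comma-delimited file (the path 'input.txt' and the ',' delimiter are placeholders, not taken from the question) and that no schema is declared on load, so rows may carry different numbers of fields:

```pig
-- Assumption: 'input.txt' and the ',' delimiter are placeholders for your own data.
-- No schema is declared, so each row may have a different number of fields.
A = LOAD 'input.txt' USING PigStorage(',');

-- TOBAG(*) wraps every field of the row in its own one-field tuple and
-- collects those tuples in a bag; FLATTEN then un-nests the bag,
-- emitting one output row per field.
B = FOREACH A GENERATE FLATTEN(TOBAG(*));

DUMP B;
```

Because TOBAG takes however many fields the row has, the number of columns does not need to be known in advance. Note that a bag is formally unordered, so the field order within each original row should be treated as likely rather than guaranteed.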
Note: You might also check these similar posts:
Splitting a tuple into multiple tuples in Pig
Pivot table with Apache Pig