hive - split a row into multiple rows between the

2019-07-29 22:39发布

I have a table below and would like to split the rows by the range from start to end columns.

i.e id and value should repeat for each value between start & end(both inclusive)

--------------------------------------
id      | value   | start     | end
--------------------------------------
1       | 5       | 1         | 4
2       | 8       | 5         | 9
--------------------------------------

Desired output

--------------------------------------
id      | value   | current
--------------------------------------
1       | 5       | 1
1       | 5       | 2
1       | 5       | 3
1       | 5       | 4

2       | 8       | 5
2       | 8       | 6
2       | 8       | 7
2       | 8       | 8
2       | 8       | 9
--------------------------------------

I can write my own UDF in java/python to get this result but would like to check if I can implement in Hive SQL using any existing hive UDFs

Thanks in advance.

1条回答
疯言疯语
2楼-- · 2019-07-29 22:53

This can be accomplished with a recursive common table expression, which Hive doesn't support.

One option is to create a table of numbers and use it to generate rows between start and end.

create table numbers
location 'hdfs_location' as 
select row_number() over(order by somecolumn) as num
from some_table --this can be any table with the desired number of rows 
;

--Join it with the existing table 
select t.id,t.value,n.num as current
from tbl t
join numbers n on n.num>=t.start and n.num<=t.end
查看更多
登录 后发表回答