How to transfer a table from HBase to Hive?

Posted 2019-03-06 09:19

How can I transfer an HBase table into Hive correctly?

What I tried before is described in this question: How insert overwrite table in hive with diffrent where clauses? (I created one table to import all the data. The problem is that the data is still in rows and not in columns. So I created 3 tables, for news, social and all, each with a specific where clause. After that I ran 2 joins on those tables, which gives me the result table. So I ended up with 6 tables in total, which does not perform well!)

To sum up my problem: in HBase the column families are stored as rows, like this:

count   verpassen   news    1
count   verpassen   social  0
count   verpassen   all     1

What I want to achieve in Hive is a data structure like this:

name      news    social   all
verpassen 1       0        1

How am I supposed to do this?

1 answer
趁早两清
Post #2 · 2019-03-06 10:00

Below is an approach you can use.

Use the HBase storage handler to create the table in Hive.

Example script (for an HBase table that already exists, use CREATE EXTERNAL TABLE instead of CREATE TABLE):

CREATE TABLE hbase_table_1(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,f1:val")
TBLPROPERTIES ("hbase.table.name" = "test");

I loaded the sample data you gave into a Hive external table.

select name,collect_set(concat_ws(',',type,val)) input from TESTTABLE group by name ;

I am grouping the data by name. The output of the query above has one row per name, with the input column holding the collected type,val pairs, for example ["all,1","social,0","news,1"].
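To make the grouping step concrete, here is a small Python sketch that imitates what the query does to the sample rows from the question. It is only an illustration, not Hive itself: the names ROWS and group_pairs are made up for this example, and the order of the pairs that Hive's collect_set returns is not guaranteed to match.

```python
from collections import defaultdict

# Sample rows from the question as (name, type, val) tuples.
ROWS = [
    ("verpassen", "news", "1"),
    ("verpassen", "social", "0"),
    ("verpassen", "all", "1"),
]


def group_pairs(rows):
    """Imitate: select name, collect_set(concat_ws(',', type, val)) ... group by name."""
    grouped = defaultdict(list)
    for name, type_, val in rows:
        pair = ",".join((type_, val))      # concat_ws(',', type, val)
        if pair not in grouped[name]:      # collect_set keeps each value only once
            grouped[name].append(pair)
    return dict(grouped)


print(group_pairs(ROWS))
```

Each name ends up with a single list of "type,val" strings, which is the shape the mapper in the next step receives.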

Next I wrote a custom mapper which takes the input column as its input parameter and emits the values.

from (select '["all,1","social,0","news,1"]' input from TESTTABLE group by name) d MAP d.input Using 'python test.py' as all,social,news
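The answer does not show test.py itself. Below is a minimal sketch of what such a mapper could look like, assuming Hive hands each input value to the script as one line on stdin in the JSON-like form used in the query above, and reads tab-separated columns back from stdout. The names COLUMNS, pivot and main are made up for this illustration, and filling a missing type with 0 is an assumption, not something the answer states.

```python
import json
import sys

# Output order must match the column list of the MAP ... AS clause.
COLUMNS = ("all", "social", "news")


def pivot(line):
    """Turn '["all,1","social,0","news,1"]' into tab-separated values for all, social, news."""
    pairs = json.loads(line)                               # list of "type,val" strings
    values = dict(pair.split(",", 1) for pair in pairs)    # {"all": "1", ...}
    # Assumption: a type missing from the input is emitted as "0".
    return "\t".join(values.get(col, "0") for col in COLUMNS)


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one input value per line and write one tab-separated row per line."""
    for line in stdin:
        line = line.strip()
        if line:
            stdout.write(pivot(line) + "\n")

# When this is saved as test.py, end the file with a call to main().
```

If the column passed to the script is a real array produced by collect_set rather than a string literal, the text Hive sends may be formatted differently, so the parsing line in pivot would need to be adjusted to match.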

Alternatively, you can use the output to insert into another table which has the column names name, all, social, news.

Hope this helps
