How to load a flat file (not a delimited file)

Posted 2019-09-09 20:43

I am new to HBase and I have a flat file (not a delimited file) that I would like to load into a single HBase table.

Here is a preview of a row in my file:

0107E07201512310015071C11100747012015123100

I know, for example, that positions 1 to 7 hold an ID and positions 7 to 15 hold a date, and so on.

The problem is how to build a schema that corresponds to my file. Alternatively, is there a way to convert it to a delimited file, or to read such a file using Jaql? I'm working with InfoSphere BigInsights.

Any help would be greatly appreciated.

Thanks in advance.

2 answers
我欲成王,谁敢阻挡
Post #2 · 2019-09-09 21:18

You can write a custom SerDe to deserialize the file into a Hive table, then use Hive to export the data to HBase.

走好不送
Post #3 · 2019-09-09 21:29

Create a Hive table using RegexSerDe:

-- Each capture group in input.regex maps to one column, in order.
CREATE EXTERNAL TABLE testtable (col1 STRING, col2 STRING, col3 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.{5})(.{6})(.{3}).*")
LOCATION '<hdfs-file-location>';
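
If the contrib SerDe jar is not available on your cluster, roughly the same split can be sketched with plain substr() calls over a one-column table. This is only an illustration: the table name raw_lines is made up, and the widths 5, 6 and 3 simply copy the regex above rather than the real layout of your file.

```sql
-- Hypothetical staging table: with the default text format, a line that
-- contains no field delimiter lands entirely in the single STRING column.
CREATE EXTERNAL TABLE raw_lines (line STRING)
LOCATION '<hdfs-file-location>';

-- substr() in Hive is 1-based: substr(str, start, length).
SELECT substr(line, 1, 5)  AS col1,
       substr(line, 6, 6)  AS col2,
       substr(line, 12, 3) AS col3
FROM raw_lines;
```

Adjust the start positions and lengths to the actual field boundaries of your records.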

You can then create a Hive table that points to HBase. The instructions are here: http://hortonworks.com/blog/hbase-via-hive-part-1/

Finally, you can use INSERT OVERWRITE TABLE to load the data from the Hive table into the HBase table: https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-SELECTSandFILTERS
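
As a rough sketch of those last two steps, assuming the Hive HBase storage handler is on the classpath: the table name hbase_testtable and the column family cf are made-up names for illustration, and col1 is assumed to be a suitable row key.

```sql
-- Hive table backed by HBase; the first mapping entry (:key) is the row key,
-- the others are stored as <column family>:<qualifier>.
CREATE TABLE hbase_testtable (key STRING, col2 STRING, col3 STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col2,cf:col3")
TBLPROPERTIES ("hbase.table.name" = "hbase_testtable");

-- Copy the parsed rows from the RegexSerDe table into the HBase-backed table.
INSERT OVERWRITE TABLE hbase_testtable
SELECT col1, col2, col3 FROM testtable;
```

Rows that share the same key overwrite each other in HBase, so pick a field (or a concatenation of fields) that is unique per record.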
