Bulk load to multiple HBase tables in single job

Posted 2019-09-05 06:42

I want to bulk load data into multiple tables using a single MapReduce job. Since the data volume is high, it would be time-consuming to iterate through the dataset twice and load it using multiple jobs. Is there any way to do this? Thanks in advance.

1 answer
爷的心禁止访问
#2 · 2019-09-05 07:20

I use HBase, but I haven't needed bulk load yet. I did come across this section of the HBase book, which might help you.

http://hbase.apache.org/book/arch.bulk.load.html

The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly loads the generated StoreFiles into a running cluster. Using bulk load will use less CPU and network resources than simply using the HBase API.
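To make the "one pass, several tables" idea concrete, here is a minimal sketch in plain Java. It is not HBase API code and I have not run it against a cluster. The input format (`userId,orderId,amount`), the table names `users` and `orders`, and the class and method names are all hypothetical. It only shows the routing step: each input record is read once and produces one row per target table, and the rows are kept sorted by row key per table, which is the order HFiles need.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SinglePassMultiTable {

    // Reads every input line once and groups the derived rows by target table.
    // The table names and the CSV layout are assumptions made for illustration.
    static Map<String, TreeMap<String, String>> route(List<String> lines) {
        Map<String, TreeMap<String, String>> perTable = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length != 3) {
                continue; // skip malformed records
            }
            String userId = fields[0];
            String orderId = fields[1];
            String amount = fields[2];
            // One record feeds two tables; TreeMap keeps row keys sorted per table.
            perTable.computeIfAbsent("users", t -> new TreeMap<>()).put(userId, orderId);
            perTable.computeIfAbsent("orders", t -> new TreeMap<>()).put(orderId, amount);
        }
        return perTable;
    }

    public static void main(String[] args) {
        List<String> input = List.of("u2,o9,15.00", "u1,o3,7.50");
        route(input).forEach((table, rows) -> System.out.println(table + " -> " + rows));
    }
}
```

In a real job the mapper would do this routing, for example by tagging each output key with its target table, and the output stage would write one set of StoreFiles per table, so the dataset is still scanned only once.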
