I want to bulk load data into multiple tables using a single MapReduce job. Since the data volume is high, it would be time-consuming to iterate through the dataset twice and load with multiple jobs. Is there any way to do this? Thanks in advance.
I am using HBase, but I haven't needed bulk load yet. However, I came across this documentation page, which might help you:
http://hbase.apache.org/book/arch.bulk.load.html
The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly loads the generated StoreFiles into a running cluster. Using bulk load will use less CPU and network resources than simply using the HBase API.
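To target several tables from one bulk-load job, HBase 2.x ships `MultiTableHFileOutputFormat`, which sorts and partitions HFiles for all destination tables in a single shuffle by prefixing each row key with its table name. Below is a minimal sketch of such a driver; the table names (`tableA`, `tableB`), the column family `cf`, and the toy input format `tableName,rowKey,value` are all placeholder assumptions, and you would still run the bulk-load tool over each table's output subdirectory afterwards.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2;
import org.apache.hadoop.hbase.mapreduce.MultiTableHFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultiTableBulkLoad {

  // Mapper: routes each record to its destination table by building a
  // composite key (table name + row key), so one job can produce HFiles
  // for several tables.
  public static class RouterMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws java.io.IOException, InterruptedException {
      // Assumed toy input line: "tableName,rowKey,value"
      String[] f = value.toString().split(",");
      byte[] row = Bytes.toBytes(f[1]);
      Put put = new Put(row);
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(f[2]));
      byte[] composite =
          MultiTableHFileOutputFormat.createCompositeKey(Bytes.toBytes(f[0]), row);
      ctx.write(new ImmutableBytesWritable(composite), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "multi-table bulk load");
    job.setJarByClass(MultiTableBulkLoad.class);
    job.setMapperClass(RouterMapper.class);
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Register every destination table; the output format writes one
    // subdirectory of HFiles per table, partitioned by its regions.
    try (Connection conn = ConnectionFactory.createConnection(conf)) {
      List<HFileOutputFormat2.TableInfo> tables = Arrays.asList(
          new HFileOutputFormat2.TableInfo(
              conn.getTable(TableName.valueOf("tableA")).getDescriptor(),
              conn.getRegionLocator(TableName.valueOf("tableA"))),
          new HFileOutputFormat2.TableInfo(
              conn.getTable(TableName.valueOf("tableB")).getDescriptor(),
              conn.getRegionLocator(TableName.valueOf("tableB"))));
      MultiTableHFileOutputFormat.configureIncrementalLoad(job, tables);
    }
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note this only generates the StoreFiles; completing the load still means pointing the bulk-load tool (`completebulkload`) at each table's subdirectory under the job output path. If you are on an older HBase without this output format, an alternative is `MultiTableOutputFormat`, which writes `Put`s to multiple tables through the normal API rather than as HFiles, trading bulk-load efficiency for simplicity.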