I have saved my data crawled by Nutch into HBase, whose underlying file system is HDFS. Then I copied the data (one HBase table) from HDFS directly to a local directory with the command
hadoop fs -copyToLocal /hbase/input ~/Documents/output
After that, I copied the data back into another HBase instance (on a different system) with the following command
hadoop fs -copyFromLocal ~/Documents/input /hbase/mydata
It is saved in HDFS, and when I run the list command in the HBase shell it shows 'mydata' as another table, but when I run the scan command it says there is no table named 'mydata'.
What is the problem with the above procedure? In simple words:
- I want to copy an HBase table to my local file system using a Hadoop command
- Then, I want to save it directly into HDFS on another system using a Hadoop command
- Finally, I want the table to appear in HBase and display its data just like the original table
If you want to export a table from one HBase cluster and import it into another, use one of the following methods:
Using Hadoop
Export
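A sketch of the command; the HBase jar path and version depend on your installation, and <tablename> and <outputdir> are placeholders you supply:

$ bin/hadoop jar /path/to/hbase-{version}.jar export <tablename> <outputdir>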
NOTE: Copy the output directory in HDFS from the source to the destination cluster
Import
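Likewise for the import side (same assumptions about the jar path and placeholders):

$ bin/hadoop jar /path/to/hbase-{version}.jar import <tablename> <inputdir>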
Note: Both outputdir and inputdir are in HDFS.
Using HBase
Export
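Roughly, with <tablename> and <outputdir> as placeholders:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir>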
Copy the output directory in HDFS from the source to the destination cluster
Import
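And to import on the destination (placeholders as above; note that the table must already exist on the destination cluster before running Import):

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>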
Reference: HBase tool to export and import
If you can use HBase commands instead to back up HBase tables, you can use the HBase ExportSnapshot tool, which copies the HFiles, logs, and snapshot metadata to another filesystem (local/HDFS/S3) using a MapReduce job.
Take a snapshot of the table
$ ./bin/hbase shell
hbase> snapshot 'myTable', 'myTableSnapshot-122112'
Export to the required file system
$ ./bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot myTableSnapshot-122112 -copy-to fs://path_to_your_directory
You can export the snapshot back from the local file system to hdfs://srv2:8082/hbase and run the restore command from the HBase shell to recover the table from the snapshot, as sketched below.
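For example, in the HBase shell on the destination cluster (table and snapshot names follow the example above): use clone_snapshot if the table does not exist there yet, or disable the table first and use restore_snapshot if it does:

hbase> clone_snapshot 'myTableSnapshot-122112', 'myTable'

or, for an existing table:

hbase> disable 'myTable'
hbase> restore_snapshot 'myTableSnapshot-122112'
hbase> enable 'myTable'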
Reference: HBase Snapshots