Difference between HBase copy and snapshot commands

Posted 2019-07-25 13:59

Question:

I have a table in HBase that contains a huge amount of data, and I want to take a backup of it. Which of the following is the better approach in this situation?

1--Use the copy command to back up the table
2--Take a snapshot of the table

Also, please explain the internal mechanism of a snapshot. Is it simply renaming the table?

Regards Amit

Answer 1:

A snapshot is the better choice.

  • HBase snapshots let you capture the state of a table with little impact on the Region Servers. Snapshot, clone, and restore operations don't involve copying data. Exporting a snapshot to another cluster also has no impact on the Region Servers.

Prior to version 0.94.6, the only ways to back up or clone a table were to use CopyTable/ExportTable, or to copy all the HFiles in HDFS after disabling the table. The disadvantage of CopyTable/ExportTable is that it can degrade region server performance. The disadvantage of copying the HFiles is that the table must be disabled, which means no reads or writes, and this is usually unacceptable.

  • A snapshot is not just a rename. If you run multiple operations on a table and want to be able to return to one particular point in time, a snapshot is the right tool. A snapshot is a set of metadata information that allows an admin to get back to a previous state of the table. A snapshot is not a copy of the table; it’s just a list of file names and doesn’t copy the data. A full snapshot restore means that you get back the previous “table schema” and your previous data, losing any changes made since the snapshot was taken.
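To make the "list of file names" idea concrete, here is a minimal toy model in Python. It is not HBase code and does not use any HBase API: `Table`, `Snapshot`, `flush`, `take_snapshot`, `clone_snapshot`, `restore_snapshot`, and `read_all` are hypothetical names invented for this sketch, and a plain dict stands in for HDFS. It assumes only what the answer states (files are never copied, a snapshot records the schema and file names) and ignores details such as in-memory writes and compactions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Toy stand-in for HDFS: file name -> contents. Files are write-once.
FileSystem = Dict[str, bytes]


@dataclass
class Table:
    """Hypothetical table: a schema plus the names of the files that hold its data."""
    schema: Dict[str, str]
    file_names: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class Snapshot:
    """Metadata only: the schema and the file names at snapshot time."""
    schema: Tuple[Tuple[str, str], ...]
    file_names: Tuple[str, ...]


def flush(fs: FileSystem, table: Table, name: str, data: bytes) -> None:
    """Persist new writes as a new immutable file and register it with the table."""
    if name in fs:
        raise ValueError("files are write-once in this toy model")
    fs[name] = data
    table.file_names.append(name)


def take_snapshot(table: Table) -> Snapshot:
    """Record schema and file names; no file contents are read or copied."""
    return Snapshot(tuple(sorted(table.schema.items())), tuple(table.file_names))


def clone_snapshot(snap: Snapshot) -> Table:
    """Create a new table that references the same files as the snapshot."""
    return Table(dict(snap.schema), list(snap.file_names))


def restore_snapshot(table: Table, snap: Snapshot) -> None:
    """Roll the table back to the snapshot's schema and file list."""
    table.schema = dict(snap.schema)
    table.file_names = list(snap.file_names)


def read_all(fs: FileSystem, table: Table) -> List[bytes]:
    """Return the contents of every file the table currently references."""
    return [fs[name] for name in table.file_names]


if __name__ == "__main__":
    fs: FileSystem = {}
    orders = Table({"cf": "v1"})
    flush(fs, orders, "file-1", b"rows written before the snapshot")
    snap = take_snapshot(orders)
    flush(fs, orders, "file-2", b"rows written after the snapshot")
    restore_snapshot(orders, snap)
    print(orders.file_names)
```

In this sketch, restoring discards the table's reference to the file written after the snapshot, which mirrors the "losing any changes made since the snapshot was taken" behaviour described above. Taking, cloning, and restoring a snapshot only touch lists of names, which is why the cost does not grow with the amount of data in the table.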

Also, see Snapshots and Repeatable reads for HBase Tables

Snapshot Internals



Tags: hadoop hbase