Is a ZooKeeper snapshot file enough to restore sta

Posted 2019-02-19 23:40

Question:

I am learning about ZooKeeper and looking at options to back up data stored in ZooKeeper. ZooKeeper writes two kinds of data files: snapshots and transaction logs. It is often mentioned that snapshots are "fuzzy" and need a transaction log to be replayed over them to get an up-to-date state.

In the case of Observers, no transaction log is persisted to disk. If I were to take the snapshot written by an observer (or by a leader/follower, without the transaction log) and place it into a new standalone ZooKeeper, would ZooKeeper's state be guaranteed to be the same as it was when the snapshot was written to disk?

In other words, to restore ZooKeeper to its current state from a backup, you need both the snapshot and the transaction log. If I were content with restoring only to the time the snapshot was taken, would the snapshot alone be enough?

Answer 1:

No. The snapshot file is not enough to guarantee a return to a previous state. In fact, the snapshot file may not even represent the state of the tree at any point in time.

From the O'Reilly ZooKeeper book:

Let’s walk through an example to illustrate this. Say that a data tree has only two znodes: /z and /z'. Initially, the data of both /z and /z' is the integer 1. Now consider the following sequence of steps:

  1. Start a snapshot.
  2. Serialize and write /z = 1 to the snapshot.
  3. Set the data of /z to 2 (transaction T).
  4. Set the data of /z' to 2 (transaction Tʹ).
  5. Serialize and write /z' = 2 to the snapshot.

This snapshot contains /z = 1 and /z' = 2. However, there has never been a point in time in which the values of both znodes were like that. This is not a problem, though, because the server replays transactions. It tags each snapshot with the last transaction that has been committed when the snapshot starts (call it TS). If the server eventually loads the snapshot, it replays all transactions in the transaction log that come after TS. In this case, they are T and Tʹ. After replaying T and Tʹ on top of the snapshot, the server obtains /z = 2 and /z' = 2, which is a valid state.
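
The five steps in the book's example can be reproduced with a small simulation. This is only an illustrative sketch in Python, not ZooKeeper's actual code: the dictionary-based tree, the `commit` helper, and the integer transaction ids are assumptions made for the demonstration.

```python
def take_fuzzy_snapshot():
    """Replay the book's five steps against a toy in-memory data tree."""
    tree = {"/z": 1, "/z'": 1}  # live data tree
    log = []                    # toy transaction log: (zxid, path, value)

    def commit(path, value):
        # Hypothetical helper: log the transaction, then apply it to the tree.
        zxid = len(log) + 1
        log.append((zxid, path, value))
        tree[path] = value

    ts = len(log)                  # 1. start snapshot, tagged with TS (0 here)
    snapshot = {}
    snapshot["/z"] = tree["/z"]    # 2. serialize /z = 1
    commit("/z", 2)                # 3. transaction T
    commit("/z'", 2)               # 4. transaction T'
    snapshot["/z'"] = tree["/z'"]  # 5. serialize /z' = 2
    return ts, snapshot, log


def restore(ts, snapshot, log):
    """Load the snapshot, then replay every logged transaction after TS."""
    tree = dict(snapshot)
    for zxid, path, value in log:
        if zxid > ts:
            tree[path] = value
    return tree


ts, snapshot, log = take_fuzzy_snapshot()
print(snapshot)                     # {'/z': 1, "/z'": 2}  (never a real state)
print(restore(ts, snapshot, log))   # {'/z': 2, "/z'": 2}  (valid state)
```

Restoring from `snapshot` alone yields the mixed state, which is why the log is needed. Replaying T and Tʹ over values the snapshot already contains is harmless here because each transaction sets an absolute value.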

You may find that, for your ZooKeeper data structure, the fuzzy snapshot is acceptable, but if you want to guarantee a valid tree, take both the snapshot and the transaction log.
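
As a rough sketch of "take both", the helper below picks which files to copy from a directory listing. It assumes the usual on-disk naming, where files are called `snapshot.<zxid>` and `log.<zxid>` with a hexadecimal suffix, and that a log file's suffix is the first transaction id it contains. The function names are hypothetical and the selection rule is deliberately conservative, so check it against your ZooKeeper version before relying on it.

```python
def zxid_of(name):
    """Parse the hex zxid suffix of a name such as 'snapshot.c' or 'log.8'."""
    return int(name.split(".", 1)[1], 16)


def files_to_back_up(names):
    """Return the latest snapshot plus every log that may hold later transactions."""
    snapshots = [n for n in names if n.startswith("snapshot.")]
    logs = sorted((n for n in names if n.startswith("log.")), key=zxid_of)
    if not snapshots:
        raise ValueError("no snapshot file found")

    latest = max(snapshots, key=zxid_of)
    snap_zxid = zxid_of(latest)

    # The log that was open when the snapshot started is the one with the
    # largest starting zxid that does not exceed the snapshot's zxid.
    earlier_starts = [zxid_of(n) for n in logs if zxid_of(n) <= snap_zxid]
    first_needed = max(earlier_starts) if earlier_starts else 0

    needed_logs = [n for n in logs if zxid_of(n) >= first_needed]
    return [latest] + needed_logs


# Hypothetical directory listing, for illustration only.
listing = ["log.1", "snapshot.5", "log.8", "snapshot.c", "log.f"]
print(files_to_back_up(listing))  # ['snapshot.c', 'log.8', 'log.f']
```

Copying one log more than strictly necessary is safe, since transactions at or before the snapshot's tag are skipped on replay.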