What do matrix clocks solve but vector clocks can&

2020-07-10 03:34发布

问题:

I understand the need for vector clocks in terms of scalar logical clocks failing to provide enough information to tell whether there is an update conflict in a key value store update for example.

But I am not sure what problem is still unsolved by vector clocks and then solved by the more bulky matrix clocks?

回答1:

In an eventual consistency environment all messages ever created by a system need to be kept until every peer has received the message (== eventual consistency). But you don't want to keep messages for ever, so you need to have a way to tell which messages were received by all nodes and can be deleted, this is why you use matrix clocks.

Matrix clocks are a list of vector clocks, so you know the current state of each node in the system. Based on this you can know which peer received already which messages. When you exchange messages with another node in the system you compare the matrix clocks and remember always the highest values for each node. Afterwards you can delete messages which were sent before, because the node already must have received them.

This is a very brief description of TSAE (timestamped anti-entropy) protocol. You can read more about it in the dissertation project Weak-consistency group communication and membership by Richard Andrew Golding from 1992 (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.7385&rep=rep1&type=pdf) starting from chapter 5.



回答2:

The distinctions among Lamport clock (scalar logical clock, in your term), vector clock, and matrix clock lie in that they represent different levels of knowledge.

For vector clock $vt_i[1 \ldots n]$ in site $i$, the entry $vt_i[k]$ represents the knowledge the site $S_i$ has about site $S_k$. The knowledge has the form of "$i$ knows $k$ that $\ldots$".

For matrix clock $mt_i[1 \ldots n, 1 \ldots n]$ in site $S_i$, the entry $mt_i[k,l]$ represents the knowledge the site $S_i$ has about the knowledge by $S_k$ about site $S_l$. The knowledge here the form of "$i$ knows $k$ knows $l$ that $\ldots$".

Intuitively, we can do more things with more knowledge.

The following description is mainly quoted from [1]:

Vector clocks and matrix clocks are widely used in asynchronous distributed message-passing systems.

Some example areas using vector clocks are checkpointing, causal memory, maintaining consistency of replicated files, global snapshot, global time approximation, termination detection, bounded multiwriter construction of shared variables, mutual exclusion and debugging (predicate detection).

Some example areas that use matrix clocks are designing fault-tolerant protocols and distributed database protocols, including protocols to discard obsolete information in distributed databases, and protocols to solve the replicated log and replicated dictionary problems.

For matrix clock, we notice that $min_k(mt_i[k,i]) \ge t$ means that site $S_i$ knows that every other site $k$ knows its progress till its local time $t$.

It is this property that allows a site to no longer send an information with a local time $\le t$ or to discard obsolete information.

[1] Concurrent Knowledge and Logical Clock Abstractions Ajay D. Kshemkalyani 2000