In Google Spanner, is it possible that the exact s

Posted 2019-07-21 17:34

Question:

In Google Spanner, commit timestamps are generated by the server and based on "TrueTime", as discussed in https://cloud.google.com/spanner/docs/commit-timestamp. This page also states that timestamps are not guaranteed to be unique, so multiple independent writers can generate timestamps that are exactly the same.

In the documentation of consistency guarantees, it is stated that "In addition if one transaction completes before another transaction starts to commit, the system guarantees that clients can never see a state that includes the effect of the second transaction but not the first."

What I'm trying to understand is the combination of

  1. Multiple concurrent transactions committing "at the same time" resulting in the same commit timestamp (where the commit timestamp forms part of a key for the table)
  2. A reader observing new rows being entered into the above table

Under these circumstances, is it possible that a reader can observe some but not all of the rows that will (eventually) be stored with the exact same timestamp? Or put differently: if searching for all rows up to a known exact timestamp while rows are being inserted with that timestamp, is it possible that the query first returns some of the results, but returns more when executed again?

The context of this is an attempt to model a stream of events ordered by time in an append-only manner. I need to be able to keep what is effectively a cursor to a particular point in time (a point in the stream of events), and I need to know whether or not having observed events at time T means you can never get more events again at exactly time T.

Answer 1:

Spanner is externally consistent, meaning that any reader will only be able to read the results of completed transactions...

As with all externally consistent DBs, it is not possible for a reader outside of a transaction to read the 'pending state' of another transaction. So a reader at time T will only be able to see transactions that have been committed before time T.

Multiple simultaneous insert/update transactions at commit time T (which would affect different rows, otherwise they could not be simultaneous) would not be seen by a reader at time T, but all of them would be seen by a reader at T+1.

"I ... need to know whether or not having observed events at time T means you can never get more events again at exactly time T."

Yes-ish. Rephrasing slightly, as this is nuanced:
Having read events up to and including time T means you will never get any more events with a timestamp equal to or before time T.
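
To make the cursor idea concrete, here is a minimal toy sketch in Python. It does not talk to Spanner at all: `Event`, `poll`, and the integer timestamps are hypothetical stand-ins, and it simply assumes the guarantee as rephrased above holds (once you have read up to T, nothing new appears at or before T). Under that assumption the cursor can safely advance to the read timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    commit_ts: int  # hypothetical integer stand-in for a commit timestamp
    payload: str


def poll(log, cursor, read_ts):
    """Return events with cursor < commit_ts <= read_ts, plus the new cursor.

    Assumes (per the answer) that no event with commit_ts <= read_ts
    can become visible after a read at read_ts has completed.
    """
    batch = sorted(
        (e for e in log if cursor < e.commit_ts <= read_ts),
        key=lambda e: (e.commit_ts, e.payload),
    )
    return batch, read_ts  # the cursor advances to the read timestamp


# Two writers that happened to get the same commit timestamp.
log = [Event(5, "a"), Event(5, "b")]
first_batch, cursor = poll(log, cursor=0, read_ts=5)

# A later commit; polling from the saved cursor returns only the new event.
log.append(Event(7, "c"))
second_batch, cursor = poll(log, cursor=cursor, read_ts=8)
```

Both events at timestamp 5 come back together in `first_batch`, and the second poll uses a strict `>` on the cursor, so nothing is read twice.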

But remember that the commit timestamp column is a simple TIMESTAMP column where any value can be stored -- it is the application that requests that the value stored is the commit timestamp, and there is nothing at the DB level to stop the application storing any value it likes...
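
A small toy illustration of why this matters for a cursor, again in plain Python rather than against Spanner (the `rows` list, `poll`, and `never_observed` are made up for the example). If the application writes an explicit, backdated value instead of requesting the commit timestamp, a reader that has already moved its cursor past that value will never see the row.

```python
def poll(rows, cursor, read_ts):
    """Return rows with cursor < ts <= read_ts and the advanced cursor."""
    batch = [r for r in rows if cursor < r["ts"] <= read_ts]
    return batch, read_ts


def never_observed(rows, seen_ids):
    """IDs of stored rows that the cursor-based reader never returned."""
    return [r["id"] for r in rows if r["id"] not in seen_ids]


rows = [{"id": "e1", "ts": 10}]  # ts here stands in for a real commit timestamp
seen = set()

batch, cursor = poll(rows, cursor=0, read_ts=10)
seen.update(r["id"] for r in batch)

# The application stores its own value (9) instead of the commit timestamp.
rows.append({"id": "e2", "ts": 9})

batch, cursor = poll(rows, cursor=cursor, read_ts=20)
seen.update(r["id"] for r in batch)

print(never_observed(rows, seen))  # prints ['e2']
```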

As always with Spanner, it is the application which has to enforce/maintain the data integrity.