I am a newcomer to Postgres and love it so far. I've given this problem a lot of thought already, RTFM to the best of my ability, but hit a dead end, so I need a nudge in the right direction.
I'm designing a database where each entity of interest has a rowversion
column that gets assigned a value from a global sequence. So, in the simplest scenario, in a table emps
with two rows: emp1
with rowversion@3
and emp2
with rowversion@5
, I know emp2
was modified after emp1
(ie in a later transaction - don't really mind if rows within the same transaction have the same rowversion
).
This is to form the foundation of a data sync logic, where a client that knows they have everything up until @3, can get the latest updates using a query such as SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor
.
Here is an example scenario for a client already updated @3 - assume the following transactions since:
@3 - committed
@4 - committed
@5 - committed
@6 - in progress - not committed yet
@7 - committed
@8 - in progress - not committed yet
@9 - committed
Client update is performed in three stages:
- Ask the database for an appropriate
new_anchor
. - Perform
SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor
. - Pass the
new_anchor
value back to the client, along with the result data.
Since the rows with rowversion
@6 and @8 are still in progress, new_anchor
has to be @5, so that our range query doesn't miss any uncommitted updates. Now the client can be confident it has everything up until @5.
So the actual problem distilled: how can this new_anchor
be safely determined without forcing SERIALIZABLE
or otherwise seriously hurting performance?
As you can probably tell I've borrowed this idea from SQL Server, where this problem is trivially solved by the min_active_rowversion()
function. This function would return @6 in the above scenario, so your new_anchor
can safely be min_active_rowversion() - 1
. I sort of had an idea how this could be implemented in Postgres using an active_rowversions
table, triggers, and SELECT min(id) FROM active_rowversions
, but that would require READ UNCOMMITTED
isolation, which is not available in Postgres.
I would really appreciate any help or ideas.