I am planning to use postgres LISTEN/NOTIFY aproach to get insert time(actual transaction commit time) of records in a table. To achieve this, I plan to do the following. I issue a notification during insert time as shown below.
BEGIN;
INSERT INTO table_name(id, ...) values (id,....);
select pg_notify('test_channel', 'id - ' || id || ' trans start time - ' || now() || ' notify start time - ' || clock_timestamp());
END;
And then I plan to use https://pythonhosted.org/psycopg2/advanced.html#asynchronous-notifications to receive those notifications.
What I would like to find out is the exact time the transaction commit happens(the record is available to read) down to micro secods
I understand that NOTIFY(pg_notify) actually sends notification right after the commit of the transaction but I couldnt figure out how to find out the exact time when it happens. The clock timestamp value I have in NOTIFY, is not the acutal transaction commit time.
I guess the time I listen to notification will be close to transaction commit time but I am not sure how close it is. First, there is some time between polls in my code while listening(however small it is) and second, I am not sure if there is any lag between NOTIFY/LISTEN communication itself.
Any ideas?
UPDATE(Complete description of problem): We have a reader selecting rows in batches using a “checkpoint” time, where each batch gets the rows after the last timestamp in the previous batch, and we are missing rows. (Reason: The timestamp value is based on the time INSERT happens(00.00.00). At heavy loads, if the transaction takes longer time, it gets inserted let say 10 sec later(00.00.10), the reader will miss this row(row1) if it reads during that 10 seconds and finds a row which had its INSERT time at a later time(00.00.05) than row1. The complete description of the problem is similar to the one written in this blog. http://blog.thefourthparty.com/stopping-time-in-postgresql/)
What I would like to find out is the exact time the transaction commit happens(the record is available to read) down to micro secods
Handily, PostgreSQL 9.5 just added that, in the form of support for commit timestamps. See commit timestamps. Note that you must have track_commit_timestamp
enabled to use this, and that information about commit timestamps isn't kept forever, so fairly old rows will just get a null result.
You can get the transaction ID with txid_current()
at any point during the transaction. Perhaps using insert ... returning ...
for example. Then you can look up the commit timestamp in a subsequent query, after commit.
For older versions, you should just include clock_timestamp
in your insert ... returning ...
clause. It will be the time the record was inserted, not the commit time, but that's really the closest it is possible to get.
I guess the time I listen to notification will be close to transaction commit time but I am not sure how close it is.
"fairly". It will depend on network latency, CPU scheduling lag, etc. It sure won't be microsecond-accurate.
For example, on Windows it will be at best accurate to the millisecond, but by default it'll be accurate to the nearest 15-millisecond timer tick.
First, there is some time between polls in my code while listening(however small it is)
Don't poll. select()
the socket so you're woken the instant there's data to read. On Linux you'd ideally use the epoll()
system call for this.
and second, I am not sure if there is any lag between NOTIFY/LISTEN communication itself.
Some, yes, because a transaction commit takes time. So there's some non-zero time between when you issue the NOTIFY
and when the event is sent to listeners.
now() will always be less than then the actual time where the transaction is commited and visible to readers (unless you have dirty read as isolation level).
A better approach (one that does not have race-conditions) is call pg_notify() and use clock_timestamp() right after the end of the transaction (and only if the transaction commits).