I have been reading thoroughly the postgres documentation on transaction isolation suggested in other of my questions but I have not still managed to understand the "predicate locking" stuff.
I hope somebody can enlighten me :-)
According to the documentation: Predicate locks in PostgreSQL, like in most other database systems, are based on data actually accessed by a transaction
That sounds good, then why is the following happening?
CREATE TABLE mycustomer(cid integer PRIMARY KEY, licenses integer);
CREATE TABLE mydevice(id integer PRIMARY KEY, cid integer REFERENCES
mycustomer (cid), status varchar(10));
INSERT INTO mycustomer(cid, licenses) VALUES (1, 5);
INSERT INTO mycustomer(cid, licenses) VALUES (2, 5);
Request 1 Request2
BEGIN TRANSACTION ISOLATION
LEVEL SERIALIZABLE;
BEGIN TRANSACTION ISOLATION
LEVEL SERIALIZABLE;
SELECT * from mydevice where cid = 1;
SELECT * from mydevice where cid = 2;
INSERT INTO mydevice(id, cid, status)
VALUES (1, 1, 'ok');
INSERT INTO mydevice(id, cid, status)
VALUES (2, 2, 'ok');
commit;
(=ok)
commit;
(=rollback)
I understand that the inserts from request 1 and request 2 are not conflicting with the previous reads and thus there should not be any error launched. Why am I getting a "ERROR: could not serialize access due to read/write dependencies among transactions".
As you can imagine I cannot have the aforementioned behavior happening since every concurrent request would be roll-backed regardless of its details. In my business scenario I would like concurrent requests to be only roll-backed when they were inserting data (as per the example devices) for the same single customer.
These operations are performed from a Java application and In principle I am thinking about creating a locking table to satisfy my needs. Any ideas?
Many thanks!
For those more curious,in Postgres 9.1 Source code, if you look at
src/backend/storage/lmgr/README-SSI
, there is lot of detailed description about Predicate locking and Serializable transactionsHere's a snippet of the same:
From the Transaction Isolation page:
An
EXPLAIN
on thatSELECT
can tel you what the query plan is being taken, but if the table is small (or empty!), PostgreSQL will almost certainly pick a sequential scan instead of referencing the index. This will cause a predicate lock on the entire table, causing serialization failure whenever another transaction does anything to the table.On my system:
You could try adding an index and force it to use that:
However, this is not the correct solution. Let's back up a little bit.
Serializable is meant to guarantee that transactions will have exactly the same effect as if they were run one after another, despite the fact that you're actually running these transactions concurrently. PostgreSQL does not have infinite resources, so while it's true that it puts predicate locks on data that your query actually accesses, "data" can mean more than "rows returned".
PostgreSQL chooses to flag serialization failures when it thinks there might be a problem, not when it's certain. (Hence how it generalizes row locks to page locks.) This design choice cause false positives, such as the one in your example. False positives are less than ideal, however, it doesn't affect correctness of the isolation semantics.
The error message is:
That hint is key. Your application needs to catch serialization failures and retry the whole operation. This is true whenever
SERIALIZABLE
is in play -- it guarantees serial correctness despite concurrency, but it can't do that without the help of your application. Put another way, if you're actually doing concurrent modifications, the only way PostgreSQL can satisfy the isolation requirements is to ask your application to serialize itself. Thus: