I'm trying to write a query that updates a row in the "claim" table to an active status only if a user doesn't have more than two active claims open already. So it's very important for data integrity that a user never has more than two active claims open at any given time.
I'm running this query in a concurrent environment, so it's possible that two processes execute this query at the same time. I'm also running it under the default Read Committed
isolation level.
I'm wondering if I'm open to the possibility that a user could have more than two active claims open at some point, due to a race condition between the subselect and the update clause.
On the same note, performance isn't nearly as important for this query as data integrity is.
update claim
set is_active = '1'
where claim.id = %s
and (2 > (select count(*)
from claim as active_claim
where active_claim.user_id = %s
and active_claim.is_active = '1'))
Yes, it's absolutely possible that this could result in more than two active claims, because concurrent transactions can't see each others' changes, so two or more concurrent executions would both see 2 claims and both proceed to update their target claims to make them active.
See related: Do database transactions prevent race conditions.
Table lock
The simplest option is to simply:
BEGIN;
LOCK TABLE claim IN EXCLUSIVE MODE;
UPDATE ...
COMMIT;
... but that's a pretty heavy-weight solution.
Row-level lock on a user object
Assuming you have a table user
for the owner of the claims, you should instead:
SELECT 1 FROM user WHERE user_id = whatever FOR UPDATE
in the same transaction, before running your UPDATE
. That way you'll hold an exclusive row-lock on the user and other SELECT ... FOR UPDATE
statements will block on your lock. This lock will also block UPDATE
s to and deletes of the user
; it will not block plain SELECT
s of the user without a FOR UPDATE
or FOR SHARE
clause.
See explicit locking in the PostgreSQL manual.
SERIALIZABLE
isolation
An alternative is to use SERIALIZABLE
isolation; PostgreSQL 9.2 and newer have transaction dependency detection that would cause all but one of the conflicting transaction to abort with a serialization failure in the example you give above. So your app has to remember what it tried to do when it starts a transaction and be able to trap errors, detect that they're serialization failures, and re-try it after a serialization failure.
See transaction isolation in the PostgreSQL manual.
Advisory locks
Sometimes there's no good candidate object to take a row lock on, and for some reason or another serializable isolation won't solve the issue or isn't usable for other reasons. That's not the case for you, this is just for general information.
In such cases you can use PostgreSQL's advisory locks to lock arbitrary numeric values; in this case you'd pg_advisory_xact_lock(active_claim.user_id)
for example. The explicit locking chapter has more information.