Update where race conditions Postgres (read commit

2019-04-21 06:57发布

问题:

I'm trying to write a query that updates a row in the "claim" table to an active status only if a user doesn't have more than two active claims open already. So it's very important for data integrity that a user never has more than two active claims open at any given time.

I'm running this query in a concurrent environment, so it's possible that two processes execute this query at the same time. I'm also running it under the default Read Committed isolation level.

I'm wondering if I'm open to the possibility that a user could have more than two active claims open at some point, due to a race condition between the subselect and the update clause.

On the same note, performance isn't nearly as important for this query as data integrity is.

update claim
set is_active = '1'
where claim.id = %s
and (2 > (select count(*)
          from claim as active_claim
          where active_claim.user_id = %s
          and active_claim.is_active = '1'))

回答1:

Yes, it's absolutely possible that this could result in more than two active claims, because concurrent transactions can't see each others' changes, so two or more concurrent executions would both see 2 claims and both proceed to update their target claims to make them active.

See related: Do database transactions prevent race conditions.

Table lock

The simplest option is to simply:

BEGIN;
LOCK TABLE claim IN EXCLUSIVE MODE;
UPDATE ...
COMMIT;

... but that's a pretty heavy-weight solution.

Row-level lock on a user object

Assuming you have a table user for the owner of the claims, you should instead:

SELECT 1 FROM user WHERE user_id = whatever FOR UPDATE

in the same transaction, before running your UPDATE. That way you'll hold an exclusive row-lock on the user and other SELECT ... FOR UPDATE statements will block on your lock. This lock will also block UPDATEs to and deletes of the user; it will not block plain SELECTs of the user without a FOR UPDATE or FOR SHARE clause.

See explicit locking in the PostgreSQL manual.

SERIALIZABLE isolation

An alternative is to use SERIALIZABLE isolation; PostgreSQL 9.2 and newer have transaction dependency detection that would cause all but one of the conflicting transaction to abort with a serialization failure in the example you give above. So your app has to remember what it tried to do when it starts a transaction and be able to trap errors, detect that they're serialization failures, and re-try it after a serialization failure.

See transaction isolation in the PostgreSQL manual.

Advisory locks

Sometimes there's no good candidate object to take a row lock on, and for some reason or another serializable isolation won't solve the issue or isn't usable for other reasons. That's not the case for you, this is just for general information.

In such cases you can use PostgreSQL's advisory locks to lock arbitrary numeric values; in this case you'd pg_advisory_xact_lock(active_claim.user_id) for example. The explicit locking chapter has more information.