From the documentation for find_or_create:
Note: Because find_or_create() reads from the database and then
possibly inserts based on the result, this method is subject to a race
condition. Another process could create a record in the table after
the find has completed and before the create has started. To avoid
this problem, use find_or_create() inside a transaction.
Is it enough to just use find_or_create() inside a transaction in PostgreSQL?
No, the documentation is incorrect. Using a transaction alone does not avoid this problem. A transaction only guarantees that all of its changes are rolled back if an exception occurs, so that no inconsistent state is persisted to the database. It does not by itself stop another session from inserting the same record between your find and your create.
To avoid this problem you must lock the table. Do it inside a transaction, because all locks are released at the end of a transaction. Something like:
BEGIN;
LOCK TABLE mytbl IN SHARE MODE;
-- do your find_or_create here
COMMIT;
But that's not a magic cure for everything. It can become a performance problem, and there may be deadlocks (concurrent transactions each waiting for a resource the other has already locked; for instance, two transactions that both hold the SHARE lock and then both try to insert). PostgreSQL detects such a condition and resolves it by aborting one of the transactions involved, so that the others can proceed. You must be prepared to retry the operation on failure.
See the PostgreSQL manual on explicit locking for details.
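The advice to retry on failure could be sketched roughly as follows. This is only an illustration, not part of DBIx::Class: the helper name `with_deadlock_retry` and the default of three attempts are hypothetical, and the sketch assumes that a deadlock surfaces as a Perl exception whose text contains PostgreSQL's "deadlock detected" message. The names `$schema` and `MyTable` in the commented usage are placeholders as well.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code and retry it when the error text
# indicates a PostgreSQL deadlock. Any other error is rethrown as-is.
sub with_deadlock_retry {
    my ($code, $max_attempts) = @_;
    $max_attempts ||= 3;    # assumed default, tune as needed

    for my $attempt (1 .. $max_attempts) {
        my @result;
        my $ok = eval { @result = $code->(); 1 };
        return wantarray ? @result : $result[0] if $ok;

        my $err = $@;
        # Only deadlocks are worth retrying; everything else propagates.
        die $err unless $err =~ /deadlock detected/i;
        # Give up once the last allowed attempt has failed.
        die $err if $attempt == $max_attempts;
    }
}

# Usage sketch (placeholders: $schema, 'MyTable', the column name):
#
# my $row = with_deadlock_retry(sub {
#     $schema->txn_do(sub {
#         $schema->storage->dbh_do(sub {
#             my ($storage, $dbh) = @_;
#             $dbh->do('LOCK TABLE mytbl IN SHARE MODE');
#         });
#         return $schema->resultset('MyTable')
#                       ->find_or_create({ name => 'foo' });
#     });
# });
```

Each retry runs the whole transaction again, including the LOCK TABLE statement, because the aborted transaction has already released its locks.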
If you don't have a lot of concurrency you might also just ignore the problem. The time window is tiny, so the race only rarely occurs. If there is a unique constraint on the columns you look up and you catch the resulting duplicate key violation error, which does no harm, then you have covered this case, too.
This implementation of find_or_create should prevent the race condition described in the OP:
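eval {
    # Try the insert first; a unique constraint rejects a duplicate.
    $row = $self->model->create( { ... } );
};
if ($@ && $@ =~ /duplicate/i) {
    # Another process created the record first, so fetch it instead.
    $row = $self->model->find( { ... } );
}
elsif ($@) {
    die $@;    # rethrow anything that is not a duplicate-key error
}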
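It also reduces find_or_create() to a single query in the best case. Note that in PostgreSQL a failed statement aborts the enclosing transaction, so if this code runs inside an explicit transaction, the fallback find will fail unless the create is wrapped in a savepoint.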