How do you stop race conditions in MySQL? The problem at hand is caused by a simple algorithm:
- select a row from table
- if it doesn't exist, insert it
and then either you get a duplicate row, or if you prevent it via unique/primary keys, an error.
Now normally I'd think transactions help here, but because the row doesn't exist yet, a transaction doesn't actually help (or am I missing something?).
LOCK TABLE sounds like overkill, especially if the table is updated multiple times per second.
The only other solution I can think of is GET_LOCK() for every different id, but isn't there a better way? Are there no scalability issues here as well? And also, doing it for every table sounds a bit unnatural, as it sounds like a very common problem in high-concurrency databases to me.
I ran into the same problem and searched the Net for a moment :)
Finally I came up with a solution similar to the method used for creating filesystem objects in shared (temporary) directories to securely open temporary files:
Don't be afraid of the busy loop: normally it will execute only once or twice.
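The code that went with this answer is not shown here, so what follows is only a rough sketch of what such a retry loop might look like, not the poster's actual code. It assumes a hypothetical `items` table with a primary key on `id`, and it uses Python's built-in sqlite3 module as a stand-in for MySQL purely so that it runs anywhere; with a MySQL driver the duplicate-key exception class would differ.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical items table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def get_or_create(conn, item_id, name, max_attempts=5):
    """Return the (id, name) row for item_id, inserting it if it is missing."""
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT id, name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        if row is not None:
            return row
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO items (id, name) VALUES (?, ?)", (item_id, name)
                )
            return (item_id, name)
        except sqlite3.IntegrityError:
            # Someone else inserted the same id between our SELECT and INSERT.
            # Go around again; the SELECT should now find their row.
            continue
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

The loop only repeats when the insert loses the race, which is why it would normally run once or twice.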
On a technical level, a transaction will help here because other threads won't see the new row until you commit the transaction.
But in practice that doesn't solve the problem - it only moves it. Your application now needs to check whether the commit fails and decide what to do. I would normally have it roll back what it did and restart the transaction, because now the row will be visible. This is how transaction-based programming is supposed to work.
It seems to me you should have a unique index on your id column, so a repeated insert would trigger an error instead of being blindly accepted again.
That can be done by defining the id as a primary key or using a unique index by itself.
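As a small illustration of the unique-index idea, here is a sketch under some assumptions: the `items` table and the index name `items_id_uq` are made up, and Python's built-in sqlite3 module stands in for MySQL so the example is runnable. The point is only that the second insert of the same id fails with an error the application can catch.

```python
import sqlite3


def make_db():
    """In-memory table with a unique index on id (names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER NOT NULL)")
    conn.execute("CREATE UNIQUE INDEX items_id_uq ON items (id)")
    return conn


def insert_once(conn, item_id):
    """Return True if the row was inserted, False if the id already existed."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO items (id) VALUES (?)", (item_id,))
        return True
    except sqlite3.IntegrityError:
        # The unique index rejected the duplicate id.
        return False
```

Declaring `id` as the primary key instead of adding a separate unique index would give the same rejection of duplicates.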
I think the first question you need to ask is why do you have many threads doing the exact SAME work? Why would they have to insert the exact same row?
Once that is answered, I think that just ignoring the errors will be the most performant solution, but measure both approaches (GET_LOCK vs. ignoring errors) and see for yourself.
There is no other way that I know of. Why do you want to avoid errors? You still have to code for the case when another type of error occurs.
As staticsan says, transactions do help, but because they are usually implicit, two inserts run by different threads will each be inside their own implicit transaction and each see a consistent view of the database.
What you want is LOCK TABLES,
or, if that seems excessive, how about INSERT IGNORE with a check that the row was actually inserted.
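A minimal sketch of the "insert and then check" idea, with some loud assumptions: it runs against Python's built-in sqlite3 module, whose spelling is `INSERT OR IGNORE`, rather than against MySQL's `INSERT IGNORE`, and the `items` table is hypothetical. The check is the affected-row count: 1 means this caller created the row, 0 means it was already there.

```python
import sqlite3


def make_db():
    """In-memory stand-in for the real table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def insert_if_absent(conn, item_id, name):
    """Insert the row unless the id exists; report whether we inserted it."""
    with conn:  # commit on success
        cur = conn.execute(
            # SQLite syntax; the MySQL statement would start with INSERT IGNORE
            "INSERT OR IGNORE INTO items (id, name) VALUES (?, ?)",
            (item_id, name),
        )
    # rowcount is the number of rows the statement actually changed
    return cur.rowcount == 1
```

With a MySQL driver the same check would presumably be made on the driver's affected-row count or with `SELECT ROW_COUNT()`.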
I have a similar issue. I have a table that under most circumstances should have a unique ticket_id value, but there are some cases where I will have duplicates; not the best design, but it is what it is.
User B has reserved the ticket, User A reports back that the ticket has been taken by someone else.
The key in my instance is that you need a tie-breaker; in my case it's the auto-increment id on the row.
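One way the tie-breaker could work is sketched below. Everything specific here is an assumption rather than the poster's actual schema: the `reservations` table, the rule that the lowest auto-increment id wins, and the cleanup of the losing row. Python's built-in sqlite3 module stands in for MySQL so the sketch is runnable.

```python
import sqlite3


def make_db():
    """Hypothetical reservations table; ticket_id is deliberately not unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reservations ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " ticket_id INTEGER NOT NULL,"
        " user TEXT NOT NULL)"
    )
    return conn


def reserve(conn, ticket_id, user):
    """Insert a reservation, then check whether it won the tie-break."""
    with conn:
        cur = conn.execute(
            "INSERT INTO reservations (ticket_id, user) VALUES (?, ?)",
            (ticket_id, user),
        )
        my_id = cur.lastrowid
    # Assumed rule: the row with the lowest auto-increment id owns the ticket.
    winner_id = conn.execute(
        "SELECT MIN(id) FROM reservations WHERE ticket_id = ?", (ticket_id,)
    ).fetchone()[0]
    if winner_id == my_id:
        return True
    with conn:
        # We lost the race: remove our row and report the ticket as taken.
        conn.execute("DELETE FROM reservations WHERE id = ?", (my_id,))
    return False
```

In this sketch both users insert without any locking, and each one learns afterwards whether the ticket was already taken.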
Locking the entire table is indeed overkill. To get the effect that you want, you need something that the literature calls "predicate locks". No one has ever seen those except printed on the paper that academic studies are published on. The next best thing is locks on the "access paths" to the data (in some DBMSs: "page locks").
Some non-SQL systems allow you to do both (1) and (2) in one single statement, more or less meaning that the potential race conditions arising from your OS suspending your execution thread right between (1) and (2) are entirely eliminated.
Nevertheless, in the absence of predicate locks such systems will still need to resort to some kind of locking scheme, and the finer the "granularity" (/"scope") of the locks it takes, the better for concurrency.
(And to conclude: some DBMSs - especially the ones you don't have to pay for - do indeed offer no finer lock granularity than "the entire table".)