Do databases always lock non-existent rows after a

2020-02-06 18:09发布

Given:

customer[id BIGINT AUTO_INCREMENT PRIMARY KEY, email VARCHAR(30), count INT]

I'd like to execute the following atomically: Update the customer if he already exists; otherwise, insert a new customer.

In theory this sounds like a perfect fit for SQL-MERGE but the database I am using doesn't support MERGE with AUTO_INCREMENT columns.

https://stackoverflow.com/a/1727788/14731 seems to indicate that if you execute a query or update statement against a non-existent row, the database will lock the index thereby preventing concurrent inserts.

Is this behavior guaranteed by the SQL standard? Are there any databases that do not behave this way?

UPDATE: Sorry, I should have mentioned this earlier: the solution must use READ_COMMITTED transaction isolation unless that is impossible in which case I will accept the use of SERIALIZABLE.

4条回答
Ridiculous、
2楼-- · 2020-02-06 18:29

Answering my own question since there seems to be a lot of confusion around the topic. It seems that:

-- BAD! DO NOT DO THIS! --
insert customer (email, count) 
select 'foo@example.com', 0
where not exists (
      select 1 from customer
      where email = 'foo@example.com'
)

is open to race-conditions (see Only inserting a row if it's not already there). From what I've been able to gather, the only portable solution to this problem:

  1. Pick a key to merge against. This could be the primary key, or another unique key, but it must have a unique constraint.
  2. Try to insert a new row. You must catch the error that will occur if the row already exists.
  3. The hard part is over. At this point, the row is guaranteed to exist and you are protected from race-conditions by the fact that you are holding a write-lock on it (due to the insert from the previous step).
  4. Go ahead and update if needed or select its primary key.
查看更多
3楼-- · 2020-02-06 18:30

This question is asked about once a week on SO, and the answers are almost invariably wrong.

Here's the right one.

insert customer (email, count) 
select 'foo@example.com', 0
where not exists (
      select 1 from customer
      where email = 'foo@example.com'
)

update customer set count = count + 1
where email = 'foo@example.com'

If you like, you can insert a count of 1 and skip the update if the inserted rowcount -- however expressed in your DBMS -- returns 1.

The above syntax is absolutely standard and makes no assumption about locking mechanisms or isolation levels. If it doesn't work, your DBMS is broken.

Many people are under the mistaken impression that the select executes "first" and thus introduces a race condition. No: that select is part of the insert. The insert is atomic. There is no race.

查看更多
家丑人穷心不美
4楼-- · 2020-02-06 18:36
IF EXISTS (SELECT 1 FROM foo WHERE [email] = 'thisemail')
BEGIN
    UPDATE foo...
END
ELSE
BEGIN
    INSERT INTO foo...
END
查看更多
萌系小妹纸
5楼-- · 2020-02-06 18:47

Use Russell Fox's code but use SERIALIZABLE isolation. This will take a range lock so that the non-existing row is logically locked (together with all other non-existing rows in the surrounding key range).

So it looks like this:

BEGIN TRAN
IF EXISTS (SELECT 1 FROM foo WITH (UPDLOCK, HOLDLOCK) WHERE [email] = 'thisemail')
BEGIN
    UPDATE foo...
END
ELSE
BEGIN
    INSERT INTO foo...
END
COMMIT

Most code taken from his answer, but fixed to provided mutual exclusion semantics.

查看更多
登录 后发表回答