I have a table with unique constraint on it:
create table dbo.MyTab
(
MyTabID int primary key identity,
SomeValue nvarchar(50)
);
Create Unique Index IX_UQ_SomeValue
On dbo.MyTab(SomeValue);
Go
Which code is better to check for duplicates (success = 0 if duplicate found)?
Option 1
Declare @someValue nvarchar(50) = 'aaa'
Declare @success bit = 1;
Begin Try
Insert Into MyTab(SomeValue) Values ('aaa');
End Try
Begin Catch
-- lets assume that only constraint errors can happen
Set @success = 0;
End Catch
Select @success
Option 2
Declare @someValue nvarchar(50) = 'aaa'
Declare @success bit = 1;
IF EXISTS (Select 1 From MyTab Where SomeValue = @someValue)
Set @success = 0;
Else
Insert Into MyTab(SomeValue) Values ('aaa');
Select @success
From my point of view- i do believe that Try/Catch
is for errors, that were NOT expected (like deadlock or even constraints when duplicates are not expected). In this case- it is possible that sometimes a user will try to submit duplicate, so the error is expected.
I have found article by Aaron Bertrand that states- checking for duplicates is not much slower even if most of inserts are successful.
There is also loads of advices over the net to use Try/Catch (to avoid 2 statements not 1). In my environment there could be just like 1% of unsuccessful cases, so that kind of makes sense too.
What is your opinion? Whats other reasons to use option 1 OR option 2?
UPDATE: I'm not sure it is important in this case, but table have instead of update trigger (for audit purposes- row deletion also happens through Update statement).
I've seen that article but note that for low failure rates I'd prefer the "JFDI" pattern. I've used this on high volume systems before (40k rows/second).
In Aaron's code, you can still get a duplicate when testing first under high load and lots of writes. (explained here on dba.se) This is important: your duplicates still happen, just less often. You still need exception handling and knowing when to ignore the duplicate error (2627)
Edit: explained succinctly by Remus in another answer
However, I would have a separate TRY/CATCH to test only for the duplicate error
BEGIN TRY
-- stuff
BEGIN TRY
INSERT etc
END TRY
BEGIN CATCH
IF ERROR_NUMBER() <> 2627
RAISERROR etc
END CATCH
--more stuff
BEGIN CATCH
RAISERROR etc
END CATCH
To start with, the EXISTS(SELECT ...)
is incorrect as it fails under concurrency: multiple transactions could run the check concurrently and all conclude that they have to INSERT, one will be the the lucky winner that inserts first, all the rest will hit constraint violation. In other words you have a race condition between the check and the insert. So you will have to TRY/CATCH anyway, so better just try/catch.
Error logging
Don't hold me for this but there are likely logging implications when an exception is thrown. If you check before inserting no such thing happens.
Knowing why and when it can break
try/catch block should be used for parts that can break for non-deterministic reasons. I would say it's wiser in your case to check existing records because you know it can break and why exactly. So checking it yourself is from a developer's point of view a better way.
But in your code it may still break on insert because between the check time and insert time some other user inserted it already... But that is (as said previously) non-deterministic error. That's why you:
- should be checking with
exists
- inserting within
try/catch
Self explanatory code
Another positive is also that it is plain to see from the code why it can break while the try/catch block can hide that and one may remove them thinking why is this here, it's just inserting records...
Option - 3
Begin Try
SET XACT_ABORT ON
Begin Tran
IF NOT EXISTS (Select 1 From MyTab Where SomeValue = @someValue)
Begin
Insert Into MyTab(SomeValue) Values ('aaa');
End
Commit Tran
End Try
begin Catch
Rollback Tran
End Catch
Why not implement a INSTEAD OF INSERT
trigger on the table? You can check if the row exists, do nothing if it does, and insert the row if it doesn't.