Assume a table structure of MyTable(KEY, datafield1, datafield2...)
.
Often I want to either update an existing record, or insert a new record if it doesn't exist.
Essentially:
IF (key exists)
run update command
ELSE
run insert command
What's the best performing way to write this?
If you want to UPSERT more than one record at a time you can use the ANSI SQL:2003 DML statement MERGE.
Check out Mimicking MERGE Statement in SQL Server 2005.
You can use
MERGE
Statement, This statement is used to insert data if not exist or update if does exist.In SQL Server 2008 you can use the MERGE statement
Many people will suggest you use
MERGE
, but I caution you against it. By default, it doesn't protect you from concurrency and race conditions any more than multiple statements, but it does introduce other dangers:http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/
Even with this "simpler" syntax available, I still prefer this approach (error handling omitted for brevity):
A lot of folks will suggest this way:
But all this accomplishes is ensuring you may need to read the table twice to locate the row(s) to be updated. In the first sample, you will only ever need to locate the row(s) once. (In both cases, if no rows are found from the initial read, an insert occurs.)
Others will suggest this way:
However, this is problematic if for no other reason than letting SQL Server catch exceptions that you could have prevented in the first place is much more expensive, except in the rare scenario where almost every insert fails. I prove as much here:
don't forget about transactions. Performance is good, but simple (IF EXISTS..) approach is very dangerous.
When multiple threads will try to perform Insert-or-update you can easily get primary key violation.
Solutions provided by @Beau Crawford & @Esteban show general idea but error-prone.
To avoid deadlocks and PK violations you can use something like this:
or
Do an UPSERT:
http://en.wikipedia.org/wiki/Upsert