I'm in the business of making website and applications that are not mission critical -> eg. banking software, space flight, intensive care monitoring application, etc. You get the idea.
So, with that massive disclaimer, is it bad using the NOLOCK hint in some Sql statement? A number of years ago, it was suggested by a fellow Sql Administrator that I should use NOLOCK if I'm happy with a "dirty read" which will give me a bit more performance out of my system because each read doesn't lock the table/row/whatever.
I was also told that it's a great solution if I'm experiencing dead-locks. So, I started following that thought for a few years until a Sql guru was helping me with some random code and noticed all the NOLOCKS in my sql code. I was politely scolded and he tried to explain it to me (why it's not a good thing) and I sorta got lost. I felt that the essence of his explanation was 'it's a band-aid solution to a more serious problem .. especially if you're experiencing deadlocking. As such, fix the root of the problem'.
I did some googling recently about it and came across this post.
So, can some sql db guru sensei's please enlighten me?
The better solutions, when possible are:
The SNAPSHOT transaction isolation level was created because MS was losing sales to Oracle. Oracle uses undo/redo logs to avoid this problem. Postgres uses MVCC. In the future MS's Heckaton will use MVCC, but that's years away from being production ready.
NOLOCK is often exploited as a magic way to speed up database reads, but I try to avoid using it whever possible.
The result set can contain rows that have not yet been committed, that are often later rolled back.
An error or Result set can be empty, be missing rows or display the same row multiple times.
This is because other transactions are moving data at the same time you're reading it.
READ COMMITTED adds an additional issue where data is corrupted within a single column where multiple users change the same cell simultaneously.
I believe that it is virtually never correct to use nolock.
If you are reading a single row, then the correct index means that you won't need NOLOCK as individual row actions are completed quickly.
If you are reading many rows for anything other than temporary display, and care about being able repeat the result, or defend by the number produced, then NOLOCK is not appropriate.
NOLOCK is a surrogate tag for "i don't care if this answer contains duplicate rows, rows which are deleted, or rows which were never inserted to begin with because of rollback"
Errors which are possible under NOLOCK:
Any action which can cause a page split while the noLock select is running can cause these things to occur. Almost any action (even a delete) can cause a page split.
Therefore: if you "know" that the row won't be changed while you are running, don't use nolock, as an index will allow efficient retrieval.
If you suspect the row can change while the query is running, and you care about accuracy, don't use nolock.
If you are considering NOLOCK because of deadlocks, examine the query plan structure for unexpected table scans, trace the deadlocks and see why they occur. NOLOCK around writes can mean that queries which previously deadlocked will potentially both write the wrong answer.
As a professional Developer I'd say it depends. But I definitely follow GATS and OMG Ponies advice. Know What You are doing, know when it helps and when it hurts and
read hints and other poor ideas
what might make You understand the sql server deeper. I generally follow the rule that SQL Hints are EVIL, but unfortunately I use them every now and then when I get fed up with forcing SQL server do things... But these are rare cases.
luke
Doubt it was a "guru" who'd had any experience in high traffic...
Websites are usually "dirty" by the time the person is viewing the completely loaded page. Consider a form that loads from the database and then saves the data that's edited?? It's idiotic the way people go on about dirty reads being such a no no.
That said, if you have a number of layers building on your selects, you could be building in a dangerous redundancy. If you're dealing in money or status scenarios, then you need not only transactional data read/writes, but a proper concurrency solution (something most "gurus" don't bother with).
On the other hand, if you have an advanced product search for a website (ie something that likely won't be cached and be a little intensive) and you've ever built a site with more than a few concurrent users (phenominal how many "experts" haven't), it is rediculous to bottle neck every other process behind it.
Know what it means and use it when appropriate. Your database will almost always be your main bottle neck these days and being smart about using NOLOCK can save you thousands in infrastructure.
EDIT: It's not just deadlocks it helps with, it's also how much you are going to make everybody else wait until you're finished, or vice versa.
Using NOLOCK Hint in EF4?
In real life where you encounter systems already written and adding indexes to tables then drastically slows down the data loading of a 14gig data table, you are sometime forced to used WITH NOLOCK on your reports and end of month proessing so that the aggregate funtions (sum, count etc) do not do row, page, table locking and deteriate the overall performance. Easy to say in a new system never use WITH NOLOCK and use indexes - but adding indexes severly downgrades data loading, and when I'm then told, well, alter the code base to delete indexes, then bulk load then recreate the indexes - which is all well and good, if you are developing a new system. But Not when you have a system already in place.