Question
What is the benefit of applying locks to the below statement?
Similarly, what issue would we see if we didn't include these hints? i.e. Do they prevent a race condition, improve performance, or maybe something else? Asking as perhaps they're included to prevent some issue I've not considered rather than the race condition I'd assumed.
NB: This is an overflow from a question asked here: SQL Threadsafe UPDATE TOP 1 for FIFO Queue
The Statement In Question
WITH nextRecordToProcess AS
(
SELECT TOP(1) Id, StatusId
FROM DemoQueue
WHERE StatusId = 1 --Ready for processing
ORDER BY DateSubmitted, Id
)
UPDATE nextRecordToProcess
SET StatusId = 2 --Processing
OUTPUT Inserted.Id
Requirement
- The SQL is used to retrieve an unprocessed record from a queue.
- The record to be obtained should be the first record in the queue with status Ready (StatusId = 1).
- There may be multiple workers/sessions processing messages from this queue.
- We want to ensure that each record in the queue is only picked up once (i.e. by a single worker), and that each worker processes messages in the order in which they appear in the queue.
- It's OK for one worker to work faster than another (i.e. if Worker A picks up record 1 then Worker B picks up record 2 it's OK if worker B completes the processing of record 2 before Worker A has finished processing record 1). We're only concerned within the context of picking up the record.
- There's no ongoing transaction; i.e. we just want to pick up the record from the queue; we don't need to keep it locked until we come back to progress the status from
Processing
toProcessed
.
Additional SQL for Context:
CREATE TABLE Statuses
(
Id SMALLINT NOT NULL PRIMARY KEY CLUSTERED
, Name NVARCHAR(32) NOT NULL UNIQUE
)
GO
INSERT Statuses (Id, Name)
VALUES (0,'Draft')
, (1,'Ready')
, (2,'Processing')
, (3,'Processed')
, (4,'Error')
GO
CREATE TABLE DemoQueue
(
Id BIGINT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED
, StatusId SMALLINT NOT NULL FOREIGN KEY REFERENCES Statuses(Id)
, DateSubmitted DATETIME --will be null for all records with status 'Draft'
)
GO
Suggested Statement
In the various blogs discussing queues, and in the question which caused this discussion, it's suggested that the above statement be changed to include lock hints as below:
WITH nextRecordToProcess AS
(
SELECT TOP(1) Id, StatusId
FROM DemoQueue WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE StatusId = 1 --Ready for processing
ORDER BY DateSubmitted, Id
)
UPDATE nextRecordToProcess
SET StatusId = 2 --Processing
OUTPUT Inserted.Id
My Understanding
I understand that were locking required the benefits of these hints would be:
- UPDLOCK: Because we're selecting the record to update it's status we need to ensure that any other sessions reading this record after we've read it but before we've updated it won't be able to read the record with the intent to update it (or rather, such a statement would have to wait until we've performed our update and released the lock before the other session could see our record with its new value).
- ROWLOCK: Whilst we're locking the record, we want to ensure that our lock only impacts the row we're locking; i.e. as we don't need to lock many resources / we don't want to impact other processes / we want other sessions to be able to read the next available item in the queue even if that item's in the same page as our locked record.
- READPAST: If another session is already reading an item from the queue, rather than waiting for that session to release it's lock, our session should pick the next available (not locked) record in the queue.
i.e. Were we running the below code I think this would make sense:
DECLARE @nextRecordToProcess BIGINT
BEGIN TRANSACTION
SELECT TOP (1) @nextRecordToProcess = Id
FROM DemoQueue WITH (UPDLOCK, ROWLOCK, READPAST)
WHERE StatusId = 1 --Ready for processing
ORDER BY DateSubmitted, Id
--and then in a separate statement
UPDATE DemoQueue
SET StatusId = 2 --Processing
WHERE Id = @nextRecordToProcess
COMMIT TRANSACTION
--@nextRecordToProcess is then returned either as an out parameter or by including a `select @nextRecordToProcess Id`
However when the select and update occur in the same statement I'd have assumed that no other session could read the same record between our session's read & update; so there'd be no need for explicit lock hints.
Have I misunderstood something fundamentally with how locks work; or is the suggestion for these hints related to some other similar but different use case?