CURSOR vs. UPDATE

2019-08-04 09:09发布

问题:

A company uses a SQL Server database to store information about its customers and its business transactions. A new area code has been introduced for your city. The area code 111 remains the same for telephone numbers with prefixes that are less than 500. The numbers with prefixes that are 500 and greater will be assigned an area code of 222. All telephone numbers in the Phone column in the Customers table are stored as char(12) strings of the following format, ‘999-999-9999’. i must make the appropriate changes to the Customers table

as quickly as possible using the least administrative effort. Which one should I use ?

a.

UPDATE Customers SET Phone = ‘222-‘ + SUBSTRING(Phone,5,8) 
FROM Customers WHERE SUBSTRING(Phone,1,3) = ‘111’ 
AND SUBSTRING(Phone,5,3) >= 500 

b.

DECLARE PhoneCursor CURSOR FOR 
   SELECT Phone FROM Customers 
   WHERE SUBSTRING(Phone,1,3) = 111 
     AND SUBSTRING(Phone,5,3) >= 500 

OPEN PhoneCursor 
FETCH NEXT FROM PhoneCursor 

WHILE @@FETCH_STATUS = 0 
BEGIN 
    UPDATE Customers 
    SET Phone = ‘222’ + SUBSTRING(Phone,5,8) 
    WHERE CURRENT OF PhoneCursor 

    FETCH NEXT FROM PhoneCursor 
END 

CLOSE PhoneCursor 
DEALLOCATE PhoneCursor 

回答1:

The big update will hold a transaction against the database for, potentially, a long time... locking things up and causing all kinds of havoc.

For this, I would recommend a cursor to spread that load out over a period of time.

I've also done a 'chunked' update... something like this:

DECLARE @Done bit = 0
WHILE @Done = 0
BEGIN
    UPDATE TOP(10000)
        Customers SET Phone = ‘222-‘ + SUBSTRING(Phone,5,8) 
    FROM Customers WHERE SUBSTRING(Phone,1,3) = ‘111’ 
    AND SUBSTRING(Phone,5,3) >= 500 

    IF @@ROWCOUNT = 0
    BEGIN
        SET @Done = 1
    END
END


回答2:

The cursor would be far slower in any large dataset, talk about hours vice seconds or milliseconds. What it will do is not lock up the database from other users for as long at any one time.

This is, why with a large dataset, the batch approach can be best.

In general I would try the set-based approach first and run it on off hours if need be. Then I would try the set-based approach which runs a batch of the records only if the set-based is too slow.

If you have to go the cursor running one record at time, then there is probably something massively wrong with your database design. Cursors are generally the approach of last resort. Do not use them for inserts/updates or deletes.