I'm working on a huge SQL code and unfortunately it has a CURSOR which handles another two nested CURSORS within it (totally three cursors inside a stored procedure), which handles millions of data to be DELETE,UPDATE and INSERT. This takes a whole lot of time because of row by row execution and I wish to modify this in to SET based approach
From many articles it shows use of CURSORs is not recommend and the alternate is to use WHILE loops instead, So I tried and replaced the three CUROSRs with three WHILE loops nothing more, though I get the same result but there is no improvement in performance, it took the same time as it took for CUROSRs.
Below is the basic structure of the code I'm working on (i Will try to put as simple as possible) and I will put the comments what they are supposed to do.
declare @projects table (
ProjectID INT,
fieldA int,
fieldB int,
fieldC int,
fieldD int)
INSERT INTO @projects
SELECT ProjectID,fieldA,fieldB,fieldC, fieldD
FROM ProjectTable
DECLARE projects1 CURSOR LOCAL FOR /*First cursor - fetch the cursor from ProjectaTable*/
Select ProjectID FROM @projects
OPEN projects1
FETCH NEXT FROM projects1 INTO @ProjectID
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
BEGIN TRAN
DELETE FROM T_PROJECTGROUPSDATA td
WHERE td.ID = @ProjectID
DECLARE datasets CURSOR FOR /*Second cursor - this will get the 'collectionDate'field from datasetsTable for every project fetched in above cursor*/
Select DataID, GroupID, CollectionDate
FROM datasetsTable
WHERE datasetsTable.projectID = @ProjectID /*lets say this will fetch ten records for a single projectID*/
OPEN datasets
FETCH NEXT FROM datasets INTO @DataID, @GroupID, @CollectionDate
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE period CURSOR FOR /*Third Cursor - this will process the records from another table called period with above fetched @collectionDate*/
SELECT ID, dbo.fn_GetEndOfPeriod(ID)
FROM T_PERIODS
WHERE DATEDIFF(dd,@CollectionDate,dbo.fn_GetEndOfPeriod(ID)) >= 0 /*lets say this will fetch 20 records for above fetched single @CollectionDate*/
ORDER BY [YEAR],[Quarter]
OPEN period
FETCH NEXT FROM period INTO @PeriodID, @EndDate
WHILE @@FETCH_STATUS = 0
BEGIN
IF EXISTS (some conditions No - 1 )
BEGIN
BREAK
END
IF EXISTS (some conditions No - 2 )
BEGIN
FETCH NEXT FROM period INTO @PeriodID, @EndDate
CONTINUE
END
/*get the appropirate ID from T_uploads table for the current projectID and periodID fetched*/
SET @UploadID = (SELECT ID FROM T_UPLOADS u WHERE u.project_savix_ID = @ProjectID AND u.PERIOD_ID = @PeriodID AND u.STATUS = 3)
/*Update some fields in T_uploads table for the current projectID and periodID fetched*/
UPDATE T_uploads
SET fieldA = mp.fieldA, fieldB = mp.fieldB
FROM @projects mp
WHERE T_UPLOADS.ID = @UploadID AND mp.ProjectID = @ProjectID
/*Insert some records in T_PROJECTGROUPSDATA table for the current projectID and periodID fetched*/
INSERT INTO T_PROJECTGROUPSDATA tpd ( fieldA,fieldB,fieldC,fieldD,uploadID)
SELECT fieldA,fieldB,fieldC,fieldD,@UploadID
FROM @projects
WHERE tpd.DataID = @DataID
FETCH NEXT FROM period INTO @PeriodID, @EndDate
END
CLOSE period
DEALLOCATE period
FETCH NEXT FROM datasets INTO @DataID, @GroupID, @CollectionDate, @Status, @Createdate
END
CLOSE datasets
DEALLOCATE datasets
COMMIT
END TRY
BEGIN CATCH
Error handling
IF @@TRANCOUNT > 0
ROLLBACK
END CATCH
FETCH NEXT FROM projects1 INTO @ProjectID, @FAID
END
CLOSE projects1
DEALLOCATE projects1
SELECT 1 as success
I request you to suggest any methods to rewrite this code to follow the SET based approach.
Until the table structure and expected result sample data is not provided, here are a few quick things I see that can be improved (some of those are already mentioned by others above):
Yes, I agree that having a SET based approach would be the fastest in most cases, however if you must store intermediate resultset somewhere, I would suggest using a temp table instead of a table variable. Temp table is 'lesser evil' between these 2 options. Here are a few reason why you should try to avoid using a table variable:
I think all the cursors code above can be simplified to something like this:
Also it would be nice to know "some conditions No" details as query can differ depends on that.
Replacing
Cursor
withWhile
blindly, is not a recommended option, hence it would not impact your performance and might even have negative impact on the performance.When you define the cursor using
Declare C Cursor
in fact you are going to create aSCROLL
cursor which specifies that all fetch options (FIRST, LAST, PRIOR, NEXT, RELATIVE, ABSOLUTE
) are available.When you need just
Fetch Next
as scroll option, you can declare the cursor asFAST_FORWARD
Here is the quote about
FAST_FORWARD
cursor in Microsoft docs:So you can declare your cursors using
DECLARE <CURSOR NAME> FAST_FORWARD FOR ...
and you will get noticeable improvements