Delete rows of data in batches

2019-07-22 19:43发布

问题:

I'm new to this so please be gentle. I need to delete millions of rows of data in two tables. When I tried this with a sql script,the log file got so big that it took up all of hard drive and did not complete. I was reading a few articles online that said, if the data was deleted in batches that the log file would not be affected in the same way. The database in currently in simple mode and is always kept that way. this is the script I use to delete the data.

Delete from EligibilityInformation Where DateEntered <= DateAdd(day,-31, getdate())
Delete from EligibilityRequestLog Where ActualPostingDate <= DateAdd(day,-31, getdate())

Can someone help me with a script that I can add as a SQL Job that will delete 10,000 rows of data at a time until all the rows have been deleted?I found the following script online. It may be more than I need. I added my SQL script to it.

DECLARE @continue INT
DECLARE @rowcount INT

SET @continue = 1
WHILE @continue = 1
BEGIN
    PRINT GETDATE()
    SET ROWCOUNT 10000
    BEGIN TRANSACTION
    Delete from EligibilityInformation Where DateEntered <= DateAdd(day,-31, getdate())
    Delete from EligibilityRequestLog Where ActualPostingDate <= DateAdd(day,-31, getdate())
    SET @rowcount = @@rowcount 
    COMMIT
    PRINT GETDATE()
    IF @rowcount = 0
    BEGIN
        SET @continue = 0
    END
END

回答1:

WHILE EXISTS(SELECT * FROM EligibilityInformation WHERE DateEntered <= DATEADD(DAY, -31, GETDATE()))
BEGIN
    PRINT GETDATE()
    DELETE (TOP 10000) FROM EligibilityInformation WHERE DateEntered <= DATEADD(DAY, -31, GETDATE())
    PRINT GETDATE()
END

WHILE EXISTS(SELECT * FROM EligibilityRequestLog WHERE ActualPostingDate <= DATEADD(DAY, -31, GETDATE()))
BEGIN
    PRINT GETDATE()
    DELETE (TOP 10000) FROM EligibilityRequestLog WHERE ActualPostingDate <= DATEADD(DAY, -31, GETDATE())
    PRINT GETDATE()
END


回答2:

I am assuming you have a primary key. This should work for you:

Delete Top 10000  
    From EligibilityInformation Where DateEntered <= DateAdd(day,-31, getdate())

Here is an article on why you might choose Top over Set Rowcount https://sqlstudies.com/2013/10/07/use-top-instead-of-set-rowcount/



回答3:

I am not at the server right now to test this syntax but this is something I use

select 1  -- to get a @@rowcount > 0 
while (@@rowcount > 0) 
begin 
   delete (top 1000) from table where ...
end  

Putting the two in a transaction is just a LOT of overhead where I see no value. At best you would roll back a single set (top). If you tired to wrap the loop in single transaction then doing it in sets adds nothing.

the script you have now is nothing more than

BEGIN TRANSACTION
Delete from EligibilityInformation Where DateEntered <= DateAdd(day,-31, getdate())
Delete from EligibilityRequestLog Where ActualPostingDate <= DateAdd(day,-31, getdate())
COMMIT