I'm writing a process that archives rows from a SQL Server table based on a datetime column. I want to move all the rows with a date before X, but the problem is that there are millions of rows for each date, so doing a BEGIN TRANSACTION...INSERT...DELETE...COMMIT for each date takes too long, and locks up the database for other users.
Is there a way that I can do it in smaller chunks? Maybe using ROWCOUNT or something like that?
I'd originally considered something like this:
SET ROWCOUNT 1000
DECLARE @RowsLeft DATETIME
DECLARE @ArchiveDate DATETIME
SET @ROWSLEFT = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)
WHILE @ROWSLEFT IS NOT NULL
BEGIN
INSERT INTO EventsBackups
SELECT top 1000 * FROM Events
DELETE Events
SET @ROWSLEFT = (SELECT TOP 1 dtcol FROM Events WHERE dtcol <= @ArchiveDate)
END
But then I realized that I can't guarantee that the rows I'm deleting are the ones I just backed up. Or can I...?
UPDATE: Another options I'd considered was adding a step:
- SELECT TOP 1000 rows that meet my date criteria into a temp table
- Begin Transaction
- Insert from temp table into archive table
- Delete from source table, joining to temp table across every column
- Commit transaction
- Repeat 1-5 until no rows remain that meet the date criteria
Does anybody have an idea for how the expense of this series might compare to some of the other options discussed below?
DETAIL: I'm using SQL 2005, since somebody asked.
use a INSERT with an OUTPUT INTO clause to store the IDs of the inserted rows, then DELETE joining to this temp table to remove only those IDs
How about don't do it all at once?
Then later,
or the equivalent.
Nothing you've said so far suggests you need a transaction.