How to efficiently delete rows while NOT using Tru

2019-01-10 17:58发布

Let's say we have table Sales with 30 columns and 500,000 rows. I would like to delete 400,000 in the table (those where "toDelete='1'").

But I have a few constraints :

  • the table is read / written "often" and I would not like a long "delete" to take a long time and lock the table for too long
  • I need to skip the transaction log (like with a TRUNCATE) but while doing a "DELETE ... WHERE..." (I need to put a condition), but haven't found any way to do this...

Any advice would be welcome to transform a

DELETE FROM Sales WHERE toDelete='1'

to something more partitioned & possibly transaction log free.

7条回答
啃猪蹄的小仙女
2楼-- · 2019-01-10 17:59

You should try to give it a ROWLOCK hint so it will not lock the entire table. However, if you delete a lot of rows lock escalation will occur.

Also, make sure you have a non-clustered filtered index (only for 1 values) on the toDelete column. If possible make it a bit column, not varchar (or what it is now).

DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'

Ultimately, you can try to iterate over the table and delete in chunks.

Updated

Since while loops and chunk deletes are the new pink here, I'll throw in my version too (combined with my previous answer):

SET ROWCOUNT 100
DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'

WHILE @@rowcount > 0
BEGIN
  SET ROWCOUNT 100
  DELETE FROM Sales WITH(ROWLOCK) WHERE toDelete='1'  
END
查看更多
劫难
3楼-- · 2019-01-10 18:03

My own take on this functionality would be as follows. This way there is no repeated code and you can manage your chunk size.

DECLARE @DeleteChunk INT = 10000
DECLARE @rowcount INT = 1

WHILE @rowcount > 0
BEGIN

  DELETE TOP (@DeleteChunk) FROM Sales WITH(ROWLOCK)

  SELECT @rowcount = @@RowCount
END
查看更多
We Are One
4楼-- · 2019-01-10 18:06

Calling DELETE FROM TableName will do the entire delete in one large transaction. This is expensive.

Here is another option which will delete rows in batches :

deleteMore:
DELETE TOP(10000) Sales WHERE toDelete='1'
IF @@ROWCOUNT != 0
    goto deleteMore
查看更多
Rolldiameter
5楼-- · 2019-01-10 18:10

As I assume the best way to delete huge amount of records is to delete it by Primary Key. (What is Primary Key see here)

So you have to generate tsql script that contains the whole list of lines to delete and after this execute this script.

For example code below is gonna generate that file

GO
SET NOCOUNT ON

SELECT   'DELETE FROM  DATA_ACTION WHERE ID = ' + CAST(ID AS VARCHAR(50)) + ';' + CHAR(13) + CHAR(10) + 'GO'
FROM    DATA_ACTION
WHERE  YEAR(AtTime) = 2014

The ouput file is gonna have records like

DELETE FROM  DATA_ACTION WHERE ID = 123;
GO
DELETE FROM  DATA_ACTION WHERE ID = 124;
GO
DELETE FROM  DATA_ACTION WHERE ID = 125;
GO

And now you have to use SQLCMD utility in order to execute this script.

sqlcmd -S [Instance Name] -E -d [Database] -i [Script]

You can find this approach explaned here https://www.mssqltips.com/sqlservertip/3566/deleting-historical-data-from-a-large-highly-concurrent-sql-server-database-table/

查看更多
孤傲高冷的网名
6楼-- · 2019-01-10 18:14

One way I have had to do this in the past is to have a stored procedure or script that deletes n records. Repeat until done.

DELETE TOP 1000 FROM Sales WHERE toDelete='1'
查看更多
smile是对你的礼貌
7楼-- · 2019-01-10 18:15

What you want is batch processing.

While (select Count(*) from sales where toDelete =1) >0
BEGIN
Delete from sales where SalesID in
(select top 1000 salesId from sales where toDelete = 1)
END

Of course you can experiment which is the best value to use for the batch, I've used from 500 - 50000 depending on the table. If you use cascade delete, you will probably need a smaller number as you have those child records to delete.

查看更多
登录 后发表回答