I have a rather big table named FTPLog with around 3 milion record I wanted to add a delete mechanism to delete old logs but delete command takes long time. I found that clustered index deleting takes long time.
DECLARE @MaxFTPLogId as bigint
SELECT @MaxFTPLogId = Max(FTPLogId) FROM FTPLog WHERE LogTime <= DATEADD(day, -10 , GETDATE())
PRINT @MaxFTPLogId
DELETE FROM FTPLog WHERE FTPLogId <= @MaxFTPLogId
I want to know how can I improve performance of deleting?
It might be slow because a large delete generates a big transaction log. Try to delete it in chunks, like:
This generates smaller transactions. And it mitigates locking issues by creating breathing space for other processes.
You might also look into partitioned tables. These potentially allow you to purge old entries by dropping an entire partition.
Check the density of your table (use command DBCC showcontig to check density) Scan Density [Best Count:Actual Count] this parameter should be closer to 100% and Logical Scan Fragmentation parameter should be closer to 0% for best performance of your table. If it is not, re-index and refragment the index of that table to improve performance of your query execution.
I assume that not only this table is huge in terms of number of rows, but also that it is really heavily used for logging new entries while you try to clean it up.
Suggestion of Andomar should help, but I would try to clean it up when there are no inserts going on.
Alternative: when you write logs, you probably do not care about the transaction isolation so much. Therefore I would change transaction isolation level for the code/processes that write the log entries so that you may avoid creating huge
tempdb
(by the way, check if tempdb grows a lot during this DELETE operation)Also, I think that deletions from the clustered index should not be really slower then from non-clustered one: you are still psysically deleting rows. Rebuilding this index afterward may take time though.
Since it's a log table, there is no need to make is clustered.
It's unlikely that you will search it on
Id
.Alter your
PRIMARY KEY
so that it's unclustered. This will useHEAP
storage method which is faster onDML
:, and just issue: