-->

C# & SQL Server - best way to delete multiple Rows

2020-07-22 16:43发布

问题:

I know there's many of the same subjected question here in SO, my question is if I want to delete say around 1K rows rather few, given a List<int> of RecordID, I could avoid using a DataTable, and send the list translated into a statement:

string ParmRecordsToDelete_CsvWhereIN = "("
for(int CurIdx=0; CurIdx < RecIdsToDelete.Count; CurIdx++)
{
  ParmRecordsToDelete_CsvWhereIN += RecIdsToDelete[CurIdx] + ", ";
  //this method to create passed parameter 
  //logic to remove on last Coma on last Index..

  //or use stringJoin and somehow remove the last coma 
}
ParRecordsToDelete_CsvWhereIN +=")";

This will create something Like "('1','2','3'......)"

and then create a SqlCommand to call a stored procedure:

Delete * From @TblName WHERE @ColName IN @RecordsToDeleteCsvWhereIN

Is this an efficient approach? Is there a limit to the length of single parameter? I guess it's the N/VARCHAR(MAX) length.

I guess if it's not a kinda hackish solution it would not be limited with length...

What would be the best fast solution, or am I on the right track as it is?

回答1:

You could use table valued parameters to hand this. The application layer would look something like

C#

var tvp = new DataTable();
tvp.Columns.Add("Id", typeof(int));

foreach(var id in RecIdsToDelete)
    tvp.Rows.Add(new {id});

var connection = new SqlConnection("your connection string");

var delete = new SqlCommand("your stored procedure name", connection)
{
  CommandType = CommandType.StoredProcedure
};

delete
  .Parameters
  .AddWithValue("@ids", tvp)
  .SqlDbType = SqlDbType.Structured;

delete.ExecuteNonQuery();

SQL

IF NOT EXISTS(SELECT * FROM sys.table_types WHERE name = 'IDList')
BEGIN
    CREATE TYPE IDList AS TABLE(ID INTEGER)
END


CREATE PROCEDURE School.GroupStudentDelete
(                                         
        @IDS IDLIST READONLY      
)                                         
AS

SET NOCOUNT ON;

BEGIN TRY
        BEGIN TRANSACTION

        DECLARE @Results TABLE(id INTEGER)

        DELETE 
        FROM TblName 
        WHERE Id IN (SELECT ID FROM @IDS)        

        COMMIT TRANSACTION
END TRY
BEGIN CATCH
        PRINT ERROR_MESSAGE();

        ROLLBACK TRANSACTION
        THROW; -- Rethrow exception
END CATCH
GO

There are a number of advantages to this approach over building strings

  • You avoid creating queries in the application layer, creating a seperation of concerns
  • You can more easily test execution plans and optimize queries
  • You are less vulnerable to SQL injection attacks, since your given approach would not be able to use a paramaterized query to build the IN clause
  • The code is more readable and illustrative
  • You do not end up building excessively long strings

Performance

There are some considerations about the performance of TVPs on large datasets.

Because TVPs are variables, they do not compile statistics. This means the query optimizer can fudge the execution plan sometimes. If this happens there a couple options :

  • set OPTION (RECOMPILE) on any TVP statements where indexing is an issue
  • write the TVP into a local temp and setup the indexing there

Here is a great article on TVP's with a good section on performance considerations, and what to expect when.

So if you are worried about hitting limits on string parameters, the table valued parameters might be the way to go. But in the end, it is hard to say without knowing more about the data set you are working with.



回答2:

IN @Parameter is not an option, such thing does not work. You can hard wire the ids into IN (1,2,3,4...), but that is bad. I measured some lightweight select of 30000 ids:

TVP: 339ms first run, 319ms second run
hard wired: 67728ms first run, 42ms second run

As You see, when SQL server has to parse enormous string, it takes really long. On second run the query plan can be taken from execution plan cache, unfortunately with big id ranges this is extremely unlikely. It just wastes execution plan cache.

TVP can scale into milion of ids without problem, hard wired string causes sql server to fail the query for less than 100000 ids. It has nothing to do with string max length, it just can not process it.

Btw. if You build string like this, use StringBuilder or string.Join, appending to string in loop is very inefficient.



回答3:

A better approach would be to use a table-valued parameter. You have to define a type for the parameter, but it's more efficient than specifying values in a string, especially numeric or date values because the server doesn't have to parse the string to get out the individual values.

I'm not sure what your ID type is, but if it's 'BIGINT', for example:

IF NOT EXISTS (SELECT * FROM dbo.systypes WHERE name='IDList')
    CREATE TYPE IDList AS TABLE (Id BIGINT);
GO

To initialize the type, then to create a stored procedure using it, something like this:

IF NOT EXISTS (SELECT * FROM dbo.sysobjects WHERE name='DeleteMultipleRecords')
   EXECUTE sp_executesql N'CREATE PROCEDURE DeleteMultipleRecords AS BEGIN SET NOCOUNT ON; END'
GO
ALTER PROCEDURE [dbo].[DeleteMultipleRecords]
   @IDs IDList READONLY
AS
BEGIN
   SET NOCOUNT ON

    DELETE FROM [Table] WHERE Id IN (SELECT Id FROM @IDs)
END

You can also use it with dynamic SQL from C#.