Bulk delete in Entity Framework

Posted 2020-02-12 09:28

Question:

I'd like to bulk delete records from a table using LINQ. There's a post that describes how to do it: Bulk-deleting in LINQ to Entities

var query = from c in ctx.Customers
            where c.SalesPerson.Email == "..."
            select c;

query.Delete();

But the method "Delete" doesn't exist on my query variable.
Furthermore, the method "SubmitChanges" doesn't exist on my context.

Answer 1:

There is an interesting NuGet package that lets you do batch deletes and updates:



Answer 2:

There is currently no bulk delete built into Entity Framework. It's actually one of the features being discussed on CodePlex now that EF is open source.

EntityFramework.Extended provides batch delete support (it is available on NuGet); however, in my experience it has some performance issues.



Answer 3:

This code adds a simple extension method to any DbContext that bulk deletes all data in every table referenced by the Entity Framework query you provide. It works by extracting all table names involved in the query and attempting to delete their data by issuing a "DELETE FROM tablename" SQL statement, which is common across most types of database.

To use, simply do this:

myContext.BulkDelete(x => x.Things);

which will delete everything in the table linked to the Things entity store.
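The table-name extraction described above can be tried on its own, without a database. The following is a minimal sketch, assuming the generated SQL uses SQL Server style bracketed `[schema].[table]` names followed by an `AS` alias; the class name `TableNameExtractor` is made up for illustration, and the pattern is case-sensitive, so lower-case `from` or `join` would not match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Entity Framework.
public static class TableNameExtractor
{
    // A bracketed [schema].[table] name that follows FROM or JOIN
    // and is itself followed by an alias introduced with AS.
    private static readonly Regex FindTables =
        new Regex(@"(?:FROM|JOIN)\s+(\[\w+\]\.\[\w+\])\s+AS");

    // Returns the distinct table names mentioned in the SQL text.
    public static List<string> Extract(string sql)
    {
        return FindTables.Matches(sql)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .Distinct()
            .ToList();
    }
}
```

Passing a statement such as `SELECT ... FROM [dbo].[Things] AS [Extent1] INNER JOIN [dbo].[Owners] AS [Extent2] ON ...` to `TableNameExtractor.Extract` should give back `[dbo].[Things]` and `[dbo].[Owners]`. Any filter in the query plays no part here: only the table names are kept.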

The code:

using System.Linq;
using System.Text.RegularExpressions;

namespace System.Data.Entity {

    public static class DbContextExtensions {

        /// <summary>
        /// Efficiently deletes all data from any database tables used by the specified entity framework query. 
        /// </summary>
        /// <typeparam name="TContext">The DbContext Type on which to perform the delete.</typeparam>
        /// <typeparam name="TEntity">The Entity Type to which the query resolves.</typeparam>
        /// <param name="ctx">The DbContext on which to perform the delete.</param>
        /// <param name="deleteQuery">The query that references the tables you want to delete.</param>
        public static void BulkDelete<TContext, TEntity>(this TContext ctx, Func<TContext, IQueryable<TEntity>> deleteQuery) where TContext : DbContext {

            var findTables = new Regex(@"(?:FROM|JOIN)\s+(\[\w+\]\.\[\w+\])\s+AS");
            var qry = deleteQuery(ctx).ToString();

            // Get list of all tables mentioned in the query
            var tables = findTables.Matches(qry).Cast<Match>().Select(m => m.Result("$1")).Distinct().ToList();

            // Loop through all the tables, attempting to delete each one in turn
            var max = 30;
            var exception = (Exception)null;
            while (tables.Any() && max-- > 0) {

                // Get the next table
                var table = tables.First();

                try {
                    // Attempt the delete
                    ctx.Database.ExecuteSqlCommand(string.Format("DELETE FROM {0}", table));

                    // Success, so remove table from the list
                    tables.Remove(table);

                } catch (Exception ex) {
                    // Error, probably due to dependent constraint, save exception for later if needed.                    
                    exception = ex;

                    // Push the table to the back of the queue
                    tables.Remove(table);
                    tables.Add(table);
                }
            }

            // If an error has occurred and cannot be resolved by deleting in a different
            // order, then rethrow the exception and give up.
            if (tables.Any() && exception != null) throw exception;

        }
    }
}
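The retry-by-rotation idea in the loop above can be seen in isolation with a stand-in for the database call. This is only a sketch: `DeleteOrdering`, `DeleteAll` and the `tryDelete` callback are hypothetical names, and the callback is assumed to throw when a delete is blocked (for example by a foreign key constraint).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that isolates the ordering logic.
public static class DeleteOrdering
{
    // Calls tryDelete for each table. A table whose delete throws is moved
    // to the back of the queue and retried later, up to maxAttempts calls.
    // Returns the tables in the order they were actually deleted.
    public static List<string> DeleteAll(
        IEnumerable<string> tables, Action<string> tryDelete, int maxAttempts = 30)
    {
        var queue = new Queue<string>(tables);
        var deleted = new List<string>();
        Exception last = null;

        while (queue.Count > 0 && maxAttempts-- > 0)
        {
            var table = queue.Dequeue();
            try
            {
                tryDelete(table);
                deleted.Add(table);
            }
            catch (Exception ex)
            {
                // Probably blocked by a dependent table: remember why, retry later.
                last = ex;
                queue.Enqueue(table);
            }
        }

        // Fail only if some table could never be deleted.
        if (queue.Count > 0)
            throw new InvalidOperationException(
                "Could not delete: " + string.Join(", ", queue), last);

        return deleted;
    }
}
```

With two tables where "Customers" cannot be emptied while "Orders" still has rows, starting from the order Customers, Orders ends up deleting Orders first and Customers second. The attempt cap keeps a genuinely unresolvable constraint from looping forever.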


Answer 4:

I do it like this, which seems to work fine. Please let me know if there is a reason why this is bad practice in any way.

        var customersToDelete = await ctx.Customers.Where(c => c.Email == "...").ToListAsync();

        foreach (var customerToDelete in customersToDelete)
        {
            ctx.Customers.Remove(customerToDelete);
        }

        await ctx.SaveChangesAsync();


Answer 5:

I was experiencing the same problem, with EF executing thousands of DELETE queries after the SaveChanges call. I wasn't sure that the commercial EntityFramework.Extensions library would help me, so I decided to implement bulk DELETE myself and came up with something similar to BG100's solution!

public async Task<List<TK>> BulkDeleteAsync(List<TK> ids)
{
    if (ids.Count < 1) {
        return new List<TK>();
    }

    // NOTE: DbContext here is a property of Repository Class
    // SOURCE: https://stackoverflow.com/a/9775744
    var tableName = DbContext.GetTableName<T>();
    // NOTE: Be aware that 'Id' column naming depends on your project conventions.
    // The ids are concatenated into the SQL text as-is, so this is only safe
    // for numeric keys; string keys would need quoting or parameters.
    var sql = $@"
        DELETE FROM {tableName}
        OUTPUT Deleted.Id
        WHERE Id IN({string.Join(", ", ids)});
    ";
    return await DbContext.Database.SqlQuery<TK>(sql).ToListAsync();
}

If you have something like a generic repository, that should work for you just fine. At the least, you could try to fit it into your EF infrastructure.
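Because the `IN (...)` list above is built by joining the ids straight into the SQL text, non-numeric keys would need quoting. One possible alternative is to generate placeholders and pass the ids as parameters. The sketch below only builds the statement and the value array; `DeleteSqlBuilder` and `Build` are hypothetical names, and the `@p0`, `@p1`, ... naming assumes the positional-parameter convention of EF6's `Database.SqlQuery<T>(sql, params object[] parameters)`. The table name is still inserted as text, so it must come from a trusted source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; it does not talk to the database itself.
public static class DeleteSqlBuilder
{
    // Builds "DELETE ... WHERE Id IN (@p0, @p1, ...)" together with the
    // matching values, so ids travel as parameters rather than SQL text.
    // Assumes an 'Id' key column, as in the answer.
    public static (string Sql, object[] Parameters) Build<TKey>(
        string tableName, IReadOnlyList<TKey> ids)
    {
        if (ids == null || ids.Count == 0)
            throw new ArgumentException("At least one id is required.", nameof(ids));

        // One placeholder per id: @p0, @p1, ...
        var placeholders = string.Join(", ",
            Enumerable.Range(0, ids.Count).Select(i => "@p" + i));

        var sql = "DELETE FROM " + tableName +
                  " OUTPUT Deleted.Id WHERE Id IN (" + placeholders + ");";

        return (sql, ids.Cast<object>().ToArray());
    }
}
```

Under those assumptions the result could be handed over as `DbContext.Database.SqlQuery<TK>(built.Sql, built.Parameters)`. SQL Server limits a single command to 2100 parameters, so very large id lists would still need to be split up.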

I also tweaked it a bit more so that it executes the query over multiple chunks of entities. That helps if your DB places any restriction on query size.

const int ChunkSize = 1024;
public async Task<List<TK>> BulkDeleteAsync(List<TK> ids)
{
    if (ids.Count < 1) {
        return new List<TK>();
    }
    // SOURCE: https://stackoverflow.com/a/20953521/11344051
    var chunks = ids.Chunks(ChunkSize).Select(c => c.ToList()).ToList();
    var tableName = DbContext.GetTableName<T>();

    var deletedIds = new List<TK>();
    foreach (var chunk in chunks) {
        var sql = $@"
            DELETE FROM {tableName}
            OUTPUT Deleted.Id
            WHERE Id IN({string.Join(", ", chunk)});
        ";
        deletedIds.AddRange(await DbContext.Database.SqlQuery<TK>(sql).ToListAsync());
    }
    return deletedIds;
}
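The `Chunks` call above is not a built-in LINQ method; it comes from the linked Stack Overflow answer. A stand-in with the same shape might look like the sketch below. It is an assumption about what such a helper does, not a copy of the linked code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Chunks extension used in the answer.
public static class EnumerableChunkExtensions
{
    // Splits a sequence into consecutive lists of at most `size` items.
    public static IEnumerable<List<T>> Chunks<T>(this IEnumerable<T> source, int size)
    {
        // Validate eagerly; the iterator below only runs when enumerated.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return ChunksIterator(source, size);
    }

    private static IEnumerable<List<T>> ChunksIterator<T>(IEnumerable<T> source, int size)
    {
        var chunk = new List<T>(size);
        foreach (var item in source)
        {
            chunk.Add(item);
            if (chunk.Count == size)
            {
                yield return chunk;
                chunk = new List<T>(size);
            }
        }

        // The last chunk may be shorter than `size`.
        if (chunk.Count > 0) yield return chunk;
    }
}
```

With a chunk size of 2, the ids 1 to 5 are split into the groups (1, 2), (3, 4) and (5). On .NET 6 and later, the built-in `Enumerable.Chunk` covers the same need.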