Would it be best to call context.SaveChanges only

2019-09-03 22:49发布

问题:

I'm semi new to EF and I'm converting a Windows Service App over to using EF. It will read a text file with 11k employees in it and look to see if a record exists or not and either do an insert or update.

Currently I'm looping through each line, create a new entity object with the values, pass that to a data class and in there it will look up the entity to see if it will insert or update. Thing is I'm calling the context.SaveChanges() each record.

Now I'm wondering if I should create a list of entities for all employees. Then I would send that list into the data class, loop through the list, set the modified state or attach and then do the SaveChanges after I've gone through the whole list. Would that work and would it only call an insert/update one time or would it still a query 11k times for all the records?

Just looking for a fast way to do this.

foreach (string line in lines)
{
    if (line.Trim().Length > 0)
    {
        SAP_DATES dateRecord = new SAP_DATES();
        dateRecord.EMP = line.Substring(0, 8);
        dateRecord.DATE_TYPE = line.Substring(10, 20);
        dateRecord.DATE_VALUE = Convert.ToDateTime(line.Substring(30, 10));
        dateRecord.DATE_MODIFIED = Convert.ToDateTime(line.Substring(40, 10)); 

        Data.DataManager.ProcessDateRecord(dateRecord);     
    }
}

This is in my Data Class

public static void ProcessDateRecord(SAP_DATES dateRecord)
{
    using (DataContext db = new DataContext())
    {
        var original = db.SAP_DATES.Find(dateRecord.EMP, dateRecord.DATE_TYPE);

        if (original != null)
        {
            db.Entry(original).CurrentValues.SetValues(dateRecord);                    
        }
        else
        {
            db.SAP_DATES.Add(dateRecord);
        }
        db.SaveChanges();
    }     
}

回答1:

Both technique will be very slow but using SaveChanges every X records may improve the performance by a little bit.

Why is slow?

  • The Find method perform a database round trip for every record
  • The “Add” or “Update” perform a database round trip for every record

So if you have 11k records, both of your solution will make 22k database round trip which is insane.

Disclaimer: I'm the owner of the project Entity Framework Extensions

If you are really looking for the faster way to Upsert records, this library will improve dramatically your performance by 2000% and more!

The merge statements will update existing records and insert new records.

using (DataContext db = new DataContext())
{
    // Using EF Key
    db.BulkMerge(list);

    // Using a Custom Key
    db.BulkMerge(list, ctx.BulkMerge(list, 
          operation => operation.ColumnPrimaryKeyExpression = date => 
                       new { date.EMP, date.DATE_TYPE });
}