Libraries for ADO.NET to rapidly bulk insert data

Posted 2019-01-14 04:20

I'd like to know if you can recommend any advanced ADO.NET libraries for working with databases.

I've discovered that LINQ-to-Entities is great for pulling data out of databases, but not at all useful for inserting data into them. It's missing functionality such as fast bulk inserts, culling of duplicates, and most of the advanced capabilities you can achieve with pure SQL.

So: can you recommend some ADO.NET libraries that offer the sorts of advanced functionality that LINQ-to-Entities is missing?

2 Answers
别忘想泡老子 · answered 2019-01-14 04:55

The ADO.NET SqlBulkCopy class enables fast bulk insertion of records into a table:

    DataTable dt = s_EmptyUploadTable.Copy();
    foreach (var itm in yourList) {
        DataRow row = dt.NewRow();
        row["Field1"] = itm.Field1;
        row["Field2"] = itm.Field2;
        dt.Rows.Add(row);
    }
    using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
        cn.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
            bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
            bulkCopy.WriteToServer(dt);
        }
        // No explicit cn.Close() needed -- disposing the connection via "using" closes it.
    }
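If the DataTable's column names don't match the destination table exactly, SqlBulkCopy will not map them for you. A minimal sketch of explicit column mapping, assuming hypothetical destination column names (TargetColumn1/TargetColumn2 are made up for illustration):

    using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
        cn.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
            bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
            // Map source DataTable columns to differently named destination columns.
            bulkCopy.ColumnMappings.Add("Field1", "TargetColumn1");
            bulkCopy.ColumnMappings.Add("Field2", "TargetColumn2");
            bulkCopy.BatchSize = 5000;     // Send rows in batches rather than one giant batch.
            bulkCopy.BulkCopyTimeout = 60; // Seconds before the operation times out.
            bulkCopy.WriteToServer(dt);
        }
    }

Note that once any mapping is added, every column you want copied must be mapped explicitly.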
爷的心禁止访问 · answered 2019-01-14 05:12

The LINQ Entity Data Reader library lets you write an IEnumerable list to a database, using SQL Bulk Copy behind the scenes. Because the result of any LINQ query is IEnumerable, you can use this library to bulk upload the results of a LINQ query straight into the database.

As there are LINQ-to-everything adapters, you can do tricks like using the LINQ to CSV library to grab the data out of a .csv file with a LINQ query, then using the LINQ Entity Data Reader to bulk-write that data directly into the database.

Case study:

Problem: read a .csv file quickly into a database. The connection to the SQL database is via LINQ-to-Entities from C#.

Solution 1: Use the LINQ to CSV library, construct a LINQ query to pull out the data you want, then write it in using the standard LINQ-to-Entities calls (ctx.AddObject(), ctx.SaveChanges(), etc.). Time taken: 30 seconds for 20,000 records, as LINQ-to-Entities ends up generating a query for every single record (slooooow!!!!!).
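For reference, Solution 1 looks roughly like this. The context and entity names (PhiEngineEntities, StagingRow, "StagingRows") are hypothetical, and this cannot run without a live database; it is a sketch of the slow path, not a definitive implementation:

// Rough sketch of Solution 1: per-record inserts via the legacy ObjectContext API.
using (var ctx = new PhiEngineEntities()) // Hypothetical LINQ-to-Entities context.
{
  foreach (var row in csvChains)
  {
    var entity = new StagingRow { Symbol = row.Symbol, Date = row.Date };
    ctx.AddObject("StagingRows", entity); // Queue one entity per .csv record.
  }
  ctx.SaveChanges(); // Still issues one INSERT statement per record -- hence 30 seconds.
}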

Solution 2: Use LINQ to CSV library, construct a LINQ query to pull out the data you want into an IEnumerable, use LINQ Entity Data Reader to bulk write this data directly into the target data table. Time taken: 3 seconds for 20,000 records.

Solution 3: Use a stored procedure with SQL BULK INSERT. Time taken: 2 seconds for 20,000 records. However, this solution is quite brittle: it relies on a stored procedure, and SQL bulk copy is just not compatible with some .csv file formats. This method also requires a staging table between the actual target table and the .csv file, to deal with file formatting issues and to help with normalization.
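A minimal sketch of the Solution 3 approach, issuing a T-SQL BULK INSERT from C#. The table name, file path, and connection string are illustrative assumptions, and BULK INSERT reads the file on the server, so the path must be visible to SQL Server itself:

// Rough sketch of Solution 3: server-side BULK INSERT into a staging table.
using (var cn = new SqlConnection(dbSqlConnectionString))
{
  cn.Open();
  string sql = @"BULK INSERT dbo.TStagingTable
                 FROM 'T:\x.csv'
                 WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);";
  using (var cmd = new SqlCommand(sql, cn))
  {
    cmd.ExecuteNonQuery(); // Afterwards, copy/normalize from the staging table into the real target.
  }
}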

And, here is the source code for solution #2:

static void WriteCSVtoSQLtable()
{
  // Step 1: Read .csv file into IEnumerable using LINQ-to-CSV class.

  // This section requires "LINQtoCSV" class as described at http://www.codeproject.com/KB/linq/LINQtoCSV.asp

  string inputFilePath = @"T:\x.csv";

  CsvFileDescription inputFileDescription = new CsvFileDescription
  {
    SeparatorChar = ',',
    FirstLineHasColumnNames = true
  };

  CsvContext cc = new CsvContext(); // The LINQtoCSV reader object (was missing from the original snippet).
  IEnumerable<MyCustomColumnMappingClass> csvChains = cc.Read<MyCustomColumnMappingClass>(inputFilePath, inputFileDescription);

  // Step 2: Now write into the target table on SQL Server.

  // This section requires "EntityDataReader" class described at http://archive.msdn.microsoft.com/LinqEntityDataReader.
  string dbSqlConnectionString = @"Data Source=(local);Initial Catalog=PhiEngine;Integrated Security=True;MultipleActiveResultSets=True";

  SqlConnection dbSql = new SqlConnection(dbSqlConnectionString);
  dbSql.Open(); // The connection must be open before calling BeginTransaction().

  using (var tran = dbSql.BeginTransaction())
  {

    var csvFile = from p in csvChains
                select p;

    SqlBulkCopy bc = new SqlBulkCopy(dbSql,
      SqlBulkCopyOptions.CheckConstraints |
      SqlBulkCopyOptions.FireTriggers |
      SqlBulkCopyOptions.KeepNulls, tran)
                     {
                       BatchSize = 1000,
                       DestinationTableName = "TStagingTable" // Temporary staging table in database.
                     };

    bc.WriteToServer(csvFile.AsDataReader()); // Extension method .AsDataReader depends on adding the EntityDataReader class to your C# project (see above).

    tran.Commit();
  }

  dbSql.Close(); // Release the connection now that the bulk copy has committed.
}

// This class is used by LINQ to CSV to query the .csv file, see "LINQtoCSV" website.
public class MyCustomColumnMappingClass
{
  [CsvColumn(Name = "symbol", FieldIndex = 1)]
  public string Symbol { get; set; }

  [CsvColumn(Name = "date", FieldIndex = 3, OutputFormat = @"MM/dd/yyyy")]
  public DateTime Date { get; set; }
}