Bulk insert from C# list into SQL Server into mult

2020-08-01 12:48发布

问题:

I am completely clueless with this problem, Any help would be highly appreciated:

I have two tables, one is the master data table (Table A), the other table (Table B) has a foreign key relationship with multiple entries (to be specific 18) for one entry in Table A.

I am getting the data in a list and wish to insert it in SQL Server database.

I am currently using the below pattern but is taking 14 minutes for inserting 100 rows in Table A and corresponding 18*100 rows in Table B.

using (SqlConnection conn = new SqlConnection(conStr))
{
    foreach (var ticket in Tickets)
    {
        sql = string.Format(@"INSERT INTO dbo.Tickets([ColumnA], [ColumnB] ,..." + @")
                              VALUES(@ColumnA, @ColumnB,@ColumnC, @ColumnD, .... +
                            @"SELECT SCOPE_IDENTITY();");

        using (cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@ColumnA", (object)ticket.Id ?? DBNull.Value);
            cmd.Parameters.AddWithValue("@ColumnB", (object)ticket.Address ?? DBNull.Value);
            cmd.Parameters.AddWithValue("@ColumnC", (object)ticket.Status?? DBNull.Value);
            ....

            conn.Open();
            TableA_TicketId = Convert.ToInt32(cmd.ExecuteScalar());
        }
    }
} 

I use SCOPE_IDENTITY() to get the latest identity from table A for each record inserted and use it for insertion in second table

sql = string.Format(@"INSERT INTO Tickets_Fields ([TableA_TicketId], [FieldName], [Key],[Value]) 
                      VALUES (@TableA_TicketId, @FieldName, @Key, @Value);");

using (cmd = new SqlCommand(sql, conn))
{
    foreach (var customField in ticket.CustomFields)
    {
        cmd.Parameters.Clear();
        cmd.Parameters.AddWithValue("@TableA_TicketId", (object)TicketId ?? DBNull.Value);
        cmd.Parameters.AddWithValue("@FieldName", (object)"CustomField" ?? DBNull.Value);
        ...
        cmd.ExecuteNonQuery();
    }
}

conn.Close();

Please suggest if I can improve the performance of this code by any means. Or is their any better/faster way of doing it?

回答1:

Some ideas:

  1. Keep the same connection open during the whole batch insertion. Open it at the beginning, then only close it when you're done.

  2. Don't recreate the SqlCommands during each loop iteration. Create them once at the very beginning, then only update the parameters' values: cmd.Parameters["@x"].Value = …;.

  3. You're inserting into the 2nd table (B) via a foreach loop that inserts single records. You could look into replacing this with a single INSERT INTO TableB (x, y, z) SELECT x, y, z FROM @tvp, where @tvp is a table-valued parameter. Essentially, this means that you can populate e.g. a DataTable with rows you want to insert into the 2nd table, then pass that DataTable over as @tvp. TVPs are supported from SQL Server 2008 onwards, IIRC. Setting one of these up takes a little study the first time.

    (I'm not quite certain if the above INSERT statement would actually work, or whether TVPs only work as parameters to stored procedures (see e.g. this example).)

  4. Going further than #3, move the insertions into tables A and B into a DB stored procedure. This SP would have as parameters the values that go into table A, as well as a table-valued parameter with the records that go into table B.



回答2:

SqlBulkCopy is your friend

using System;
using System.Data;
using System.Data.SqlClient;

namespace SqlBulkInsertExample
{
class Program
{
  static void Main(string[] args)
  {
        DataTable prodSalesData = new DataTable("ProductSalesData");

        // Create Column 1: SaleDate
        DataColumn dateColumn = new DataColumn();
        dateColumn.DataType = Type.GetType("System.DateTime");
        dateColumn.ColumnName = "SaleDate";

        // Create Column 2: ProductName
        DataColumn productNameColumn = new DataColumn();
        productNameColumn.ColumnName = "ProductName";

        // Create Column 3: TotalSales
        DataColumn totalSalesColumn = new DataColumn();
        totalSalesColumn.DataType = Type.GetType("System.Int32");
        totalSalesColumn.ColumnName = "TotalSales";

        // Add the columns to the ProductSalesData DataTable
        prodSalesData.Columns.Add(dateColumn);
        prodSalesData.Columns.Add(productNameColumn);
        prodSalesData.Columns.Add(totalSalesColumn);

        // Let's populate the datatable with our stats.
        // You can add as many rows as you want here!

        // Create a new row
        DataRow dailyProductSalesRow = prodSalesData.NewRow();
        dailyProductSalesRow["SaleDate"] = DateTime.Now.Date;
        dailyProductSalesRow["ProductName"] = "Nike";
        dailyProductSalesRow["TotalSales"] = 10;

        // Add the row to the ProductSalesData DataTable
        prodSalesData.Rows.Add(dailyProductSalesRow);

        // Copy the DataTable to SQL Server using SqlBulkCopy
        using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=dbProduct;Integrated Security=SSPI;Connection Timeout=60;Min Pool Size=2;Max Pool Size=20;"))
        {
            dbConnection.Open();
            using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
            {
                s.DestinationTableName = prodSalesData.TableName;

                foreach (var column in prodSalesData.Columns)
                    s.ColumnMappings.Add(column.ToString(), column.ToString());

                s.WriteToServer(prodSalesData);
            }
        }
    }
}
}

Notice that per default it will lock the table until it is done, meaning that anyone else working on the site will be unable to write to the same table.

To get around that, you can set the SqlBulkCopy.BatchSize, but then you have to take notice that if your import fails, you are responsible for deleting the already committed rows.



回答3:

You should use SqlTransaction or TransactionScope to ensure insert in both tables has succeed.

Grab Max(id) from Table A. Insert records in Table A using something similar to this:

using (var connection = new SqlConnection(ConfigurationManager.ConnectionStrings["SomeConnectionString"].ConnectionString))
    {
         connection.Open();
         SqlTransaction transaction = connection.BeginTransaction();

         using (var bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.Default, transaction))
         {
            bulkCopy.BatchSize = 100;
            bulkCopy.DestinationTableName = "dbo.Person";
            try
            {
                bulkCopy.WriteToServer(listPerson.AsDataTable());
            }
            catch (Exception)
            {
                transaction.Rollback();
                connection.Close();
            }
          }

          transaction.Commit();
    }

Then insert records into table B. You will know from which ID you need to count ID's because you have selected Max(id) before insert.

Refer to this article for complete example of BulkInsert with minimum lines of code.