Using SqlBulkCopy - Significantly larger tables in SQL Server 2016 than in SQL Server 2014

Posted 2019-05-21 02:57

I have an application that uses SqlBulkCopy to move data into a set of tables. Recently, users running SQL Server 2016 have reported their hard drives being filled by very large databases (that should not be that large). This problem does not occur on SQL Server 2014. Upon inspection, running TableDataSizes.sql (script attached) showed large amounts of space in the UnusedSpaceKB column.

I would like to know whether there is a bug in SQL Server 2016, or whether our use of SqlBulkCopy is clashing with a new feature. I note that there have been changes to page allocation in SQL Server 2016. In general: what is causing this?

Steps to Reproduce. Note: the below describes the situation I am seeing with non-essential information removed. I am not actually storing thousands of bare timestamps in a database table (the other columns have been removed).

  1. Create a database in SQL (mine was called TestDB)
  2. Create a table in that DB (using script as below)

    USE [TestDB]
    GO
    
    /****** Object:  Table [dbo].[2017_11_03_DM_AggregatedPressure_Data]    Script Date: 07/11/2017 10:30:36 ******/
    SET ANSI_NULLS ON
    GO
    
    SET QUOTED_IDENTIFIER ON
    GO
    
    CREATE TABLE [dbo].[TestTable](
        [TimeStamp] [datetime] NOT NULL
    ) ON [PRIMARY]
    
    GO
    
  3. Create an index on that table (using Script as below)

    USE [TestDB]
    GO
    
    /****** Object:  Index [2017_11_03_DM_AggregatedPressure_Data_Index]    Script Date: 07/11/2017 10:32:44 ******/
    CREATE CLUSTERED INDEX [TestTable_Index] ON [dbo].[TestTable]
    (
       [TimeStamp] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
    GO
    
  4. Start to run records into the table using the code provided below. (This is the code-behind for a Windows Forms form that simply has a button called btnGo and a NumericUpDown called nupRecordsToInsert.)

    Public Class Form1
    
    Private conStr As String = "Integrated Security=true;Persist Security Info=true;Server=.;Database=TestDB;Pooling=True"
    Dim tableName As String = "TestTable"
    
    Private Sub btnGo_Click(sender As Object, e As EventArgs) Handles btnGo.Click
    
        Dim table As DataTable = GetData(nupRecordsToInsert.Value)

        Using conn As SqlConnection = New SqlConnection(conStr)
            conn.Open()
            ' Reuse the already-open connection rather than passing the connection string again
            Using sbc As New SqlBulkCopy(conn, SqlBulkCopyOptions.UseInternalTransaction Or SqlBulkCopyOptions.KeepIdentity, Nothing)

                sbc.DestinationTableName = "[" & tableName & "]"
                sbc.BatchSize = 1000
                sbc.WriteToServer(table)

            End Using
        End Using
    
        MessageBox.Show($"Records Inserted = {nupRecordsToInsert.Value} into Database - TestDB. Table - {tableName}")
    End Sub
    
    Private Function GetData(numOfRecordsNeeded As Integer) As DataTable
        Dim table As DataTable = New DataTable()
        table.Columns.Add("TimeStamp", GetType(DateTime))

        ' Generate timestamps two seconds apart, starting from now
        Dim dtDateTimeToInsert As DateTime = DateTime.Now

        For index As Integer = 1 To numOfRecordsNeeded
            dtDateTimeToInsert = dtDateTimeToInsert.AddSeconds(2)
            table.Rows.Add(dtDateTimeToInsert)
        Next

        Return table
    End Function
    

    End Class

  5. At some point, around 500 records, the number of rows in the table will mean that new records need to be written onto a new page. At this point interesting things happen, as outlined in Actual Results.

Actual Results: The databases in SQL2016 are extremely large (this occurs after the first page has been filled and a second one is started).

This can be seen in more detail when

  1. Running the below SQL to get an idea of the table sizes. The more records you insert into the table, the larger the numbers you see in the UnusedSpaceKB column.

    use [TestDB]
    
    SELECT 
       t.NAME AS TableName,
       s.Name AS SchemaName,
       p.rows AS RowCounts,
       SUM(a.total_pages) * 8 AS TotalSpaceKB, 
       SUM(a.used_pages) * 8 AS UsedSpaceKB, 
       (SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
    FROM 
       sys.tables t
    INNER JOIN      
       sys.indexes i ON t.OBJECT_ID = i.object_id
    INNER JOIN 
       sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
    INNER JOIN 
       sys.allocation_units a ON p.partition_id = a.container_id
    LEFT OUTER JOIN 
       sys.schemas s ON t.schema_id = s.schema_id
    WHERE 
      t.NAME = 'TestTable'
      AND t.is_ms_shipped = 0
      AND i.OBJECT_ID > 255 
    GROUP BY 
      t.Name, s.Name, p.Rows
    ORDER BY 
      RowCounts desc
    

(Screenshot: output showing large numbers in the UnusedSpaceKB column)

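As a quick cross-check of those figures, the built-in sp_spaceused procedure reports the reserved, data, index and unused space for a single table; the unused figure should line up with the UnusedSpaceKB column from the query above:

```sql
USE [TestDB];
GO

-- Returns rows, reserved, data, index_size and unused (in KB) for one table
EXEC sp_spaceused N'dbo.TestTable';
```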

  2. Running the below query shows that many pages have been allocated, but that only the first page of every extent (set of 8 pages) is used. This leaves the last 7 pages of every extent unused and thus creates a lot of wasted space.

     select * from sys.dm_db_database_page_allocations
     (DB_id() , object_id('[dbo].[TestTable]') , NULL , NULL , 'DETAILED')
    

The below shows part of the results, where the page allocations do not run continuously.
(Screenshot: gaps in the dm_db_database_page_allocations results)

The databases in SQL2014 do not show this problem.

  1. When running the appropriate query (as above) we do not see large values in the UnusedSpaceKB column.

  2. Running the other query (against dm_db_database_page_allocations) shows that many pages have been allocated, but each page is used in sequence. There are no gaps, and no blocks of 7 unused pages.

(Screenshot: contiguous page allocations)

Expected Results: I would expect SQL2016 to behave like SQL2014 and not create extremely large tables. In particular, I would expect the pages to be allocated contiguously, without 7-page gaps in the allocation.

If anyone has any thoughts on why I see this difference, it would be tremendously helpful.

1 Answer

看我几分像从前
Answered 2019-05-21 03:33

You need to use trace flag 692:

If for any reason, you cannot change the batchsize or if you are not seeing an improved data load performance with default minimal logging behavior, you can disable fast inserts behavior in SQL Server 2016 using trace flag 692 (...). We do not anticipate under normal circumstances this trace flag will be required for customers.
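For reference, the trace flag can be turned on instance-wide from a query window (it can also be added as a -T692 startup parameter so it persists across restarts):

```sql
-- Disable the SQL Server 2016 "fast inserts" bulk-load behavior.
-- The -1 argument applies the flag globally; omit it to affect only the current session.
DBCC TRACEON (692, -1);

-- Verify the flag is now enabled
DBCC TRACESTATUS (692);
```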
