I have to insert about 2 million rows from a text file.
Along with the insert, I have to create some master tables.
What is the best and fastest way to insert such a large set of data into SQL Server?
I use the bcp (Bulk Copy Program) utility. I load about 1.5 million text records each month, each record 800 characters wide. On my server, it takes about 30 seconds to add the 1.5 million text records to a SQL Server table.
The instructions for bcp are at http://msdn.microsoft.com/en-us/library/ms162802.aspx
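As an illustration (not from the original answer), a character-mode load from a comma-delimited file might look roughly like this, where the server, database, table, file path, and delimiter are all placeholders for your own values:

    bcp MyDatabase.dbo.MyStagingTable in C:\data\records.txt -S MyServer -T -c -t"," -b 5000

Here -T uses a trusted (integrated security) connection, -c loads character data, -t sets the field terminator, and -b commits the load in batches of 5,000 rows.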
Regarding the SqlBulkCopy solution:

I used a StreamReader to convert and process the text file. The result was a list of my objects.

I created a class that takes a DataTable or a List<T> and a buffer size (CommitBatchSize). It converts the list to a DataTable using an extension method (in the second class). It works very fast: on my PC, I can insert more than 10 million complicated records in less than 10 seconds.
Here is the class:
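A minimal sketch of what such a class could look like, assuming SqlBulkCopy with a table lock, a CommitBatchSize property, and a hypothetical ToDataTable<T> extension method built on TypeDescriptor (names and details here are illustrative, not the answer's exact code):

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Data.SqlClient;

    public class BulkInsert
    {
        // How many rows SqlBulkCopy commits per batch.
        public int CommitBatchSize { get; set; } = 5000;

        // Bulk copy a DataTable into the destination table.
        public void InsertData(DataTable table, string connectionString, string tableName)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                using (var bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
                {
                    bulkCopy.DestinationTableName = tableName;
                    bulkCopy.BatchSize = CommitBatchSize;
                    bulkCopy.WriteToServer(table);
                }
            }
        }

        // Convert a list to a DataTable first, then bulk copy it.
        public void InsertData<T>(List<T> list, string connectionString, string tableName)
        {
            InsertData(list.ToDataTable(), connectionString, tableName);
        }
    }

    public static class BulkInsertExtensions
    {
        // Build a DataTable whose columns mirror the public properties of T.
        public static DataTable ToDataTable<T>(this IList<T> data)
        {
            var properties = TypeDescriptor.GetProperties(typeof(T));
            var table = new DataTable();

            foreach (PropertyDescriptor prop in properties)
                table.Columns.Add(prop.Name,
                    Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType);

            foreach (T item in data)
            {
                var row = table.NewRow();
                foreach (PropertyDescriptor prop in properties)
                    row[prop.Name] = prop.GetValue(item) ?? DBNull.Value;
                table.Rows.Add(row);
            }
            return table;
        }
    }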
Here is an example of inserting a List of my custom object, List<PuckDetection> (ListDetections); a sketch of such a call is below.
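A hypothetical call against the sketch above, with the connection string, table name, and batch size as placeholder values (PuckDetection is the custom type mentioned in the answer):

    string connectionString = "Data Source=.;Initial Catalog=MyDb;Integrated Security=True";
    var ListDetections = new List<PuckDetection>();
    // ... fill ListDetections from the parsed text file ...

    var bulk = new BulkInsert { CommitBatchSize = 10000 };
    bulk.InsertData(ListDetections, connectionString, "dbo.PuckDetection");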
The BulkInsert class can be modified to add column mapping if required, for example if you have an identity key as the first column. (This assumes that the column names in the DataTable are the same as in the database.)

I think it is better to read the data from the text file into a DataSet.
Try out SqlBulkCopy - Bulk Insert into SQL from C# App
Or, after doing step 1 at the top, you can check this article for details: Bulk Insertion of Data Using C# DataTable and SQL Server OpenXML function.

It is not tested with 2 million records, though; it will work, but it will consume memory on the machine, because you have to load all 2 million records before inserting them.
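A rough sketch of the DataTable/SqlBulkCopy route, assuming a comma-delimited file with two columns, an already-created destination table, and placeholder names for the path, connection string, columns, and table:

    using System.Data;
    using System.Data.SqlClient;
    using System.IO;

    // Read the delimited text file into a DataTable (column layout is assumed).
    var table = new DataTable();
    table.Columns.Add("Col1", typeof(string));
    table.Columns.Add("Col2", typeof(string));

    using (var reader = new StreamReader(@"C:\data\records.txt"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            var fields = line.Split(',');
            table.Rows.Add(fields[0], fields[1]);
        }
    }

    // Push the whole DataTable to SQL Server in one bulk copy.
    using (var connection = new SqlConnection("Data Source=.;Initial Catalog=MyDb;Integrated Security=True"))
    {
        connection.Open();
        using (var bulkCopy = new SqlBulkCopy(connection))
        {
            bulkCopy.DestinationTableName = "dbo.MyStagingTable";
            bulkCopy.BatchSize = 5000;
            bulkCopy.WriteToServer(table);
        }
    }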
I ran into this scenario recently (well over 7 million rows) and ended up using sqlcmd via PowerShell (after parsing the raw data into SQL INSERT statements) in segments of 5,000 at a time, since SQL Server couldn't handle 7 million lines in one lump job, or even 500,000 lines for that matter, unless it was broken down into smaller 5K pieces; you can then run each 5K script one after the other, as sketched below. I needed to leverage the new sequence command in SQL Server 2012 Enterprise, and I couldn't find a programmatic way to insert seven million rows of data quickly and efficiently with that sequence command.
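A minimal PowerShell sketch of that kind of chunked run, assuming the generated INSERT statements have already been split into 5,000-line .sql files in a chunks folder (server, database, and paths are placeholders):

    # Run each 5K chunk one after the other with sqlcmd (-E = integrated security)
    Get-ChildItem .\chunks\*.sql | Sort-Object Name | ForEach-Object {
        sqlcmd -S MyServer -d MyDatabase -E -i $_.FullName
    }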
Secondly, one of the things to look out for when inserting a million or more rows of data in one sitting is the CPU and memory consumption (mostly memory) during the insert process. SQL Server will eat up memory and CPU with a job of this magnitude without releasing those resources. Needless to say, if you don't have enough processing power or memory on your server, you can crash it pretty easily in a short time (which I found out the hard way). If your memory consumption gets over 70-75%, just reboot the server and the processes will be released back to normal.

I had to run a bunch of trial-and-error tests to see what the limits of my server were (given the limited CPU/memory resources to work with) before I could settle on a final execution plan. I would suggest you do the same in a test environment before rolling this out into production.
SqlBulkCopy is good, but you should prepare the data first so there are no problems such as PK/FK conflicts; then inserting 200k records into SQL Server is very easy.
I tried this method and it significantly reduced my database insert execution time.

*Create a SQL connection object and replace it where I have written SQLconnectionObject.
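The snippet that note refers to isn't reproduced above, so here is a minimal SqlBulkCopy helper consistent with it, with SQLconnectionObject as the connection placeholder the note mentions and the destination table name assumed:

    using System.Data;
    using System.Data.SqlClient;

    static void BulkLoad(SqlConnection SQLconnectionObject, DataTable dataTable)
    {
        // SQLconnectionObject: create and open your own SqlConnection and pass it in here.
        using (var bulkCopy = new SqlBulkCopy(SQLconnectionObject))
        {
            bulkCopy.DestinationTableName = "dbo.TargetTable"; // assumed destination table
            bulkCopy.BulkCopyTimeout = 0;                      // no timeout for large loads
            bulkCopy.BatchSize = 5000;                         // commit in batches
            bulkCopy.WriteToServer(dataTable);                 // dataTable holds the parsed rows
        }
    }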