I have gone through the Insert 2 million rows into SQL Server quickly link and found that I can do this by using bulk insert. So I am trying to create the DataTable (code below), but as this is a huge file (more than 300K rows) I get an OutOfMemoryException in my code:
string line;
DataTable data = new DataTable();
string[] columns = null;
bool isInserted = false;

using (TextReader tr = new StreamReader(_fileName, Encoding.Default))
{
    if (columns == null)
    {
        line = tr.ReadLine();
        columns = line.Split(',');
    }

    for (int iColCount = 0; iColCount < columns.Length; iColCount++)
    {
        data.Columns.Add("Column" + iColCount, typeof(string));
    }

    string[] columnVal;
    while ((line = tr.ReadLine()) != null)
    {
        columnVal = line.Split(','); // OutOfMemoryException is thrown on this line
        data.Rows.Add(columnVal);
    }
}
After a lot of rework I modified my code as below, but I still get an OutOfMemoryException while adding rows to the DataTable:
DataTable data = new DataTable();
string[] columns = null;
var line = string.Empty;

using (TextReader tr = new StreamReader(_fileName, Encoding.Default))
{
    if (columns == null)
    {
        line = tr.ReadLine();
        columns = line.Split(',');
    }

    for (int iColCount = 0; iColCount < columns.Length; iColCount++)
    {
        data.Columns.Add("Column" + iColCount, typeof(string));
    }
}

// Split the file into chunks of 20,000 rows each
var _fileList = File.ReadLines(_fileName, Encoding.Default).ToList();
var splitChunks = SplitFile(_fileList, 20000);

Parallel.ForEach(splitChunks, lstChunks =>
{
    foreach (var rows in lstChunks)
    {
        string[] lineFields = rows.Split(',');
        DataRow row = data.NewRow();
        for (int iCount = 0; iCount < lineFields.Length; iCount++)
        {
            row[iCount] = lineFields[iCount];
        }
        data.Rows.Add(row);
    }
});
The bulk insert in the next step is done with this code:
SqlConnection SqlConnectionObj = GetSQLConnection();
SqlBulkCopy bulkCopy = new SqlBulkCopy(SqlConnectionObj, SqlBulkCopyOptions.TableLock | SqlBulkCopyOptions.FireTriggers | SqlBulkCopyOptions.UseInternalTransaction, null);
bulkCopy.DestinationTableName = "TempTable";
bulkCopy.WriteToServer(data);
The file contains this kind of data:
4714,1370,AUSRICHTEN MASCHINELL
4870,1370,PLATTE STECKEN
0153,1900,CAULK GUN
0154,1900,NEW TERMINATOR
0360,1470,MU 186 MACCH. X LAV. S/A ASTE PS174
9113-H22,1970,MC DRILL BITS
The code needs to convert this into 6 rows and 3 columns.

Is there a faster way to read the file and create the DataTable for the bulk insert, so that I don't get the OutOfMemoryException?
Thanks in advance.
Here is a solution with SqlBulkCopy.WriteToServer and an IDataReader. I'm using CSV, but I hope it will be easy to modify for other formats.

SqlBulkCopy uses just three members of IDataReader, and they are all we have to implement:

public int FieldCount { get; }
public bool Read()
public object GetValue(int i)

All other properties and methods can be left unimplemented. There is an interesting paper about SqlBulkCopy. Full code: https://dotnetfiddle.net/giG3Ai. Here is a cut-down version:
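The cut-down listing didn't survive the post, so below is a minimal sketch of the same idea (the full code is in the fiddle linked above). The class name CsvReader is mine, the reader assumes a plain comma-separated file with three columns and no quoted fields, and, per the note above, only FieldCount, Read() and GetValue() do real work; everything else just throws.

using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Text;

public class CsvReader : IDataReader
{
    private readonly StreamReader _reader;
    private string[] _current = new string[0];

    public CsvReader(string fileName)
    {
        _reader = new StreamReader(fileName, Encoding.Default);
    }

    // The three members SqlBulkCopy actually exercises:
    public int FieldCount { get { return 3; } }   // number of columns in the file / destination table

    public bool Read()
    {
        string line = _reader.ReadLine();         // one line at a time - the file is never fully loaded
        if (line == null) return false;
        _current = line.Split(',');
        return true;
    }

    public object GetValue(int i)
    {
        return i < _current.Length ? (object)_current[i] : DBNull.Value;
    }

    public void Dispose() { _reader.Dispose(); }
    public void Close() { _reader.Close(); }

    // Never called in this scenario - left unimplemented on purpose.
    public bool NextResult() { return false; }
    public int Depth { get { return 0; } }
    public bool IsClosed { get { return false; } }
    public int RecordsAffected { get { return -1; } }
    public DataTable GetSchemaTable() { throw new NotImplementedException(); }
    public object this[int i] { get { throw new NotImplementedException(); } }
    public object this[string name] { get { throw new NotImplementedException(); } }
    public string GetName(int i) { throw new NotImplementedException(); }
    public string GetDataTypeName(int i) { throw new NotImplementedException(); }
    public Type GetFieldType(int i) { throw new NotImplementedException(); }
    public int GetValues(object[] values) { throw new NotImplementedException(); }
    public int GetOrdinal(string name) { throw new NotImplementedException(); }
    public bool GetBoolean(int i) { throw new NotImplementedException(); }
    public byte GetByte(int i) { throw new NotImplementedException(); }
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferoffset, int length) { throw new NotImplementedException(); }
    public char GetChar(int i) { throw new NotImplementedException(); }
    public long GetChars(int i, long fieldoffset, char[] buffer, int bufferoffset, int length) { throw new NotImplementedException(); }
    public Guid GetGuid(int i) { throw new NotImplementedException(); }
    public short GetInt16(int i) { throw new NotImplementedException(); }
    public int GetInt32(int i) { throw new NotImplementedException(); }
    public long GetInt64(int i) { throw new NotImplementedException(); }
    public float GetFloat(int i) { throw new NotImplementedException(); }
    public double GetDouble(int i) { throw new NotImplementedException(); }
    public string GetString(int i) { throw new NotImplementedException(); }
    public decimal GetDecimal(int i) { throw new NotImplementedException(); }
    public DateTime GetDateTime(int i) { throw new NotImplementedException(); }
    public IDataReader GetData(int i) { throw new NotImplementedException(); }
    public bool IsDBNull(int i) { throw new NotImplementedException(); }
}

Used like this, it streams the file straight into the server without ever building a DataTable:

using (SqlConnection connection = GetSQLConnection())   // assumed to return an open connection, as in the question
using (CsvReader reader = new CsvReader(_fileName))
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
{
    bulkCopy.DestinationTableName = "TempTable";
    bulkCopy.BatchSize = 20000;                          // send rows to the server in batches
    bulkCopy.WriteToServer(reader);
}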
I found that forgetting the DataTable and using plain old SqlClient on a row-by-row basis was quicker. Simpler, too. It also beat a streaming SQL function that was supposedly the fastest way to get data into a SQL Server DB.

Try it and measure the speed, and see if it's fast enough for you. If it isn't, you could always re-format the file (if necessary) and let SQL Server do the work for you using its BULK INSERT.
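For illustration only, a sketch of the row-by-row variant under the question's assumptions (a TempTable with three varchar columns named Column0..Column2, GetSQLConnection() returning an open connection); the file is streamed line by line, so memory use stays flat:

// needs System.Data, System.Data.SqlClient, System.IO, System.Text
using (SqlConnection connection = GetSQLConnection())   // assumed to return an open connection, as in the question
using (StreamReader reader = new StreamReader(_fileName, Encoding.Default))
using (SqlCommand command = new SqlCommand(
    "INSERT INTO TempTable (Column0, Column1, Column2) VALUES (@c0, @c1, @c2)", connection))
{
    // Prepare the parameters once and reuse them for every row.
    command.Parameters.Add("@c0", SqlDbType.VarChar, 50);
    command.Parameters.Add("@c1", SqlDbType.VarChar, 50);
    command.Parameters.Add("@c2", SqlDbType.VarChar, 200);

    string line;
    while ((line = reader.ReadLine()) != null)
    {
        string[] fields = line.Split(',');
        command.Parameters["@c0"].Value = fields[0];
        command.Parameters["@c1"].Value = fields[1];
        command.Parameters["@c2"].Value = fields[2];
        command.ExecuteNonQuery();
    }
}

Wrapping the loop in a single SqlTransaction (and committing every few thousand rows) usually speeds the per-row approach up considerably.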
That is a lot of data to hold in memory. Instead, every time you have read a certain number of rows from the text file, insert them into the database.

How you do it is up to you - plain SQL or bulk copy - but keep in mind that you can't read the entire text file and keep it in memory, so work in chunks.
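As a rough sketch of that, reusing the pieces from the question (GetSQLConnection(), _fileName, the generated Column0..Column2 layout) and a chunk size of 20,000:

// needs System.Data, System.Data.SqlClient, System.IO, System.Text
const int chunkSize = 20000;

DataTable chunk = new DataTable();
chunk.Columns.Add("Column0", typeof(string));
chunk.Columns.Add("Column1", typeof(string));
chunk.Columns.Add("Column2", typeof(string));

using (SqlConnection connection = GetSQLConnection())   // assumed to return an open connection, as in the question
using (StreamReader reader = new StreamReader(_fileName, Encoding.Default))
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
{
    bulkCopy.DestinationTableName = "TempTable";

    string line;
    while ((line = reader.ReadLine()) != null)
    {
        chunk.Rows.Add(line.Split(','));        // the string[] maps onto the three columns

        if (chunk.Rows.Count == chunkSize)
        {
            bulkCopy.WriteToServer(chunk);      // push this chunk to the server
            chunk.Clear();                      // and release it before reading the next one
        }
    }

    if (chunk.Rows.Count > 0)                   // whatever is left after the last full chunk
        bulkCopy.WriteToServer(chunk);
}

This keeps at most one chunk in memory at a time, which is the point made above.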