I am using SQL Server Data Tools 2013 to create an SSIS package. The package has an Execute SQL Task with its ResultSet property set to Full result set, which pushes the query results into an SSIS variable of type Object.
I'm using the following in a script task to take a recordset stored in an object variable and write it to a CSV:
' Requires Imports System.Data.OleDb in addition to the script-task defaults
Imports System.Data.OleDb

Public Sub Main()
    Dim fileName As String = Dts.Variables("vFileName").Value.ToString()
    Dim destinationPath As String = Dts.Variables("vDestinationPath").Value.ToString()
    Dim destinationPathAndFileName As String = destinationPath + fileName
    Dim fileContents As String = ""
    Dim oleDB As OleDbDataAdapter = New OleDbDataAdapter()
    Dim table As DataTable = New DataTable()
    Dim rs As System.Object = Dts.Variables("vResultSet").Value

    ' Populate the DataTable with the vResultSet data
    oleDB.Fill(table, rs)

    ' Loop through columns and concatenate with commas
    For Each col As DataColumn In table.Columns
        fileContents &= col.ColumnName & ","
    Next

    ' Remove final comma from the header string and append a line break
    fileContents = fileContents.Substring(0, fileContents.Length - 1)
    fileContents &= Environment.NewLine

    ' Loop through rows and concatenate with commas
    Dim i As Integer
    For Each row As DataRow In table.Rows
        For i = 1 To table.Columns.Count
            fileContents &= row(i - 1).ToString() & ","
        Next
        ' Remove final comma from the row string and append a line break
        fileContents = fileContents.Substring(0, fileContents.Length - 1)
        fileContents &= Environment.NewLine
    Next

    ' Write all text to the destination file. If the file exists, this step will overwrite it.
    System.IO.File.WriteAllText(destinationPathAndFileName, fileContents)

    Dts.TaskResult = ScriptResults.Success
End Sub
This works, but it's veeeery slow, like 25+ minutes to write a single 14k-row dataset to CSV. I can't use a data flow because this process exists in a loop, and the metadata for each table to be exported is different. I'm pretty sure a script task is the only option, but is there a faster way than looping through each row of the dataset? Please let me know if I can provide more info.
This is the VB.NET version of @billinkc's excellent answer, in case it's useful to anyone. What follows is a sketch of that translation, assuming the same three SSIS variables as the question and the Persist helper described in the answer below:
Imports System
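Imports System.Data
Imports System.Data.OleDb
Imports System.IO
Imports System.Linq

Public Sub Main()
    ' Sketch reconstructed from the description below; names and structure
    ' are illustrative, not billinkc's verbatim code
    Dim fileName As String = Dts.Variables("vFileName").Value.ToString()
    Dim destinationPath As String = Dts.Variables("vDestinationPath").Value.ToString()
    Dim destination As String = Path.Combine(destinationPath, fileName)

    ' Convert the ADO recordset held in the Object variable into a DataTable
    Dim table As New DataTable()
    Using adapter As New OleDbDataAdapter()
        adapter.Fill(table, Dts.Variables("vResultSet").Value)
    End Using

    ' Wrap the table in a DataSet and hand it off to Persist
    Dim ds As New DataSet()
    ds.Tables.Add(table)
    Persist(ds, destination)

    Dts.TaskResult = ScriptResults.Success
End Sub

' Writes every table in the DataSet as delimited text; the delimiter
' defaults to "|" as described below
Public Sub Persist(ByVal ds As DataSet, ByVal destination As String, Optional ByVal delimiter As String = "|")
    Using writer As New StreamWriter(destination, False)
        For Each table As DataTable In ds.Tables
            ' Header row: column names joined with the delimiter
            writer.WriteLine(String.Join(delimiter, table.Columns.Cast(Of DataColumn)().Select(Function(c) c.ColumnName)))

            ' Data rows: each line is built with a single String.Join call
            ' and written immediately instead of accumulating a giant string
            For Each row As DataRow In table.Rows
                writer.WriteLine(String.Join(delimiter, row.ItemArray.Select(Function(field) field.ToString())))
            Next
        Next
    End Using
End Sub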
Feel free to translate to VB.NET as you see fit. Seeing as how I already have this code ~ written for a different project, I mashed your request in with how mine works
Passing in 3 SSIS variables: vFileName, vDestinationPath and vResultSet, the code in Main will convert the ADO recordset into a DataTable, which is then added to a DataSet and passed to the Persist method.
Persist has a default parameter for delimiter of |. This implementation does not attempt to deal with any of the corner cases, at all: it does not escape text columns with a qualifier, doesn't escape embedded qualifiers, doesn't do anything with newlines in the fields, and something in the OleDbDataAdapter's Fill method fails with binary data, etc. Need to run, but a Biml implementation looks like this:
That dumped all of AdventureWorksDW2014 in 15 seconds.
Based on the comment that this line is failing:
IEnumerable<string> fields = (row.ItemArray).Select(field => field.ToString());
Ensure that you have the following using statements in your project. I think those extension methods are in the Linq namespace, but it could have been the Collections one.
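Most likely these two (the Select extension method is defined in System.Linq, and IEnumerable&lt;T&gt; comes from System.Collections.Generic):
using System.Collections.Generic; // IEnumerable<T>
using System.Linq;                // .Select(...) extension method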
Why was the original slow?
My assumption is the slowness boils down to all that concatenation. Strings are immutable in .NET, and you create a brand-new copy of the string each time you append a column to it. When I build my line, I'm using the String.Join method to zip up each element of an array into a single string. This also simplifies the logic required to append the field delimiters.
I also immediately write the current line to a file instead of bloating my memory just to dump it all with a call to WriteAllText.
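Concretely, the two patterns side by side (VB.NET, using the same names as the sketch above):
' Slow: each &= allocates a new string and copies everything built so far,
' so the copying cost grows quadratically with the size of the output
fileContents &= row(i - 1).ToString() & ","

' Fast: one String.Join per row, flushed to disk immediately
writer.WriteLine(String.Join(delimiter, row.ItemArray.Select(Function(f) f.ToString())))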