Fast Export of Large Datatable to Excel Spreadshee

2020-07-31 11:09发布

问题:

I have an interesting conundrum here, how do I quickly (under 1 minute) export a large datatable (filled from SQL, 35,000 rows) into an Excel spreadsheet for users. I have code in place that can handle the export, and while nothing is "wrong" with the code per se, it is infuriatingly slow taking 4 minutes to export the entire file (sometimes longer if a user has less RAM or is running more on their system). Sadly, this is an improvement over the 10+ minutes it used to take using our old method. Simply put, can this be made any faster, without using 3rd party components? If so, how? My code is as follows, the slow down occurs between messageboxes 6 and 7 where each row is written. Thank you all for taking the time to take a look at this:

    Private Sub btnTest_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnJeffTest.Click
           Test(MySPtoExport)
    End Sub

Private Sub Test(ByVal SQL As String)
    'Declare variables used to execute the VUE Export stored procedure
    MsgBox("start stop watch")
    Dim ConnectionString As New SqlConnection(CType(ConfigurationManager.AppSettings("ConnString"), String))
    Dim cmdSP As New SqlClient.SqlCommand
    Dim MyParam As New SqlClient.SqlParameter
    Dim MyDataAdapter As New SqlClient.SqlDataAdapter
    Dim ExportDataSet As New DataTable
    Dim FilePath As String

    MsgBox("stop 1 - end of declare")

    Try

        ' open the connection
        ConnectionString.Open()

        ' Use the connection for this sql command
        cmdSP.Connection = ConnectionString

        'set this command as a stored procedure command
        cmdSP.CommandType = CommandType.StoredProcedure

        'get the stored procedure name and plug it in
        cmdSP.CommandText = SQL

        'Add the Start Date parameter if required
        Select Case StDt
            Case Nothing
                ' there's no parameter to add
            Case Is = 0
                ' there's no parameter to add
            Case Else
                'add the parameter name, it's direction and its value
                MyParam = cmdSP.Parameters.Add("@StartDate", SqlDbType.VarChar)
                MyParam.Direction = ParameterDirection.Input
                MyParam.Value = Me.txtStartDate.Text
        End Select
        MsgBox("stop 2 - sql ready")
        'Add the End Date parameter if required
        Select Case EdDt
            Case Nothing
                ' there's no parameter to add
            Case Is = 0
                ' there's no parameter to add
            Case Else
                'add the parameter name, it's direction and its value

                MyParam = cmdSP.Parameters.Add("@EndDate", SqlDbType.VarChar)
                MyParam.Direction = ParameterDirection.Input
                MyParam.Value = Me.txtEndDate.Text
        End Select

        'Add the single parameter 1 parameter if required
        Select Case SPar1
            Case Is = Nothing
                ' there's no parameter to add
            Case Is = ""
                ' there's no parameter to add
            Case Else
                'add the parameter name, it's direction and its value
                MyParam = cmdSP.Parameters.Add(SPar1, SqlDbType.VarChar)
                MyParam.Direction = ParameterDirection.Input
                MyParam.Value = Me.txtSingleReportCrt1.Text
        End Select

        'Add the single parameter 2 parameter if required
        Select Case Spar2
            Case Is = Nothing
                ' there's no parameter to add
            Case Is = ""
                ' there's no parameter to add
            Case Else
                'add the parameter name, it's direction and its value
                MyParam = cmdSP.Parameters.Add(Spar2, SqlDbType.VarChar)
                MyParam.Direction = ParameterDirection.Input
                MyParam.Value = Me.txtSingleReportCrt2.Text
        End Select

        MsgBox("stop 3 - params ready")

        'Prepare the data adapter with the selected command 
        MyDataAdapter.SelectCommand = cmdSP

        ' Set the accept changes during fill to false for the NYPDA export
        MyDataAdapter.AcceptChangesDuringFill = False

        'Fill the Dataset tables (Table 0 = Exam Eligibilities, Table 1  = Candidates Demographics)
        MyDataAdapter.Fill(ExportDataSet)

        'Close the connection
        ConnectionString.Close()

        'refresh the destination path in case they changed it
        SPDestination = txtPDFDestination.Text

        MsgBox("stop 4 - procedure ran, datatable filled")

        Select Case ExcelFile
            Case True

                FilePath = SPDestination & lblReportName.Text & ".xls"

                Dim _excel As New Microsoft.Office.Interop.Excel.Application
                Dim wBook As Microsoft.Office.Interop.Excel.Workbook
                Dim wSheet As Microsoft.Office.Interop.Excel.Worksheet

                wBook = _excel.Workbooks.Add()
                wSheet = wBook.ActiveSheet()

                Dim dt As System.Data.DataTable = ExportDataSet
                Dim dc As System.Data.DataColumn
                Dim dr As System.Data.DataRow
                Dim colIndex As Integer = 0
                Dim rowIndex As Integer = 0

                MsgBox("stop 5 - excel stuff declared")

                For Each dc In dt.Columns
                    colIndex = colIndex + 1
                    _excel.Cells(1, colIndex) = dc.ColumnName
                Next

                MsgBox("stop 6 - Header written")

                For Each dr In dt.Rows
                    rowIndex = rowIndex + 1
                    colIndex = 0
                    For Each dc In dt.Columns
                        colIndex = colIndex + 1
                        _excel.Cells(rowIndex + 1, colIndex) = dr(dc.ColumnName)
                    Next
                Next

                MsgBox("stop 7 - rows written")

                wSheet.Columns.AutoFit()

                MsgBox("stop 8 - autofit complete")

                Dim strFileName = SPDestination & lblReportName.Text & ".xls"

                If System.IO.File.Exists(strFileName) Then
                    System.IO.File.Delete(strFileName)
                End If

                MsgBox("stop 9 - file checked")

                wBook.SaveAs(strFileName)
                wBook.Close()
                _excel.Quit()
        End Select

        MsgBox("File " & lblReportName.Text & " Exported Successfully!")


        'Dispose of unneeded objects
        MyDataAdapter.Dispose()
        ExportDataSet.Dispose()
        StDt = Nothing
        EdDt = Nothing
        SPar1 = Nothing
        Spar2 = Nothing
        MyParam = Nothing
        cmdSP.Dispose()
        cmdSP = Nothing
        MyDataAdapter = Nothing
        ExportDataSet = Nothing

    Catch ex As Exception
        '  Something went terribly wrong.  Warn user.
        MessageBox.Show("Error: " & ex.Message, "Stored Procedure Running Process ", _
       MessageBoxButtons.OK, MessageBoxIcon.Error)

    Finally
        'close the connection in case is still open
        If Not ConnectionString.State = ConnectionState.Closed Then
            ConnectionString.Close()
            ConnectionString = Nothing
        End If

        ' reset the fields
        ResetFields()

    End Try
End Sub

回答1:

As when using VBA to automate Excel, you can assign an array directly to the value of a Range object: this is done as a single operation, so you remove the overhead associated with making multiple calls across the process boundaries between your .Net code and the Excel instance.

Eg, see the accepted answer here: Write Array to Excel Range



回答2:

Here is a piece of my own code that performs a very fast export of data from a DataTable to an Excel sheet (use the "Stopwatch" object to compare the speed and let me a comment):

Dim _excel As New Excel.Application
Dim wBook As Excel.Workbook
Dim wSheet As Excel.Worksheet

wBook = _excel.Workbooks.Add()
wSheet = wBook.ActiveSheet()


Dim dc As System.Data.DataColumn
Dim colIndex As Integer = 0
Dim rowIndex As Integer = 0
'Nombre de mesures
Dim Nbligne As Integer = DtMesures.Rows.Count

'Ecriture des entêtes de colonne et des mesures
'(Write column headers and data)

For Each dc In DtMesures.Columns
  colIndex = colIndex + 1
  'Entête de colonnes (column headers)
  wSheet.Cells(1, colIndex) = dc.ColumnName
  'Données(data)
  'You can use CDbl instead of Cobj If your data is of type Double
  wSheet.Cells(2, colIndex).Resize(Nbligne, ).Value = _excel.Application.transpose(DtMesures.Rows.OfType(Of DataRow)().[Select](Function(k) CObj(k(dc.ColumnName))).ToArray())
Next


回答3:

Even though the question was asked several years ago, I thought I would add my solution since the question was posed in VB and the "best answer" is in C#. This solution writes 22,000+ rows (1.9MB) in 4 seconds on an i7 System w/ 16GB RAM.


Imports Excel = Microsoft.Office.Interop.Excel

Public Class Main
    Private Sub btnExportToExcel(sender As Object, e As EventArgs) Handles btnExpToExcel.Click
        'Needed for the Excel Workbook/WorkSheet(s)
        Dim app As New Excel.Application
        Dim wb As Excel.Workbook = app.Workbooks.Add()
        Dim ws As Excel.Worksheet
        Dim strFN as String = "MyFileName.xlsx"    'must have ".xlsx" extension

        'Standard code for filling a DataTable from SQL Server
        Dim strSQL As String = "My SQL Statement for the DataTable"
        Dim conn As New SqlConnection With {.ConnectionString = "My Connection"}
        Dim MyTable As New DataTable
        Dim cmd As New SqlCommand(strSQL, conn)
        Dim da As New SqlDataAdapter(cmd)
        da.Fill(MyTable)

        'Add a sheet to the workbook and fill it with data from MyTable
        'You could create multiple tables and add additional sheets in a loop
        ws = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
        DataTableToExcel(MyTable, ws, strSym)

        wb.SaveAs(strFN)    'save and close the WorkBook
        wb.Close()

        MsgBox("Export complete.")
    End Sub

    Private Sub DataTableToExcel(dt As DataTable, ws As Excel.Worksheet, TabName As String)
        Dim arr(dt.Rows.Count, dt.Columns.Count) As Object
        Dim r As Int32, c As Int32
        'copy the datatable to an array
        For r = 0 To dt.Rows.Count - 1
            For c = 0 To dt.Columns.Count - 1
                arr(r, c) = dt.Rows(r).Item(c)
            Next
        Next

        ws.Name = TabName   'name the worksheet
        'add the column headers starting in A1
        c = 0
        For Each column As DataColumn In dt.Columns
            ws.Cells(1, c + 1) = column.ColumnName
            c += 1
        Next
        'add the data starting in cell A2
        ws.Range(ws.Cells(2, 1), ws.Cells(dt.Rows.Count, dt.Columns.Count)).Value = arr
    End Sub
End Class

Hope it helps.



回答4:

We had a VB.NET app that did exactly this, and took even longer for our users who were on slow PC's... sometimes 15 minutes.

The app is now an ASP/VB.NET app which simply builds an HTML table and outputs the result as an .xls extension... excel is able to read the HTML table and parse it into a grid format. You can still pass in XML for formatting and options, horizontal pane locking, etc.

If you don't have the option of using ASP.NET... try looking into a way to build an HTML table string and have excel parse & populate for you... much faster! I'm sure excel can parse other types as well.... XML, Arrays, HTML, etc... all would be quicker than manually building each row through VB.NET objects.