SSIS Excel Source Connection. What does it use to

2019-08-28 06:14发布

问题:

Using SSIS, I Can use an OLE Connect with a Jet driver to read Excel or I can use a separate Connection type of "Excel Connection"

Both seem to have issues reading files with merges cells, as I feared.

I am curious what SSIS is using to connect to Excel when an "Excel Connection" is used.

Other than VBA, what would you suggest for reading an Excel file that has merged cells, formulas, formatting, etc, on a server? I am using Excel 2003.

Update

Here's the code that I use to read the XLS:

private static void GetExcelSheets(string filePath)
        {
            string connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties='Excel 8.0;HDR={1}'", filePath, "no");

            OleDbConnection excelConnection = new OleDbConnection(connectionString);
            OleDbCommand cmdExcel = new OleDbCommand();
            OleDbDataAdapter oda = new OleDbDataAdapter();
            cmdExcel.Connection = excelConnection;

            if (excelConnection.State == ConnectionState.Open)
            {
                excelConnection.Close();
            }
            excelConnection.Open();

            OleDbCommand oleDbCommand = new OleDbCommand();
            oleDbCommand.CommandType = System.Data.CommandType.Text;
            oleDbCommand.Connection = excelConnection;
            OleDbDataAdapter oleDbDataAdapter = new OleDbDataAdapter(oleDbCommand);

            DataTable dtExcelSheetName = excelConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
            string getExcelSheetName = dtExcelSheetName.Rows[5]["Table_Name"].ToString();
            oleDbCommand.CommandText = "SELECT * FROM [" + getExcelSheetName + "]";
            oleDbDataAdapter.SelectCommand = oleDbCommand;
            DataTable dtExcelRecords = new DataTable();
            oleDbDataAdapter.Fill(dtExcelRecords);

            excelConnection.Dispose();

        }

When I read a spreadsheet that looks like this:

and then use a debugger Visualizer to show the dataset, I see this. Note that the data in the source spreadsheet that I circled is missing from the displayed DataTable:

Regarding the point about the Excel Connection using Microsoft OLE DB Provider for Jet 4.0. In SSIS, there are two separate connection objects. The first is an Excel Connect Object. It's Property page looks like this:

The second is an OLE Connection object using a JET Driver and pointing to Excel.

Is the Excel Connection object really uses the 4.0 Jet driver, is it just a shorthand way of doing the same thing or are these conenction types really different in some way?

回答1:

According to Microsoft:

The Excel connection manager uses the Microsoft OLE DB Provider for Jet 4.0 and its supporting Excel ISAM (Indexed Sequential Access Method) driver to connect and read and write data to Excel data sources.

Source: MSDN


As for how to properly import Excel sheets with merged cells: you need to make sure that you have the left-most column being mapped in your transformations in order to get the value. Any cell that is not the left-most in a merged set of cells will be null. I think this is a reasonable expectation, as SSIS is typically used with database sources, which have no concept of "merged" cells.