At first glance I have the same problem many people have had before. I found many questions and answers about it, but none of them helped me.
I am importing data from an MS Excel file (.xls) in .NET using ADO.NET. The file contains mixed types in the same column, numbers and text, and the well-known problem occurs: the text values are not recognized and the data is lost.
I use the following connection string with recommended parameters:
string strConnectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=NO;IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text\";", pathname);
This is my code:
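// Requires: using System.Data; using System.Data.OleDb;
DataTable dt = new DataTable(sheetName);
using (OleDbConnection sqlConn = new OleDbConnection(this.strConnectionString))
{
    sqlConn.Open();
    // Select every row of the requested worksheet
    string sql = "SELECT * FROM [" + sheetName + "]";
    using (OleDbCommand selectCMD = new OleDbCommand(sql, sqlConn))
    using (OleDbDataAdapter sqlAdapter = new OleDbDataAdapter(selectCMD))
    {
        // Fill the DataTable; the connection and command are disposed afterwards
        sqlAdapter.Fill(dt);
    }
}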
I tested it under .NET 3.5 (x86) and .NET 4.0 (x86), both as a Windows executable and as an ASP.NET application, and the problem remains.
I don't know whether I am doing something wrong, but I have spent many hours on this and the problem is still there.
Recently I found an open-source (MIT License) .NET library that reads xls and xlsx files correctly, so I stopped using the OLE DB drivers.
The library: link to project
You need to add some keys to the Windows Registry; they will force the Excel engine to treat all columns as text.
Save the following code in a ForceExcelImportAsText.reg file, then double-click it to add the keys to the Windows Registry.
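The original listing did not survive in this copy of the answer. As a rough sketch only, assuming the standard Jet 4.0 engine key and the two setting names already mentioned in the question (TypeGuessRows and ImportMixedTypes), such a file could look like this:

```reg
Windows Registry Editor Version 5.00

; Assumed location of the Jet 4.0 Excel engine settings (32-bit Windows).
; A value of 0 for TypeGuessRows asks the engine to scan more than the default 8 rows.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"TypeGuessRows"=dword:00000000
"ImportMixedTypes"="Text"
```

On 64-bit Windows the 32-bit Jet provider normally reads the same key under HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node, and the ACE provider keeps its own copy under its Office version key, so the path may need adjusting for your setup.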
Also, remember to keep Excel CLOSED when you are importing your files.
I never understood why, but if Excel is open and/or the file I'm importing is open, the Excel engine messes up the numeric/data columns, and all you get are meaningless numbers instead of your data.
About the connection string
The connection string I use is
The parameter "Excel 8.0" is OK for .xls files up to Excel 2003.
For Excel 2007-2013 .xlsx files I use "Excel 12.0 Xml"
For Excel 2007-2013 .xlsb files I use "Excel 12.0"
For Excel 2007-2013 .xlsm files I use "Excel 12.0 Macro"
Also, the data provider "Microsoft.Jet.OLEDB.4.0" is OK for .xls files up to Excel 2003.
For Excel 2007-2013 xlsx/xlsb/xlsm files you need to use the "Microsoft.ACE.OLEDB.12.0" data provider.
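The provider and version pairings listed above can be collected in one place. The following is a minimal sketch, not the answerer's code: the class and method names are hypothetical, and HDR=NO;IMEX=1 are simply carried over from the question's connection string. TypeGuessRows and ImportMixedTypes are left out because this answer sets them through the registry.

```csharp
using System;
using System.IO;

public static class ExcelConnectionStrings
{
    // Hypothetical helper: picks the provider and the "Excel x.y" token
    // from the file extension, following the pairings listed in the answer.
    public static string Build(string path)
    {
        string provider;
        string excelVersion;

        switch (Path.GetExtension(path).ToLowerInvariant())
        {
            case ".xls":   // Excel 97-2003 workbooks: Jet provider
                provider = "Microsoft.Jet.OLEDB.4.0";
                excelVersion = "Excel 8.0";
                break;
            case ".xlsx":  // Excel 2007+ workbooks: ACE provider
                provider = "Microsoft.ACE.OLEDB.12.0";
                excelVersion = "Excel 12.0 Xml";
                break;
            case ".xlsb":  // binary workbooks
                provider = "Microsoft.ACE.OLEDB.12.0";
                excelVersion = "Excel 12.0";
                break;
            case ".xlsm":  // macro-enabled workbooks
                provider = "Microsoft.ACE.OLEDB.12.0";
                excelVersion = "Excel 12.0 Macro";
                break;
            default:
                throw new ArgumentException("Unsupported Excel file type: " + path);
        }

        // HDR=NO and IMEX=1 are taken from the question's connection string.
        return string.Format(
            "Provider={0};Data Source={1};Extended Properties=\"{2};HDR=NO;IMEX=1\";",
            provider, path, excelVersion);
    }
}
```

A call such as ExcelConnectionStrings.Build(pathname) can then be passed to the OleDbConnection constructor in place of the hand-written string.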
If the goal is to import the data rather than to use ADO.NET at all costs, this is the solution that works for us:
If Excel is installed on the machine doing the import, we run a small invisible automation script in which Excel converts the original data file into something readable, in our case XML. This is what the Xls2Xml.vbs file looks like:
' VB Script Document
If WScript.Arguments.Count < 2 Then
    WScript.Echo "Error! Please specify the source path and the destination. Usage: Xls2Xml SourcePath.xls Destination.xml"
    WScript.Quit
End If
Dim oExcel
Set oExcel = CreateObject("Excel.Application")
Dim oBook
Set oBook = oExcel.Workbooks.Open(WScript.Arguments.Item(0))
oBook.SaveAs WScript.Arguments.Item(1), 46
oBook.Close False
oExcel.Quit
WScript.Quit
The magic number 46 means xlXMLSpreadsheet. When the script finishes, we import the readable XML format (usually after processing it through a customer-specific XSLT script).
If no Excel automation server is going to be available, we use XLSX as the data format and the cool open-source ClosedXML library to read the data. The library is actually better at creating XLSX files, but that is not needed for this question.
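For the XML route, the file written by the script uses the XML Spreadsheet 2003 layout (Workbook / Worksheet / Table / Row / Cell / Data in the urn:schemas-microsoft-com:office:spreadsheet namespace). Below is a minimal sketch of reading it without XSLT, using only System.Xml.Linq. The class and method names are made up for illustration, and the sketch deliberately ignores the ss:Index attribute that Excel writes when it skips empty cells, so columns can shift on sparse rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SpreadsheetXml
{
    // Default namespace used by XML Spreadsheet 2003 documents.
    static readonly XNamespace Ss = "urn:schemas-microsoft-com:office:spreadsheet";

    // Hypothetical helper: returns the rows of the first worksheet,
    // each row as a list of raw cell texts. Every value stays a string,
    // so mixed numeric/text columns are not coerced.
    public static List<List<string>> ReadFirstSheet(XDocument doc)
    {
        var rows = new List<List<string>>();

        // First() throws if the document contains no Worksheet element.
        XElement table = doc.Descendants(Ss + "Worksheet").First().Element(Ss + "Table");
        if (table == null)
        {
            return rows; // worksheet without a Table element: no data
        }

        foreach (XElement row in table.Elements(Ss + "Row"))
        {
            var cells = new List<string>();
            foreach (XElement cell in row.Elements(Ss + "Cell"))
            {
                // A Cell without a Data child is an empty cell.
                XElement data = cell.Element(Ss + "Data");
                cells.Add(data == null ? "" : data.Value);
            }
            rows.Add(cells);
        }
        return rows;
    }
}
```

Typical use would be SpreadsheetXml.ReadFirstSheet(XDocument.Load("Destination.xml")), where Destination.xml is the second argument passed to the script.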
Microsoft itself does not recommend accessing the legacy proprietary Excel format through legacy database drivers, mainly because of these many issues (e.g. the cell data format ambiguity). And legacy solutions that rely on Excel automation are not guaranteed to work in server-side scenarios.
To use the OpenXML SDK in production you just need to deploy the single DocumentFormat.OpenXml.dll, and you're ready to read and write XLSX data files.