I have an Excel worksheet that I want to read into a DataTable. All is well except for one particular column in my Excel sheet: the 'ProductID' column is a mix of values like ########## and n#########.
I tried letting OleDb handle everything automatically by reading the sheet into a DataSet/DataTable, but any values in 'ProductID' like n###### are ignored and left blank. I also tried creating my DataTable manually by looping through each row with a data reader, but got exactly the same results.
Here's the code:
    // Add the column names manually to the DataTable as column_0, column_1, ...
    for (int colnum = 0; colnum < num_columns; colnum++)
    {
        ds.Tables["products"].Columns.Add("column_" + colnum, typeof(string));
    }

    // Loop through each Excel row, adding a corresponding DataRow to the DataTable.
    while (rdr.Read())
    {
        DataRow a_row = ds.Tables["products"].NewRow();
        for (int col = 0; col < num_columns; col++)
        {
            // Fall back to GetValue().ToString() when the cell is not typed as a string.
            try { a_row[col] = rdr.GetString(col); }
            catch { a_row[col] = rdr.GetValue(col).ToString(); }
        }
        ds.Tables["products"].Rows.Add(a_row);
    }
I don't understand why it won't let me read in values like n######. How can I do this?
No problem sh4, glad it helps w/ the mixed type issue.
The DateTime column is a whole other animal that I recall caused me grief in the past. We have one Excel file we process where the OleDbDataAdapter will sometimes convert dates to a double data type (apparently Excel stores dates as doubles, which encode the number of days elapsed since January 0, 1900).
The workaround was to use:
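The snippet that followed this sentence is not preserved here. As a hedged sketch only (an assumption, not necessarily the author's workaround), one common way to turn such a double back into a date in .NET is DateTime.FromOADate. The class and method names below are hypothetical, chosen for illustration:

```csharp
using System;
using System.Globalization;

public static class ExcelDateSketch
{
    // Hypothetical helper: interprets a cell value that may arrive as a DateTime,
    // as a double (an OLE Automation serial date), or as text.
    // Returns null when the value cannot be interpreted as a date.
    public static DateTime? ToDateTime(object cell)
    {
        if (cell == null || cell is DBNull) return null;
        if (cell is DateTime) return (DateTime)cell;

        // The case described above: the adapter handed back a double.
        if (cell is double) return DateTime.FromOADate((double)cell);

        // Anything else: try to parse its text form.
        DateTime parsed;
        string text = Convert.ToString(cell, CultureInfo.InvariantCulture);
        return DateTime.TryParse(text, CultureInfo.InvariantCulture,
                                 DateTimeStyles.None, out parsed)
            ? parsed
            : (DateTime?)null;
    }
}
```

For example, ExcelDateSketch.ToDateTime(43831.0) yields January 1, 2020. Serial values that fall before March 1900 can differ by a day between Excel and OLE Automation, so treat very early dates with care.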
@Brian Wells Thank you, your suggestion did the trick, but not completely. It worked for the mixed int/string field, but the datetime columns came back with strange characters after that, so I applied a "hack" on top of the "hack":
1.- Use System.IO.File.Copy to create a copy of the Excel file.
2.- Programmatically modify the datetime column headers at runtime to something in datetime format, e.g. "01/01/0001".
3.- Save the Excel file, and then apply your trick, running the query with HDR=NO against the modified file.
Tricky, yes, but it worked, and reasonably fast. If anyone has an alternative to this, I will be glad to hear it.
Greetings.
P.S. Excuse my English, it isn't my native language.
Using .NET 4.0 and reading Excel files, I had a similar issue with OleDbDataAdapter, i.e. reading in a mixed data type on a "PartID" column in MS Excel, where a PartID value can be numeric (e.g. 561) or text (e.g. HL4354), even though the Excel column was formatted as "Text".

From what I can tell, ADO.NET chooses the data type based on the majority of the values in the column (with a tie going to the numeric data type). That is, if most of the PartIDs in the sample set are numeric, ADO.NET will declare the column to be numeric. ADO.NET will then attempt to cast each cell to a number, which fails for the "text" PartID values, so those PartIDs are not imported.
My solution was to set the OleDbConnection connection string to use IMEX=1;HDR=NO in its Extended Properties, to indicate that this is an import and that the table(s) will not include headers. The Excel file does have a header row, so in this case you are telling ADO.NET not to use it. Then, later in the code, remove that header row from the DataSet and voilà, you have a mixed data type for that column. From there you can use LINQ to search the fields.
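The steps above could be sketched roughly as follows. This is a minimal illustration, not the answer's original code: the class and method names are hypothetical, and the Jet provider and the "Excel 8.0" token are assumed from the connection string quoted elsewhere in this thread. Loading the table itself (for example with an OleDbDataAdapter) is left out.

```csharp
using System;
using System.Data;
using System.Linq;

public static class MixedColumnSketch
{
    // Builds a connection string with IMEX=1 and HDR=NO. The Extended Properties
    // value is quoted because it contains semicolons of its own.
    public static string BuildConnectionString(string path)
    {
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + path +
               ";Extended Properties=\"Excel 8.0;IMEX=1;HDR=NO\"";
    }

    // With HDR=NO the sheet's header row arrives as the first data row; drop it.
    public static void RemoveHeaderRow(DataTable table)
    {
        if (table.Rows.Count > 0) table.Rows.RemoveAt(0);
    }

    // Searches one column by position with LINQ, since the real header names
    // are no longer available as column names.
    public static DataRow[] FindRows(DataTable table, int columnIndex, string value)
    {
        return table.Rows.Cast<DataRow>()
            .Where(r => string.Equals(Convert.ToString(r[columnIndex]), value,
                                      StringComparison.OrdinalIgnoreCase))
            .ToArray();
    }
}
```

After filling a DataTable through a connection opened with BuildConnectionString("myfile.xls"), you would call RemoveHeaderRow(table) once and then, for instance, FindRows(table, 0, "HL4354").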
There are two ways to handle mixed data types in Excel.
Method 1
Method 2
There is a "hack" that consists of appending "IMEX=1" to your connection string like so:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=myfile.xls;Extended Properties="Excel 8.0;IMEX=1"
With this setting, the driver handles mixed-type Excel columns according to the settings in your registry. You can change those settings on your own machine, but on a server that is probably not an option.
Several forums I found claim that adding IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text to the Extended Properties in the connection string would fix the problem, but this was not the case. I finally solved the problem by adding "HDR=NO" to the Extended Properties in the connection string (as Brian Wells shows above) so that I could import mixed types. I then added some generic code to name the columns after the first row of data, then remove the first row.
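The "generic code" mentioned above is not shown in the answer. A minimal sketch of what such a step might look like, with hypothetical class and method names, is:

```csharp
using System;
using System.Data;

public static class HeaderPromotionSketch
{
    // Renames each column after the value in the first row, then removes that row.
    // Assumes the table was read with HDR=NO, so row 0 holds the sheet's headers.
    public static void PromoteFirstRowToHeaders(DataTable table)
    {
        if (table.Rows.Count == 0) return;

        DataRow header = table.Rows[0];
        for (int i = 0; i < table.Columns.Count; i++)
        {
            string name = Convert.ToString(header[i]).Trim();

            // Keep the generated column name when the header cell is blank
            // or would collide with a name that is already in use.
            if (name.Length > 0 && !table.Columns.Contains(name))
            {
                table.Columns[i].ColumnName = name;
            }
        }

        table.Rows.RemoveAt(0);
    }
}
```

Calling HeaderPromotionSketch.PromoteFirstRowToHeaders(table) once, right after filling the table, lets the rest of the code address columns by their header names again.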
Shortcut --> if you have a mixed type column in Excel: Sort your column Z to A
I went through pretty much all of the answers here. Some of them worked for me and some did not, but none was desirable, because ADO still did not pick up the data in a mixed-type column in my Excel file. I had to set HDR=NO to make ADO read my spreadsheet column that is a mix of text and numbers, and that way I lose the ability to use column headers in my SQL statements, which is not good: if the order of the columns changes in the Excel file, the SQL statement will produce an error or the wrong output.

In a mixed data type column, the key is the first 8 rows: ADO determines the data type for the column based on them. So instead of giving up your column headers, simply sort your column Z to A in the Excel file before ADO reads the data. That way the rows at the top are the text ones, and the column will be picked up as text.
If your initial rows are numbers (regardless of whether the column is formatted as Text in Excel), ADO will determine that the column is a numeric type, so once it reads the text rows below, it cannot cast them to numbers. In the opposite case, if the column is determined to be text, any row that holds a number can still be cast to text.
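To make the reasoning concrete, here is a toy model of the behavior described in this thread. It is an illustration only, not the provider's actual algorithm: it assumes the column type is guessed from the first 8 values, with "numeric" winning when at least half of them parse as numbers, and that values which do not fit a numeric column come back blank. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TypeGuessToyModel
{
    // Toy model: guess the column type from the first `sampleRows` values,
    // then blank out (null) every value that does not fit a numeric guess.
    public static List<string> Import(IEnumerable<string> column, int sampleRows = 8)
    {
        List<string> values = column.ToList();
        List<string> sample = values.Take(sampleRows).ToList();

        int numericCount = sample.Count(v => IsNumeric(v));
        bool guessedNumeric = sample.Count > 0 && numericCount * 2 >= sample.Count;

        return values
            .Select(v => guessedNumeric && !IsNumeric(v) ? null : v)
            .ToList();
    }

    private static bool IsNumeric(string s)
    {
        double ignored;
        return double.TryParse(s, NumberStyles.Any, CultureInfo.InvariantCulture, out ignored);
    }
}
```

In this model, a column that starts with eight numeric values followed by a value such as n123456 loses the text value, while the same data sorted so that enough text values sit in the first eight rows keeps every value as text.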