SSIS Excel Import Forcing Incorrect Column Type

Posted 2019-01-10 17:23

Question:

I'm trying to import a spreadsheet to our database using SSIS. For some reason SSIS wants to believe two of the columns are of type Double, when they contain character data. I've tried remapping the columns to be nvarchar(255) but it still doesn't want to select the data it thinks is double, because there are characters in it. If I try to edit the SSIS package and change the column types in the Excel Source, it won't let me change the type of the columns in the Error Output and gives me an error if the regular output and error output columns don't match.

Why is SSIS insisting that these columns are Double? How can I force it to realize these are strings? Why does everything from Microsoft have to not quite work correctly?

EDIT: I found this: http://support.microsoft.com/kb/236605

I sorted my data so that mixed data types would be at the top, and guess what: The problem reversed. Instead of not importing character data, it stopped importing purely numeric data. Apparently someone doesn't think 12345 can be represented as a string...

Answer 1:

I've seen this issue before; the problem is on the Excel side, not in SSIS. The Excel OLE DB driver samples the first few rows and then infers the data type, even if you explicitly set the column to text. What you need to do is add the following to the Excel file connection string in the SSIS package. It tells the driver that the columns contain mixed data types and hints that it should do extra checking before deciding that a column is numeric when in fact it is not.

;Extended Properties="IMEX=1"

It should work with this (in most cases). The safer thing to do is export the Excel data to tab delimited text and use SSIS to import that.
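The text route is safer because a delimited file carries no cell types, so there is nothing for a driver to infer. A minimal Python sketch of that point, using made-up sample rows and column names (an in-memory buffer stands in for the exported file):

```python
import csv
import io

# Hypothetical sample data: a column mixing numeric-looking and text values.
rows = [["PartNo", "Price"], ["12345", "9.99"], ["AB-77", "12.50"], ["06", "3"]]

# Write the rows as tab-delimited text (the buffer stands in for a real file).
buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)

# Read it back: the csv module does no type inference, so every field is a str.
buffer.seek(0)
parsed = list(csv.reader(buffer, delimiter="\t"))
header, data = parsed[0], parsed[1:]

for record in data:
    print(dict(zip(header, record)))
```

Values such as 12345 and 06 come back as the strings they were written as; any conversion to numeric types is then an explicit step in the import.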

  • http://blogs.acceleration.net/ryan/archive/2005/01/11/477.aspx


Answer 2:

You can convert (i.e. force) the column data to text. Try this (note: these instructions are based on Excel 2007).

The following steps should force Excel to treat the column as text:

Open your spreadsheet with Excel.

Select the whole column that contains your "mostly numeric data" by clicking on the column header.

Click on the Data tab on the ribbon menu.

Select Text to Columns. This will bring up the Convert Text to Columns Wizard.

-On Step 1: Click Next

-On Step 2: Click Next

-On Step 3: Select Text and click Finish

Save your Excel sheet.

Retry the import using the SQL Server 2005 Import Data Wizard.

Also, here's a link to another question which has additional responses:

Import Data Wizard Does Not Like Data Type I Choose For A Column



Answer 3:

One thing that isn't mentioned in the accepted answer is that the "IMEX=1" parameter has to go inside the quoted portion of:

...;Extended Properties="...";
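As a rough illustration of where the parameter goes, here is a small Python helper that assembles such a connection string. The function name and the file path are hypothetical; the provider and the "Excel 12.0 Xml" token follow the pattern of the ACE connection string quoted elsewhere in this thread and may need adjusting for other drivers.

```python
def excel_connection_string(path, hdr=True, imex=True):
    """Build an ACE OLE DB connection string for an .xlsx file (illustrative helper)."""
    # Everything in this list ends up INSIDE the quoted Extended Properties value.
    props = ["Excel 12.0 Xml", "HDR=YES" if hdr else "HDR=NO"]
    if imex:
        props.append("IMEX=1")
    extended = ";".join(props)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"
        f"Data Source={path};"
        f'Extended Properties="{extended}";'
    )


# Hypothetical path, for illustration only.
print(excel_connection_string(r"C:\data\prices.xlsx"))
```

The point to notice is that IMEX=1 is joined with the other properties before the quotes are added, rather than appended after the closing quote.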


Answer 4:

;IMEX=1; does not always work. For everything about mixed data types in Excel, see: Mixed data types in Excel column



Answer 5:

Another workaround is to sort the spreadsheet with the character data at the top, thereby causing Excel to see the column as string, and importing everything as such.



Answer 6:

You can also alter the registry so that the driver looks at more values than just the first 8 rows. I have used this method and it works quite well.

http://support.microsoft.com/kb/281517
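To see why both the sample size and IMEX=1 matter, here is a deliberately simplified Python model of the type guessing described in this thread. It is not the driver's actual algorithm: the function names, the majority rule with ties going to numeric, and the sample column are assumptions for illustration. Numeric cells are modelled as Python numbers and text cells as strings.

```python
def is_numeric_cell(value):
    # Numeric cells are modelled as int/float, text cells as str.
    return isinstance(value, (int, float))


def guess_column_type(values, guess_rows=8, imex=False):
    """Guess a column type from the first `guess_rows` cells (simplified model)."""
    sample = values[:guess_rows]
    numeric = sum(1 for v in sample if is_numeric_cell(v))
    text = len(sample) - numeric
    if imex and numeric and text:
        return "text"  # mixed sample in import mode: treat the column as text
    return "number" if numeric >= text else "text"


def read_column(values, guess_rows=8, imex=False):
    """Return the column as this model would deliver it; None stands for NULL."""
    kind = guess_column_type(values, guess_rows, imex)
    if kind == "number":
        return [v if is_numeric_cell(v) else None for v in values]
    if imex:
        return [str(v) for v in values]  # import mode: everything comes back as text
    return [None if is_numeric_cell(v) else v for v in values]


# Hypothetical column: the only text value sits in row 9, outside the default sample.
column = [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, "AB-77", 1010]

print(read_column(column))                            # default sample, no IMEX
print(read_column(column, imex=True))                 # IMEX alone, sample still 8 rows
print(read_column(column, guess_rows=16, imex=True))  # larger sample plus IMEX
```

In this model, IMEX=1 alone does not rescue a text value that sits beyond the sampled rows; only a larger sample does, which is consistent with the mixed results reported in other answers.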



Answer 7:

Well, IMEX=1 did not work for me. Neither did Reynier Booysen's suggestion. (I don't know if it makes a difference, but I'm using SQL Server 2008 R2.) A good explanation of some workarounds, and of why IMEX=1 is limited to the first eight rows of each spreadsheet, can be found at http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/78b87712-8ffe-4c72-914b-f1c031ba6c75

Hope this helps



Answer 8:

I've used the following recipe:

  1. Import data from Excel to Access
  2. Import data from Access to SQL Server

and it worked for me...



Answer 9:

I was banging my head against a wall with this issue for a while. In our environment, we consume price files from our suppliers in various formats, some of which have upward of a million records. This issue usually occurs where:

  • The rows scanned by the OLEDB driver appear to contain numbers, but do contain mixed values later on in the record set, or
  • Fields do contain only numbers, but the source has some formatted as text (usually Excel files).

The problem is that even if you set your external input column to the desired data type, the file gets scanned every time you run the package and is dynamically changed to whatever the OLEDB driver thinks the field should be.

Our source files typically contain field headers (text) and prices (numeric fields), which gives me an easy solution:

First step:

  • Change your SQL statement to include the header fields. This forces SSIS to see all fields as text, including the price fields.

For mixed fields:

  • Your initial problem is solved because your fields are now text, but you still have a header row in your output.
  • Prevent the header row from making it into your output by changing the SQL WHERE clause to exclude the header values e.g. "WHERE NOT([F4]='Price')"

For numeric fields:

  • Using the advanced editor for the OLE DB source, set the output column for the price field (or any other numeric field) to a numeric DataType. This causes any records that contain text in these fields to fail, including the header record, but forces a conversion on numeric values saved as text.

  • Set the Error Output to ignore failures on your numeric fields.

  • Alternatively, if you still need any errors on the numeric fields redirected, remove the header row by changing the SQL WHERE clause to exclude the header values then,

  • Set the Error Output to redirect failures on this field.

Obviously this method only works where you have header fields, but hopefully this helps some of you.
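The header-excluding query from the steps above can be sketched as a small Python helper that builds the SQL command text. The function name and the sheet name Sheet1 are assumptions; the column name F4 and the value 'Price' come from the WHERE clause quoted above, and the F-style column names assume the header row is read as data (HDR=NO).

```python
def header_filtered_query(sheet, column, header_value):
    """Return an Excel OLE DB query that skips the row holding the header text."""
    # Double any single quote so the literal stays valid inside '...'.
    literal = header_value.replace("'", "''")
    return f"SELECT * FROM [{sheet}$] WHERE NOT([{column}]='{literal}')"


# Sheet name is hypothetical; F4 and 'Price' follow the example in the answer.
print(header_filtered_query("Sheet1", "F4", "Price"))
```

The resulting string would go into the SQL command of the source component in place of a plain table selection.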



Answer 10:

Option 1. Use Visual Basic to iterate through each column and format each column as Text.

Option 2. Use the Text to Columns menu, don't change the delimiter, and change "General" to "Text".



Answer 11:

I had the same problem. The problem lies in the Excel Source task. When you set up this task for the first time, it connects to the specified Excel file (via the Excel connection) and decides what type each column is based on the current spreadsheet.

Thus, when you set up the Excel Source task, just make sure that the columns that should be text contain only text. The Excel Source task will then assume that any subsequent spreadsheets have the same format and will read 12345 as text, because the column was text when the task was set up.

Hope it makes sense!



Answer 12:

I had the same issue: a single column held values of multiple data types, and the package loaded only the numeric values. Everything else came through as NULL.

Solution

One solution is to change the Excel data type. In Excel, copy the column data and paste it into a different file. Delete that column, insert a new column formatted as Text, and paste the copied data into the new column.

Then, in the SSIS package, delete and recreate the Excel source, and in the destination table change the column data type to varchar.

This will work.



Answer 13:

If multiple columns in the Excel spreadsheet have the same name, this kind of error occurs. The package will work after you make the column names distinct. Sometimes hidden columns are overlooked when checking the column names.



Answer 14:

  1. Click File on the ribbon menu, and then click on Options.
  2. Click Advanced, and then under When calculating this workbook, select the Set precision as displayed check box, and then click OK.

  3. Click OK.

  4. In the worksheet, select the cells that you want to format.

  5. On the Home tab, click the Dialog Box Launcher next to Number.

  6. In the Category box, click Number.

  7. In the Decimal places box, enter the number of decimal places that you want to display.



Answer 15:

This worked for me. Select the problematic column in Excel - highlight the whole column. Change the format to "Text". Save the Excel file.

In your SSIS package, go to the Data Flow pane for your import. Double click the Excel Source node. It should warn you that the types have changed and ask you if you want to remap them. Click Yes. Executing should now work and bring in all values.

Note: I'm using Excel 2013 and Visual Studio 2015, but I assume these instructions would work for earlier versions too.



Answer 16:

It took me a bit to realize the source of the error in my package. Ultimately I found that data was converted to null (Example: from "06" to "NULL"), and I found this via Preview in the source file connection (Excel Source> Edit> Connection Manager> Sheet='MySheet'> Preview...). I got excited when I read the post by James to edit the connection string to have extended properties: ;Extended Properties="IMEX=1". But that did not work for me.

I was able to resolve the error by changing the Cell Format in Excel worksheet from “Number” to “Text”. After changing the format, the upload process ran successfully! My connection string looks like: Provider=Microsoft.ACE.OLEDB.12.0;Data Source=\\myServer\d$\Folder1\Folder2\myFile.xlsx;Extended Properties="EXCEL 12.0 XML;HDR=NO";

Here are some screenshots that show how I resolved the error.

[Screenshot: Error: Metadata of Excel file connection]

[Screenshot: Source of error: “General” format]

[Screenshot: Source of error changed: “Text” format]

[Screenshot: Error fixed: Metadata of Excel file connection]
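When comparing setups like the ones in this thread, it can help to check which Extended Properties a connection string actually carries. A small Python sketch, assuming the property list is wrapped in double quotes as in the connection string quoted in this answer (the function name is made up):

```python
import re


def extended_properties(connection_string):
    """Parse the quoted Extended Properties value into a dict (illustrative helper)."""
    match = re.search(r'Extended Properties="([^"]*)"', connection_string, re.IGNORECASE)
    if not match:
        return {}
    props = {}
    for part in match.group(1).split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, value = part.partition("=")
        # Bare tokens such as "EXCEL 12.0 XML" have no "=", so store an empty value.
        props[key.strip().upper()] = value.strip() if sep else ""
    return props


# The connection string quoted in this answer.
conn = (
    r"Provider=Microsoft.ACE.OLEDB.12.0;"
    r"Data Source=\\myServer\d$\Folder1\Folder2\myFile.xlsx;"
    r'Extended Properties="EXCEL 12.0 XML;HDR=NO";'
)
props = extended_properties(conn)
print("IMEX" in props, props.get("HDR"))
```

For this string the check reports that IMEX is absent and that HDR is NO, so the fix here came from the cell format alone.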