SSIS Excel Data Source - Is it possible to overrid

Posted 2019-02-08 06:53

When an Excel data source is used in SSIS, the data type of each column is derived from the data in that column. Is it possible to override this behaviour?

Ideally we would like every column delivered from the Excel source to be a string data type, so that the data received from the source can be validated in a later step in the data flow.

Currently, the Error Output tab can be used to ignore conversion failures - the data in question is then null, and the package will continue to execute. However, we want to know what the original data was so that an appropriate error message can be generated for that row.

Tags: excel ssis
8 answers
贼婆χ
Reply #2 · 2019-02-08 07:25

I was able to fix this issue. Before creating the SSIS package, I manually changed the specific column to Text: open the Excel file, select the column, right-click it, choose Format Cells, select Text on the Number tab, and save the file. Then create the SSIS package and test it. It works. After that, try the package with an Excel file where this column was not set as Text.

That also worked for me, and I could execute the package successfully.

闹够了就滚
Reply #3 · 2019-02-08 07:25

I had trouble implementing the solution here - I could follow the instructions, but it only gave new errors.

I solved my conversion issues by using a Data Conversion transformation. This can be found in the SSIS Toolbox under Data Flow Transformations. I placed the Data Conversion between my Excel Source and OLE DB Destination, linked the Excel Source to the Data Conversion and the Data Conversion to the OLE DB Destination, then double-clicked the Data Conversion to bring up the list of data columns. There I gave the problem column a new Output Alias and changed its Data Type.

Lastly, in the Mappings of the OLE DB Destination, use the Alias column name, rather than the original Excel column name. Job done.

爱情/是我丢掉的垃圾
Reply #4 · 2019-02-08 07:29

Yes, you can. Just go into the output column list on the Excel source and set the type for each of the columns.

To get to the output column list, right-click the Excel source, select 'Show Advanced Editor', and click the tab labeled 'Input and Output Properties'.

A potentially better solution is to use the Derived Column transformation, where you can build "new" columns for each column in Excel. This has the following benefits:

  1. You have more control over what you convert to.
  2. You can put in rules that control the change (e.g. if null give me an empty string, but if there is data then give me the data as a string).
  3. Your data source is not tied directly to the rest of the process (i.e. you can change the source and the only place you will need to do work is in the derived column).
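
The rule in point 2 can be sketched outside SSIS to make the intent concrete. This is only a Python illustration of the logic, not an SSIS expression; the function name `to_text` and the sample column names are made up for the example.

```python
from typing import Any


def to_text(value: Any) -> str:
    """Mirror the rule above: missing value -> empty string, else the value as text."""
    if value is None:
        return ""
    return str(value)


# Hypothetical row as it might arrive from the Excel source.
row = {"Site": "A01", "NDCCode": 12345.0, "Notes": None}

# Build a "new" text column for every source column, as the derived column would.
as_text = {name: to_text(value) for name, value in row.items()}
```

Keeping the rule in one place like this is the same idea as benefit 3: later steps only ever see the converted columns.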
Summer. ? 凉城
Reply #5 · 2019-02-08 07:29

I was looking for a solution to a similar issue, but didn't find anything on the internet. Most of the solutions I found work at design time, but they don't work when you want to automate your SSIS package.

I resolved the issue by changing the properties of the Excel Source. By default the AccessMode property is set to OpenRowset. If you change it to SQL Command, you can write your own SQL to convert any column as you wish.

In my case SSIS was treating the NDCCode column as float, but I needed it as a string, so I used the following SQL:

Select [Site], Cstr([NDCCode]) as NDCCode From [Sheet1$]
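
If there are many columns, the same kind of statement could be generated rather than typed by hand. The following is a rough Python sketch under that assumption; `build_excel_query` is a hypothetical helper, not part of SSIS, and it only produces the SQL text to paste into the SQL Command property.

```python
def build_excel_query(sheet: str, columns: list[str], text_columns: set[str]) -> str:
    """Build a SELECT over an Excel sheet, wrapping chosen columns in CStr()."""
    parts = []
    for col in columns:
        if col in text_columns:
            # Force this column to be returned as a string.
            parts.append(f"CStr([{col}]) AS [{col}]")
        else:
            parts.append(f"[{col}]")
    # Sheets are addressed as [SheetName$] in the query.
    return f"SELECT {', '.join(parts)} FROM [{sheet}$]"


query = build_excel_query("Sheet1", ["Site", "NDCCode"], {"NDCCode"})
print(query)
```

It is worth checking how CStr behaves on empty cells in your own file before relying on this for every column.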

Reply #6 · 2019-02-08 07:38

According to this blog post, the problem is that the Excel driver used by SSIS determines the data type of each column by reading the values in the first 8 rows:

  • If the top 8 records contain an equal number of numeric and character values, numeric takes priority.
  • If the majority of the top 8 records are numeric, the column is typed as numeric and all character values are read as NULLs.
  • If the majority of the top 8 records are character values, the column is typed as string and all numeric values are read as NULLs.
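
To see why a stray text value further down a column turns into NULL, the three rules above can be simulated in a few lines of Python. This is only a toy model of the behaviour as the post describes it, not the driver's actual code; the function names are invented, and skipping blank cells in the sample is my own assumption.

```python
def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def guess_column_type(values, sample_size=8):
    """Pick 'numeric' or 'text' from the first rows; ties go to numeric."""
    sample = [v for v in values[:sample_size] if v is not None]
    numeric = sum(_is_number(v) for v in sample)
    text = sum(isinstance(v, str) for v in sample)
    return "text" if text > numeric else "numeric"


def read_column(values, sample_size=8):
    """Return the guessed type and the values, with mismatches read as None."""
    kind = guess_column_type(values, sample_size)
    if kind == "numeric":
        return kind, [v if _is_number(v) else None for v in values]
    return kind, [v if isinstance(v, str) else None for v in values]


# Eight numeric codes followed by one text code in row 9.
codes = [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, "N/A-17"]
kind, read_back = read_column(codes)
```

In this model the ninth value is lost because the type was already fixed as numeric by the first eight rows, which is the situation described in the question.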

The post outlines two things you can do to fix this:

  1. First, add IMEX=1 to the Extended Properties of your Excel connection string. This tells the driver to read columns with mixed data types as text. However, this is not sufficient if the values in the first 8 rows are all numeric.
  2. In the registry, change the value of HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel\TypeGuessRows to 0. According to the post, this makes the driver look at all the rows to determine the data type for the column.
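
For the first point, it may help to see where IMEX=1 sits in the connection string. The sketch below just assembles a Jet 4.0 style string in Python; the file path is a placeholder, the helper name is made up, and the "Excel 8.0" and HDR settings are assumptions for an .xls file with a header row.

```python
def excel_connection_string(path: str, header_row: bool = True, imex: bool = True) -> str:
    """Assemble a Jet OLE DB connection string for an .xls workbook."""
    props = ["Excel 8.0", f"HDR={'YES' if header_row else 'NO'}"]
    if imex:
        # IMEX=1 belongs inside the quoted Extended Properties value.
        props.append("IMEX=1")
    extended = ";".join(props)
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;"
        f"Data Source={path};"
        f'Extended Properties="{extended}";'
    )


# Placeholder path, purely for illustration.
conn = excel_connection_string(r"C:\data\input.xls")
print(conn)
```

The point to notice is that IMEX=1 goes inside the quoted Extended Properties section together with the Excel version and HDR settings, not as a separate top-level key.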
成全新的幸福
Reply #7 · 2019-02-08 07:38

The Excel source in SSIS behaves erratically. SSIS determines the type of data in a particular column by reading the first 10 rows, hence the issue. If you have a text column with null values in the first 10 rows, SSIS takes the data type as Int. After a bit of struggle, here is a workaround:

  1. Insert a dummy row (preferably as the first row) in the worksheet. I prefer doing this through a Script Task; you may also consider using some service to preprocess the file before SSIS connects to it.

  2. With the dummy row in place, you can be sure that the data types will be set as you need.

  3. Read the data using the Excel source and filter out the dummy row before you take it for further processing.

I know it is a bit shabby, but it works :)
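
The add-then-filter idea, stripped of anything Excel-specific, looks roughly like this in Python. It is only a sketch of the logic on in-memory rows; the marker value and function names are invented, and the real preprocessing would of course have to write the dummy row into the worksheet itself.

```python
DUMMY_MARKER = "__DUMMY__"  # hypothetical sentinel, chosen so it cannot clash with real data


def with_dummy_row(rows, columns):
    """Put a row of text markers first so every column starts with a string."""
    dummy = {col: DUMMY_MARKER for col in columns}
    return [dummy] + list(rows)


def without_dummy_row(rows, key_column):
    """Drop the marker row again before further processing."""
    return [row for row in rows if row[key_column] != DUMMY_MARKER]


rows = [{"Site": "A01", "NDCCode": 12345.0}]
padded = with_dummy_row(rows, ["Site", "NDCCode"])
cleaned = without_dummy_row(padded, "Site")
```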
