Remove duplicates in SSIS Data Flow

Posted 2019-02-17 08:00

啃猪蹄的小仙女

Question:

I am working on an SSIS data flow task.

The source table comes from an old database and is denormalized.

The destination table is normalized.

The package fails because the source data contains duplicate values in the destination's primary key column, so the rows cannot be inserted.

It would be good if SSIS could check whether the current record already exists in the destination (by checking the key) and, if it does, skip it and continue with the next record.

Is there a way to handle this scenario?

Answer 1:

Assuming your destination table is a subset of your source table, you should be able to use the Sort Transformation to pull in only the columns you need for your destination table, and then check the "Remove rows with duplicate sort values" option. This gives you a distinct list of records based on the columns you selected.

Then route the output of the Sort to your destination, and you should be good to go.
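
SSIS configures this in the designer rather than in code, so the following is only a rough Python sketch of the logic the Sort Transformation applies when "Remove rows with duplicate sort values" is checked: sort on the selected columns and keep one row per distinct combination. The function name, the sample rows, and the column names (customer_id, customer_name, order_id) are hypothetical and not taken from the question.

```python
from typing import Any, Dict, List, Sequence


def remove_duplicate_sort_values(
    rows: List[Dict[str, Any]], sort_columns: Sequence[str]
) -> List[Dict[str, Any]]:
    """Sort rows on sort_columns and keep one row per distinct key.

    Only the sort columns are carried forward, mirroring the advice to
    pull in just the columns the destination table needs.
    """
    def key_of(row: Dict[str, Any]) -> tuple:
        return tuple(row[col] for col in sort_columns)

    result: List[Dict[str, Any]] = []
    previous_key = None
    for row in sorted(rows, key=key_of):
        key = key_of(row)
        # After sorting, duplicates are adjacent, so comparing with the
        # previous key is enough to drop them.
        if previous_key is None or key != previous_key:
            result.append({col: row[col] for col in sort_columns})
            previous_key = key
    return result


# Hypothetical denormalized source: customer data repeated on every order.
source_rows = [
    {"customer_id": 2, "customer_name": "Bob", "order_id": 10},
    {"customer_id": 1, "customer_name": "Alice", "order_id": 11},
    {"customer_id": 2, "customer_name": "Bob", "order_id": 12},
]

customers = remove_duplicate_sort_values(
    source_rows, ["customer_id", "customer_name"]
)
print(customers)
```

Note that this removes rows only when every selected column matches. If two source rows share a key but differ in another selected column, both survive and the primary key violation remains; in that case the duplicate check would need to be based on the key column alone.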

Tags: ssis