SSIS : delete rows after an update or insert

2020-04-17 06:29发布

问题:

Here is the following situation: I have a table of StudentsA which needs to be synchronized with another table, on a different server, StudentsB. It's a one-way sync from A to B. Since the table StudentsA can hold a large number of rows, we have a table called StudentsSync (on the input server) containing the ID of StudentsA which have been modified since the last copy from StudentsA to StudentsB.

I made the following SSIS Data Flow task:

The only problem is that I need to DELETE the row from StudentsSync after a successful copy or update. Something like this:

Any idea how this can be achieved?

回答1:

It can be achieved using 3 methods

1.If your target table in OutputDB has TimeStamp columns such as Create and modified TimeStamp then rows which have got updated or inserted can be obtained by writing a simple query. You need to write the below query in the execte sql task in Control Flow to delete those rows in Sync Table .

Delete from SyncTable
where keyColumn in (Select primary_key from target 
where ModifiedTimeStamp >= GETDATE() or (ModifiedTimeStamp is null
and CreateTimeStamp>=GETDATE()))

I assume StudentsA's primary key is present in Sync table along with primary key of Target table. The above condition basically checks, if a new row is added then CreateTimeStamp column will have current date and modifiedTimeStamp will be null else if the values are updated then the modifiedTimeStamp will have current date

The above query will work if you have TimeStamp columns in your target table which i feel should be there if your loading data into Data Warehouse

2.You can use MERGE syntax to perform the update and insert in Control Flow with Execute SQL Task.No need to use Data Flow Task .The below query can be used even if you don't have any TimeStamp columns

DECLARE @Output TABLE ( ActionType VARCHAR(20), SourcePrimaryKey INT)

MERGE StudentsB  AS TARGET
USING StudentsA  AS SOURCE 
ON (TARGET.CommonColumn = SOURCE.CommonColumn) 

WHEN MATCHED 
THEN 
UPDATE SET TARGET.column = SOURCE.Column,TARGET.ModifiedTimeStamp=GETDATE()

WHEN NOT MATCHED BY TARGET THEN 
INSERT (col1,col2,Col3) 
VALUES (SOURCE.col1, SOURCE.col2, SOURCE.Col3)

OUTPUT $action, 
INSERTED.PrimaryKey AS SourcePrimaryKey INTO @Output

Delete from SyncTable
where PrimaryKey in (Select SourcePrimaryKey from @Output
                     where ActionType ='INSERT' or ActionType='UPDATE')

The code is not tested as i'm running out of time .but at-least it should give you a fair idea how to proceed . .For furthur detail on MERGE syntax read this and this

3.Use Multicast Component to duplicate the dataset for Insert and Update .Connect a MULTICAST to lookmatch output and another multicast to Lookup No match output



回答2:

  1. Add a task after "Update existing entry" and after "Insert new entry" to add the student ID to a variable which will contain the list of IDs to delete.
  2. Enclose all of the tasks in a sequence container.
  3. After the sequence container executes add a task to delete all the records from the sync table that are in that variable you've been populating.