SQL Server stored procedure conversion to SSIS Pac

2019-04-29 04:51发布

问题:

Problem: currently we have numerous stored procedures (very long up to 10,000 lines) which were written by various developers for various requirements in last 10 years. It has become hard now to manage those complex/long stored procedures (with no proper documentation).

We plan to move those stored procedure into SSIS ETL package.

Has anybody done this is past? If yes, what approach should one take.

Appreciate if anybody could provide advise on approach to convert stored procedure into SSIS ETL Packages.

Thanks

回答1:

I've done this before, and what worked well for my team was to refactor incrementally, starting with the original source, and then iterate the refactoring effort.

The first step was to attempt to modularize the stored procedure logic into Execute SQL tasks that we chained together. Each task was tested and approved, then we'd integrate and ensure that the new process matched the results of the legacy procedures.

After this point, we could divide the individual Execute SQL tasks across the team, and load-balance the analysis of whether we could further refactor the SQL within the Execute SQL tasks to native SSIS tasks.

Each refactoring was individually unit tested and then integration tested to ensure that the overall process output still behaved like the legacy procedures.



回答2:

I would suggest the following steps:

  1. Analyze the stored procedures to identify the list of sources and destinations. For example: If the stored procedure dbo.TransferOrders moves data from table dbo.Order to dbo.OrderHistory. Then your source will be dbo.Order and destination will be dbo.OrderHistory.

  2. After you list out the sources and destinations, try to group the stored procedures according to your preference either by source/destination.

  3. Try to find out if there are any data transformations happening within the stored procedures. There are good data transformation tasks available within SSIS. You can evaluate and move some of those functionalities from stored procedures to SSIS. Since SSIS is a workflow kind of tool, I feel that it is easier to understand what is going inside the package than having to scroll through many lines of code to understand the functionality. But, that's just me. Preferences differ from person to person.

  4. Try to identify the dependencies within stored procedures and prepare a hierarchy. This will help in placing the tasks inside the package in appropriate order.

  5. If you have table named dbo.Table1 populating 5 different tables. I would recommend having them in a single package. Even if this data population being carried out by 5 different stored procedures, you don't need to go for 5 packages. Still, this again depends on your business scenario.

  6. SSIS project solution can have multiple packages within them and re-use data sources. You can use Execute SQL task available on the Control Flow task to run your existing queries but I would recommend that you also take a look at some of the nice transformation tasks available in SSIS. I have used them in my project and they function well for ETL operations.

These steps can be done by looking into one stored procedure at a time. You don't have to go through all of them at once.

Please have a look at some of the examples that I have given in other Stack Overflow questions. These should help you give an idea of what you can achieve with SSIS.

Copying data from one SQL table to another

Logging feature available in SSIS

Loading a flat file with 1 million rows into SQL tables using SSIS

Hope that helps.