Reverse engineering SSIS package using C#

Posted 2020-02-06 01:57

Question:

The requirement is to extract the source, the destination, and the column names of both from SSIS packages. I have thousands of packages, each with 60 to 75 columns on average, so opening every package and listing this information by hand takes a huge amount of time. It is also not a one-time task: my organization currently does this manually every two months.

I'm looking for a way to reverse engineer the packages: keep them all in a single folder, go through each one, extract the information, and put it in a spreadsheet.

I thought of opening each package as XML, reading the nodes of interest, and copying them into a spreadsheet, but that is a little cumbersome. Please suggest which libraries are available to get started.

Answer 1:

SQL Server provides assemblies for manipulating packages programmatically.

To reverse engineer (deserialize) a .dtsx package, loop over the packages and read them programmatically. The following link describes this in detail:

  • Reading DTS and SSIS packages programmatically

There is another way (harder, and not recommended): read the .dtsx file as text and parse its XML content. See my answer to the following question for an example:

  • Automate Version number Retrieval from .Dtsx files
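
As a rough illustration of the XML route, here is a minimal sketch using System.Xml.Linq. It assumes the SSIS 2012+ package format, where data flow components appear as un-namespaced component elements with inputColumn and outputColumn descendants; older package formats lay this out differently. The class name DtsxXmlSketch, the helper ListColumns, and the inline sample XML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class DtsxXmlSketch
{
    // Hypothetical helper: returns "component<TAB>direction<TAB>column" rows.
    public static List<string> ListColumns(XDocument doc)
    {
        var rows = new List<string>();
        // Assumption: pipeline elements carry no XML namespace (2012+ format).
        foreach (XElement comp in doc.Descendants("component"))
        {
            string compName = (string)comp.Attribute("name") ?? "(unnamed)";
            foreach (XElement col in comp.Descendants("outputColumn"))
                rows.Add(compName + "\toutput\t" + (string)col.Attribute("name"));
            // Assumption: input columns expose their name as "cachedName".
            foreach (XElement col in comp.Descendants("inputColumn"))
                rows.Add(compName + "\tinput\t" + (string)col.Attribute("cachedName"));
        }
        return rows;
    }

    public static void Main()
    {
        // Trimmed, made-up sample; a real file would be read with XDocument.Load(path).
        string sample = @"<pipeline><components>
  <component name=""OLE DB Source"">
    <outputs><output name=""Out""><outputColumns>
      <outputColumn name=""CustomerId"" />
    </outputColumns></output></outputs>
  </component>
</components></pipeline>";

        foreach (string row in ListColumns(XDocument.Parse(sample)))
            Console.WriteLine(row);
    }
}
```

This shows why the XML route is fragile: the element and attribute names are tied to one package format version, whereas the object model hides those differences.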

Hint:

Open the package in Visual Studio and go to the Package Explorer tab (next to the Control Flow and Data Flow tabs). Its tree view shows where to look for the component you need.


Update 1 - C# Script @ 2019-07-08

If you are looking for a script that lists all package objects, you can use something like this:

using System;
using System.Windows.Forms; // MessageBox
using DtsRuntime = Microsoft.SqlServer.Dts.Runtime;
using DtsWrapper = Microsoft.SqlServer.Dts.Pipeline.Wrapper;

public void Main()
{
    string pkgLocation = @"D:\Test\Package 1.dtsx";
    DtsRuntime.Application app = new DtsRuntime.Application();
    DtsRuntime.Package pkg = app.LoadPackage(pkgLocation, null);

    // List executables (tasks)
    foreach (DtsRuntime.Executable tsk in pkg.Executables)
    {
        // Containers (Sequence, For Loop, Foreach Loop) are not TaskHost
        // objects, so skip them instead of failing on the cast.
        DtsRuntime.TaskHost th = tsk as DtsRuntime.TaskHost;
        if (th == null)
            continue;

        MessageBox.Show(th.Name + "\t" + th.HostType.ToString());

        // Data Flow Task components: InnerObject is a MainPipe only for
        // Data Flow Tasks; for any other task the cast yields null.
        DtsWrapper.MainPipe m = th.InnerObject as DtsWrapper.MainPipe;
        if (m == null)
            continue;

        DtsWrapper.IDTSComponentMetaDataCollection100 mdc = m.ComponentMetaDataCollection;
        foreach (DtsWrapper.IDTSComponentMetaData100 md in mdc)
        {
            MessageBox.Show(th.Name + " - " + md.Name);
        }
    }

    // Event handlers
    foreach (DtsRuntime.DtsEventHandler eh in pkg.EventHandlers)
    {
        MessageBox.Show(eh.Name);
    }

    // Connection managers
    foreach (DtsRuntime.ConnectionManager cm in pkg.Connections)
    {
        MessageBox.Show(cm.Name + " - " + cm.HostType);
    }

    // Parameters
    foreach (DtsRuntime.Parameter param in pkg.Parameters)
    {
        MessageBox.Show(param.Name + " - " + param.DataType.ToString());
    }

    // Variables
    foreach (DtsRuntime.Variable v in pkg.Variables)
    {
        MessageBox.Show(v.Name + " - " + v.DataType.ToString());
    }

    // Precedence constraints
    foreach (DtsRuntime.PrecedenceConstraint pc in pkg.PrecedenceConstraints)
    {
        MessageBox.Show(pc.Name);
    }
}
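
The script above stops at component names, while the question asks for source and destination columns across a whole folder. The following is a minimal, untested sketch of how that might look. It assumes references to Microsoft.SqlServer.ManagedDTS and Microsoft.SqlServer.DTSPipelineWrap matching your SSIS version; the class name ColumnInventorySketch, the helpers Walk and Row, the command-line arguments, and the CSV layout are all hypothetical.

```csharp
using System;
using System.IO;
using DtsRuntime = Microsoft.SqlServer.Dts.Runtime;
using DtsWrapper = Microsoft.SqlServer.Dts.Pipeline.Wrapper;

public static class ColumnInventorySketch
{
    public static void Main(string[] args)
    {
        string folder = args[0];   // hypothetical: folder holding the .dtsx files
        string csvPath = args[1];  // hypothetical: CSV file to create
        var app = new DtsRuntime.Application();
        using (var csv = new StreamWriter(csvPath))
        {
            csv.WriteLine("Package,Task,Component,Direction,Column");
            foreach (string file in Directory.GetFiles(folder, "*.dtsx"))
            {
                DtsRuntime.Package pkg = app.LoadPackage(file, null);
                Walk(pkg.Executables, Path.GetFileName(file), csv);
            }
        }
    }

    // Recurses into containers so nested Data Flow Tasks are found too.
    static void Walk(DtsRuntime.Executables executables, string pkgName, TextWriter csv)
    {
        foreach (DtsRuntime.Executable exe in executables)
        {
            var seq = exe as DtsRuntime.IDTSSequence;
            if (seq != null) { Walk(seq.Executables, pkgName, csv); continue; }

            var th = exe as DtsRuntime.TaskHost;
            var pipe = th == null ? null : th.InnerObject as DtsWrapper.MainPipe;
            if (pipe == null) continue; // not a Data Flow Task

            foreach (DtsWrapper.IDTSComponentMetaData100 md in pipe.ComponentMetaDataCollection)
            {
                foreach (DtsWrapper.IDTSInput100 input in md.InputCollection)
                    foreach (DtsWrapper.IDTSInputColumn100 col in input.InputColumnCollection)
                        csv.WriteLine(Row(pkgName, th.Name, md.Name, "input", col.Name));

                foreach (DtsWrapper.IDTSOutput100 output in md.OutputCollection)
                    foreach (DtsWrapper.IDTSOutputColumn100 col in output.OutputColumnCollection)
                        csv.WriteLine(Row(pkgName, th.Name, md.Name, "output", col.Name));
            }
        }
    }

    // Quotes every field so commas and quotes in names do not break the CSV.
    static string Row(params string[] fields)
    {
        for (int i = 0; i < fields.Length; i++)
            fields[i] = "\"" + (fields[i] ?? "").Replace("\"", "\"\"") + "\"";
        return string.Join(",", fields);
    }
}
```

Sources typically show up with output columns only and destinations with input columns only, so the Direction field gives a rough way to tell them apart in the spreadsheet.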

References

  • Loading and Running a Local Package Programmatically

Update 2 - SSISPackageExplorer Project @ 2019-07-10

I started a small project on GitHub called SSISPackageExplorer, which lets the user browse the package objects in a TreeView. It is very basic right now, but I will try to improve it over time:

  • GitHub - SSISPackageExplorer



Answer 2:

Some of the APIs in the Microsoft.SqlServer.Dts.Pipeline namespace are not CLS-compliant. For example, the documentation for the ColumnInformation class says:

  • ColumnInformation Constructors

ColumnInformation Class
Namespace: Microsoft.SqlServer.Dts.Pipeline
Assembly: Microsoft.SqlServer.PipelineHost.dll
Important: This API is not CLS-compliant.
C++ declaration: public ref class ColumnInformation

Otherwise, try this:

Open your .dtsx package in Notepad++ and find the table name, then run the same search on the property name across all packages (Find in Files). I think that even searching for a column name in a .dtsx file opened in a text editor will turn up everything. This is manual, but it could be automated with regex and C#. I never did it with regex; I only did it in Notepad++, for one package, once.
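
To give an idea of what the regex and C# version of that search might look like, here is a small sketch using only the base class library. The class name DtsxGrepSketch, the helper Find, and the command-line arguments are hypothetical; it reports matching lines as plain text and does not interpret the package structure.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class DtsxGrepSketch
{
    // Hypothetical helper: yields "file:line: text" for every matching line.
    public static IEnumerable<string> Find(string folder, string pattern)
    {
        var regex = new Regex(pattern, RegexOptions.IgnoreCase);
        foreach (string file in Directory.EnumerateFiles(folder, "*.dtsx", SearchOption.AllDirectories))
        {
            int lineNo = 0;
            foreach (string line in File.ReadLines(file))
            {
                lineNo++;
                if (regex.IsMatch(line))
                    yield return Path.GetFileName(file) + ":" + lineNo + ": " + line.Trim();
            }
        }
    }

    public static void Main(string[] args)
    {
        // args[0] = folder with packages, args[1] = table or column name.
        // Regex.Escape makes the search term match literally.
        foreach (string hit in Find(args[0], Regex.Escape(args[1])))
            Console.WriteLine(hit);
    }
}
```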