How to Map Input and Output Columns dynamically in

Posted 2020-01-29 10:16

Question:

I have to upload data from .dbf files into SQL Server through SSIS. My output columns are fixed, but the input columns are not, because the files come from clients and each client may have laid out the data in their own style. There may be some unused columns too, or an input column name may differ from the output column name.

One idea I had was to store a mapping between each file's input columns and the output columns in a SQL database table, and to use only the columns listed in the rows for that file ID.

But I can't figure out how to do that. Can you suggest how to do it, or do you have any other idea?

Table Example.

+--------+---------------+--------------+--------+
| FileID | InputColumn   | OutputColumn | Active |
+--------+---------------+--------------+--------+
| 1      | CustCd        | CustCode     | 1      |
+--------+---------------+--------------+--------+
| 1      | CName         | CustName     | 1      |
+--------+---------------+--------------+--------+
| 1      | Address       | CustAdd      | 1      |
+--------+---------------+--------------+--------+
| 2      | Cust_Code     | CustCode     | 1      |
+--------+---------------+--------------+--------+
| 2      | Customer Name | CustName     | 1      |
+--------+---------------+--------------+--------+
| 2      | Location      | CustAdd      | 1      |
+--------+---------------+--------------+--------+

Answer 1:

If you create a similar table, there are two approaches that use it to map columns dynamically inside an SSIS package; otherwise you must build the whole package programmatically. In this answer I will try to give you some insights on how to do that.

(1) Building Source SQL command with aliases

Note: This approach will only work if all .dbf files have the same number of columns and only the column names differ.

In this approach you generate the SQL command used as the source from the FileID and the mapping table you created. You must know the FileID and the .dbf file path, each stored in a variable. For example:

Assuming that the Table name is inputoutputMapping

Add an Execute SQL Task with the following command:

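-- Builds "SELECT [input] as [output],... FROM <file>" for one FileID
DECLARE @strQuery as VARCHAR(4000)

SET @strQuery = 'SELECT '

-- Append one aliased column per mapping row of this file (first ? = FileID)
SELECT @strQuery = @strQuery + '[' + InputColumn + '] as [' + OutputColumn + '],'
FROM inputoutputMapping
WHERE FileID = ?

-- Drop the trailing comma and append the source name (second ? = .dbf file name)
SET @strQuery = SUBSTRING(@strQuery,1,LEN(@strQuery) - 1) + ' FROM ' + CAST(? as Varchar(500))

SELECT @strQuery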

In the Parameter Mapping tab, map the variable that contains the FileID to parameter 0, and the variable that contains the .dbf file name (used in place of a table name) to parameter 1.

Set the ResultSet type to Single row and store result 0 in a variable of type String, for example @[User::SourceQuery].

The ResultSet value will be as follows:

SELECT [CustCd] as [CustCode],[CName] as [CustName],[Address] as [CustAdd] FROM database1

In the OLE DB Source, set the Data access mode to SQL command from variable and use the @[User::SourceQuery] variable as the source.
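The same statement could also be assembled in .NET (for instance in a Script Task) rather than in T-SQL. The following is only a minimal sketch of that idea: the class and method names are hypothetical, and it assumes the mapping rows for one FileID have already been read into memory as (InputColumn, OutputColumn) pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SourceQueryBuilder
{
    // Hypothetical helper: builds "SELECT [in] as [out],... FROM table"
    // from the (InputColumn, OutputColumn) pairs of a single FileID.
    public static string Build(IEnumerable<KeyValuePair<string, string>> mappings, string tableName)
    {
        var columns = mappings
            .Select(m => Quote(m.Key) + " as " + Quote(m.Value))
            .ToList();

        if (columns.Count == 0)
            throw new ArgumentException("No mapping rows were supplied.", "mappings");

        // The table name is appended unquoted, as in the T-SQL version.
        return "SELECT " + string.Join(",", columns) + " FROM " + tableName;
    }

    // Wraps an identifier in brackets; names that contain a bracket are
    // rejected instead of escaped, to keep the sketch simple.
    private static string Quote(string identifier)
    {
        if (identifier.Contains("[") || identifier.Contains("]"))
            throw new ArgumentException("Unsupported identifier: " + identifier);
        return "[" + identifier + "]";
    }
}
```

Calling Build with the three pairs of FileID 1 and the table name database1 produces the same text as the ResultSet value shown above.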


(2) Using a Script Component as Source

In this approach you have to use a Script Component as Source inside the Data Flow Task:

First of all, you need to pass the .dbf file path and SQL Server connection to the script component via variables if you don't want to hard code them.

Inside the script editor, you must add an output column for each column found in the destination table and map them to the destination.

Inside the script, you must read the .dbf file into a DataTable:

  • C# Read from .DBF files into a datatable
  • Load a DBF into a DataTable

After loading the data into a DataTable, fill a second DataTable with the rows of the mapping table you created in SQL Server.
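As a rough sketch of these two loading steps: the class and method names below are hypothetical, the Jet 4.0 provider with the dBASE IV driver is an assumption (it is 32-bit only, and another provider may be needed on your machine), and filtering on Active = 1 assumes that flag marks the mapping rows in use.

```csharp
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;
using System.IO;

public static class MappingLoader
{
    // Reads a whole .dbf file into a DataTable through OLE DB.
    // Assumption: the Jet 4.0 provider and its dBASE IV driver are installed.
    public static DataTable LoadDbf(string dbfPath)
    {
        string folder = Path.GetDirectoryName(dbfPath);
        string table = Path.GetFileNameWithoutExtension(dbfPath);
        string connStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + folder
                       + ";Extended Properties=dBASE IV;";

        var result = new DataTable();
        using (var conn = new OleDbConnection(connStr))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + table + "]", conn))
        {
            adapter.Fill(result); // Fill opens and closes the connection itself
        }
        return result;
    }

    // Reads the mapping rows of one file from SQL Server.
    // Assumption: the table is named inputoutputMapping, as in approach (1).
    public static DataTable LoadMapping(string sqlConnStr, int fileId)
    {
        const string sql = "SELECT FileID, InputColumn, OutputColumn "
                         + "FROM inputoutputMapping WHERE FileID = @FileID AND Active = 1";

        var result = new DataTable();
        using (var conn = new SqlConnection(sqlConnStr))
        using (var cmd = new SqlCommand(sql, conn))
        using (var adapter = new SqlDataAdapter(cmd))
        {
            cmd.Parameters.AddWithValue("@FileID", fileId);
            adapter.Fill(result);
        }
        return result;
    }
}
```

The two results play the roles of the data table and the mapping table used in the following steps.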

After that, loop over the DataTable columns and change each .ColumnName to the relevant output column, for example:

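// Requires System.Data, System.Linq and a reference to System.Data.DataSetExtensions.
foreach (DataColumn col in myTable.Columns)
{
    // Output name mapped to this input column for the current file (FileID 1 here, assumed int).
    string outputName = MappingTable.AsEnumerable()
        .Where(x => x.Field<int>("FileID") == 1 && x.Field<string>("InputColumn") == col.ColumnName)
        .Select(y => y.Field<string>("OutputColumn"))
        .FirstOrDefault();
    // Columns without a mapping (unused columns) keep their original name.
    if (outputName != null) col.ColumnName = outputName;
}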

Afterwards, loop over each row in the DataTable and create a script output row.

In addition, note that while assigning output rows you must check whether each column exists. You can first add all column names to a list of strings and then use it for the check, for example:

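var columnNames = myTable.Columns.Cast<DataColumn>()
                             .Select(x => x.ColumnName)
                             .ToList();

foreach (DataRow row in myTable.Rows)
{
    // Each source row becomes one row in the script component's output buffer
    // (Output0Buffer is the name SSIS generates for the default "Output 0").
    Output0Buffer.AddRow();

    if (columnNames.Contains("CustCode") && !row.IsNull("CustCode"))
    {
        Output0Buffer.CustCode = row["CustCode"].ToString();
    }
    else
    {
        Output0Buffer.CustCode_IsNull = true;
    }

    // continue checking all other columns
}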
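Since the same existence check repeats for every output column, it could be factored into a small helper. This is only a sketch: the class and method names are hypothetical, and it assumes the output columns are strings.

```csharp
using System;
using System.Data;

public static class RowReader
{
    // Hypothetical helper: returns the cell value as text, or null when the
    // column is missing from the table or the cell holds DBNull.
    public static string GetStringOrNull(DataRow row, string columnName)
    {
        if (!row.Table.Columns.Contains(columnName) || row.IsNull(columnName))
            return null;

        return Convert.ToString(row[columnName]);
    }
}
```

Inside the row loop, the caller would assign the returned value to the output column, or set that column's _IsNull property when the helper returns null.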

If you need more details about using a Script Component as a source, check one of the following links:

  • SSIS Script Component as Source
  • Creating a Source with the Script Component
  • Script Component as Source – SSIS
  • SSIS – USING A SCRIPT COMPONENT AS A SOURCE

(3) Building the package dynamically

I don't think there are other methods you can use to achieve this goal, except building the package dynamically, in which case you should go with:

  • BIML
  • .Net DTS Wrapper Classes
  • EzApi library

(4) SchemaMapper: C# schema mapping class library

Recently I started a new project on GitHub, a class library developed in C#. You can use it to import tabular data from Excel, Word, PowerPoint, text, CSV, HTML, JSON and XML into a SQL Server table with a different schema definition, using a schema mapping approach. Check it out at:

  • SchemaMapper: C# Schema mapping class library

You can follow this Wiki page for a step-by-step guide:

  • Import data from multiple files into one SQL table step by step guide