SSIS import a Flat File to SQL with the first row

2020-04-21 00:29发布

问题:

I receive Text File that I have to Import to a SQL Table, I have to come with a SSIS because I will received the Flat File every Day , with the First Row as the Customer_ID, then come the invoice details and then the Total of the invoice.

Example :

30303

0000109291700080190432737000005Name of the product

0000210291700080190432737000010Name of the product

0000309291700080190432737000000Name of the product

003 000145

So let me Explain:

First 30303 is the Customer #

Other Rows Invoice Details

00001-> ROWID 092917-> DATE 000801904327->PROD 370->Trans 00010 -> AMOUNT
Name of the product

Last Row

003==>Total rows 000145==>Total of Invoice

Any Clue ?

回答1:

I would use a Script Component as a source in a Data Flow Task. You can then use C# or VB.net to read the file, e.g., by using System.IO.StreamReader, in any way you wish. You can read a line at a time, store values in variables to write to every row (e.g., the customer number), etc. It's extremely flexible for complex files.

Here is an example script (C#) based on your data:

public override void CreateNewOutputRows()
{
    System.IO.StreamReader reader = null;

    try
    {
        bool line1Read = false;
        int customerNumber = 0;

        reader = new System.IO.StreamReader(Variables.FilePath); // this refers to a package variable that contains the file path

        while (!reader.EndOfStream)
        {
            string line = reader.ReadLine();

            if (!line1Read)
            {
                customerNumber = Convert.ToInt32(line);
                line1Read = true;
            }
            else if (!reader.EndOfStream)
            {
                Output0Buffer.AddRow();

                Output0Buffer.CustomerNumber = customerNumber;
                Output0Buffer.RowID = Convert.ToInt32(line.Substring(0, 5));
                Output0Buffer.Date = DateTime.ParseExact(line.Substring(5, 6), "MMddyy", System.Globalization.CultureInfo.CurrentCulture);
                Output0Buffer.Prod = line.Substring(11, 12);
                Output0Buffer.Trans = Convert.ToInt32(line.Substring(23, 3));
                Output0Buffer.Amount = Convert.ToInt32(line.Substring(26, 5));
                Output0Buffer.ProductName = line.Substring(31);
            }
        }
    }
    catch
    {
        if (reader != null)
        {
            reader.Close();
            reader.Dispose();
        }

        throw;
    }
}

The columns in 'Output 0' of the Script Component are configured as follows:

Name             DataType                           Length
====             ========                           ======
CustomerNumber   four-byte signed integer [DT_I4]
RowID            four-byte signed integer [DT_I4]
Date             database date [DT_DBDATE]
Prod             string [DT_STR]                        12
Trans            four-byte signed integer [DT_I4]
Amount           four-byte signed integer [DT_I4]
ProductName      string [DT_STR]                       255

To implement this:

  • Create a string variable called 'FilePath' with your file path in it for the script to reference.
  • Create a Data Flow Task.
  • Add a Script Component to the Data Flow Task - you'll be asked what type it should be, select 'Source'.
  • Right-click the Script Component, click 'Edit'.
  • On the 'Script' pane, add the 'FilePath' variable to the 'ReadOnlyVariables' section.
  • On the 'Inputs and Outputs' pane, expand 'Output 0' and add columns to the 'Output Columns' section as per the above table.
  • On the 'Script' pane, click 'Edit Script', and then paste my code over the public override void CreateNewOutputRows() method (replacing it).
  • Your Script Component source is now configured, and you'll be able to use it like any other data source component. To write this data to a SQL Server table, add an OLEDB Destination to the Data Flow Task, and link the Script Component to that, configuring the columns appropriately.