How to load a flat file with header and detail dat

2019-01-07 22:07发布

I have to load a flat file that has different header and detail with variable number of columns. These have parent child relations. How to load the data into SQL Server? The file looks like this:

DEP*0116960*20110511***01*061000104*DA*1000022220940
AMT*3*13006.05
QTY*41*3
QTY*42*5
BAT*20110511**STAWRRY11051101
AMT*2*9332.33
QTY*42*2
BPR*I*4799*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001002
BPR*I*4533.33*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001001
BAT*20110511**STAWRRY11051102
AMT*2*1986.99
QTY*42*2
BPR*I*853.97*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736416
RMR*ST*00090183**853.97*12199.61
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110430
BPR*I*1133.02*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736417
RMR*ST*00090184**1133.02*16186.04
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110430
BAT*20110511**STAWRRY11051103
AMT*2*1686.73
QTY*42*1
BPR*I*1686.73*C*CHK*PBC*01*075911742*DA*0100461755
REF*CK*0000002795
RMR*ST*00094075**1686.73*42168.16
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110331
SE*39*000000088
GE*1*88
IEA*1*000000088 

标签: ssis edi
3条回答
爷的心禁止访问
2楼-- · 2019-01-07 22:27

You can use a script component source to build the rules on handling the file. That's well described here and here. You may be able to use the EDI Source from CozyRoc, but I don't believe they have indicated it has been tested with EDI 823.

查看更多
3楼-- · 2019-01-07 22:37

Here is one possible way of loading this file into SQL Server. Below shown example reads the contents of EDI 823 Lockbox file and loads into multiple tables along with the relationship. I am sure that there are other better ways of doing this. This is just one example of loading an EDI file into SQL Server.

The example reads the EDI file line by line and then splits them based on the character asterisk (*). The script component assigns the value to the variables in the Script Component to populate the tables. After the data is populated by the Data Flow Task, the Execute SQL Task will update the ParentId column in the tables dbo.AMT, dbo.DTM, dbo.QTY and dbo.REF using the stored procedure dbo.UpdateHierarchy. The table 'ISA' contains the data of all other segments. Screenshots show how the data is stored in the child tables.

SetId columns used in all the tables will group data of a file in order to avoid wrong parent id being referred from data pulled from another file. SetId will be unique for each file loaded into these tables.

Step-by-step process:

  1. Create 5 tables named dbo.AMT, dbo.DTM, dbo.ISA, dbo.QTY and dbo.REF and a stored procedure named dbo.UpdateHierarchy in the database using the scripts provided under SQL Scripts section. Tables AMT, DTM, QTY and REF will store the data of the segments named similarly and the table ISA will contain all other segment data.

  2. Create an OLE DB Connection named SQLServer to connect to the SQL Server instance and create a Flat File Connection named Source as shown in screenshots #""1"" - #4. Flat File connection will use the EDI file. Remove the Column delimiter because the file has varying number of elements. This example will split the elements using the Script Component.

  3. On the SSIS package, create 5 variables as shown in screenshot #5. Also, place a Data Flow task and an Execute SQL Task on the Control Flow tab of the package as shown in screenshot #5.

  4. Configure the Data Flow Task with Flat File Source, Script Component, Multicast, Conditional Split and 5 OLE DB Destinations as shown in screenshot #6.

  5. Configure the Flat File Source to read the EDI file using the Flat File connection name Source.

  6. Configure the Script Component as Transformation task as shown in screenshots #7 and #8. Create all the other variables using the names and data types using the data provided under Script Component Variables section. Replace the class ScriptMain present within the Script Component Transformation task with the code provided under Script Component Code.

  7. Configure the Conditional Split as shown in screenshot #9.

  8. Configure the OLE DB Destination AMT to insert data into the table dbo.AMT and map fields as shown in screenshot #10.

  9. Configure the OLE DB Destination QTY to insert data into the table dbo.QTY and map fields as shown in screenshot #11.

  10. Configure the OLE DB Destination REF to insert data into the table dbo.REF and map fields as shown in screenshot #12.

  11. Configure the OLE DB Destination DTM to insert data into the table dbo.DTM and map fields as shown in screenshot #13.

  12. Configure the OLE DB Destination Other to insert data into the table dbo.ISA and map fields as shown in screenshot #14.

  13. On the Data Flow tab, configure the Execute SQL Task as shown in screenshot #15.

  14. Screenshots #16 and #17 shows sample package execution. File data used for testing this example is provided under section EDI File Content.

  15. Screenshot #18 shows data in table dbo.AMT in comparison with data in table dbo.ISA.

  16. Screenshot #19 shows data in table dbo.QTY in comparison with data in table dbo.ISA.

  17. Screenshot #20 shows data in table dbo.REF in comparison with data in table dbo.ISA.

  18. Screenshot #21 shows data in table dbo.DTM in comparison with data in table dbo.ISA.

Hope that helps.

SQL Scripts:

CREATE TABLE [dbo].[AMT](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [ParentId] [int] NULL,
    [LineNumber] [int] NULL,
    [AmountQualifierCode] [varchar](3) NULL,
    [MonetaryAmount] [numeric](18, 2) NULL,
    [SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_AMT] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO

CREATE TABLE [dbo].[DTM](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [ParentId] [int] NULL,
    [LineNumber] [int] NULL,
    [DateTimeQualifier] [varchar](3) NULL,
    [Date] [varchar](10) NULL,
    [Time] [varchar](10) NULL,
    [TimeCode] [varchar](2) NULL,
    [PeriodFormatQualifier] [varchar](3) NULL,
    [DateTimePeriod] [varchar](35) NULL,
    [SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_DTM] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO

CREATE TABLE [dbo].[ISA](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [LineNumber] [int] NULL,
    [SegmentCode] [varchar](3) NULL,
    [ReferenceId] [varchar](30) NULL,
    [Date] [varchar](10) NULL,
    [Time] [varchar](10) NULL,
    [IdNumberQualifier] [varchar](2) NULL,
    [IdentificationNumber] [varchar](12) NULL,
    [AccountNumberQualifier] [varchar](3) NULL,
    [AccountNumber] [varchar](35) NULL,
    [TransactionHandlingCode] [varchar](2) NULL,
    [CreditDebitFlag] [varchar](1) NULL,
    [PaymentMethod] [varchar](3) NULL,
    [ReferenceIdQualifier] [varchar](3) NULL,
    [MonetaryAmount1] [numeric](18, 2) NULL,
    [MonetaryAmount2] [numeric](18, 2) NULL,
    [AuthorizeInfoQualifier] [varchar](2) NULL,
    [AuthorizeInfo] [varchar](10) NULL,
    [SecurityInfoQualifier] [varchar](2) NULL,
    [SecurityInfo] [varchar](10) NULL,
    [InterchangeSenderIdQualifier] [varchar](2) NULL,
    [InterchangeSenderId] [varchar](15) NULL,
    [InterchangeReceiverIdQualifier] [varchar](2) NULL,
    [InterchangeReceiverId] [varchar](15) NULL,
    [InterchangeStandardsId] [varchar](1) NULL,
    [InterchangeVersionId] [varchar](5) NULL,
    [InterchangeControlNumber] [varchar](9) NULL,
    [Acknowledge] [varchar](1) NULL,
    [TestIndicator] [varchar](1) NULL,
    [SubElementSeparator] [varchar](1) NULL,
    [FunctionalId] [varchar](2) NULL,
    [ApplicationSenderCode] [varchar](12) NULL,
    [ApplicationReceiverCode] [varchar](12) NULL,
    [Responsbility] [varchar](2) NULL,
    [Version] [varchar](12) NULL,
    [TransactionIdCode] [varchar](3) NULL,
    [TransactionSetControlNumber] [varchar](9) NULL,
    [Counter] [int] NULL,
    [SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_ISA] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO

CREATE TABLE [dbo].[QTY](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [ParentId] [int] NULL,
    [LineNumber] [int] NULL,
    [QuantityQualifier] [varchar](2) NULL,
    [Quantity] [numeric](15, 0) NULL,
    [SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_QTY] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO

CREATE TABLE [dbo].[REF](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [ParentId] [int] NULL,
    [LineNumber] [int] NULL,
    [ReferenceIdQualifier] [varchar](3) NULL,
    [ReferenceId] [varchar](30) NULL,
    [SetId] [uniqueidentifier] NULL,
CONSTRAINT [PK_REF] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO

CREATE PROCEDURE [dbo].[UpdateHierarchy]
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE      AMT
    SET         AMT.ParentId = COALESCE(PAR.ParentId, 0) 
    FROM        dbo.AMT AMT
    CROSS APPLY (   
                    SELECT  MAX(Id) AS ParentId
                    FROM    dbo.ISA PAR
                    WHERE   PAR.SetId       = AMT.SetId
                    AND     PAR.LineNumber  < AMT.LineNumber
                    AND     PAR.SegmentCode IN ('DEP', 'BAT')
                ) PAR;

    UPDATE      QTY
    SET         QTY.ParentId = COALESCE(PAR.ParentId, 0) 
    FROM        dbo.QTY QTY
    CROSS APPLY (   
                    SELECT  MAX(Id) AS ParentId
                    FROM    dbo.ISA PAR
                    WHERE   PAR.SetId       = QTY.SetId
                    AND     PAR.LineNumber  < QTY.LineNumber
                    AND     PAR.SegmentCode IN ('DEP', 'BAT')
                ) PAR;

    UPDATE      REF
    SET         REF.ParentId = COALESCE(PAR.ParentId, 0) 
    FROM        dbo.REF REF
    CROSS APPLY (   
                    SELECT  MAX(Id) AS ParentId
                    FROM    dbo.ISA PAR
                    WHERE   PAR.SetId       = REF.SetId
                    AND     PAR.LineNumber  < REF.LineNumber
                    AND     PAR.SegmentCode IN ('BPR', 'RMR')
                ) PAR;

    UPDATE      DTM
    SET         DTM.ParentId = COALESCE(PAR.ParentId, 0) 
    FROM        dbo.DTM DTM
    CROSS APPLY (   
                    SELECT  MAX(Id) AS ParentId
                    FROM    dbo.ISA PAR
                    WHERE   PAR.SetId       = DTM.SetId
                    AND     PAR.LineNumber  < DTM.LineNumber
                    AND     PAR.SegmentCode IN ('BPR', 'RMR')
                ) PAR;
END
GO

Script Component Variables:

S.no.   Variable name                   Data Type                       Length/Precision
1.      LineNumber                      four-byte signed integer [DT_I4]    
2.      ParentId                        four-byte signed integer [DT_I4]
3.      SegmentCode                     string [DT_STR]                     3
4.      ReferenceId                     string [DT_STR]                     30
5.      Date                            string [DT_STR]                     10
6.      Time                            string [DT_STR]                     10
7.      IdNumberQualifier               string [DT_STR]                     2
8.      IdentificationNumber            string [DT_STR]                     12
9.      AccountNumberQualifier          string [DT_STR]                     3
10.     AccountNumber                   string [DT_STR]                     35
11.     AmountQualifierCode             string [DT_STR]                     3
12.     TransactionHandlingCode         string [DT_STR]                     2
13.     CreditDebitFlag                 string [DT_STR]                     1
14.     PaymentMethod                   string [DT_STR]                     3
15.     DateTimeQualifier               string [DT_STR]                     3
16.     TimeCode                        string [DT_STR]                     2
17.     PeriodFormatQualifier           string [DT_STR]                     3
18.     DateTimePeriod                  string [DT_STR]                     35
19.     QuantityQualifier               string [DT_STR]                     2
20.     Quantity                        numeric [DT_NUMERIC]                15, 0
21.     ReferenceIdQualifier            string [DT_STR]                     3
22.     MonetaryAmount1                 numeric [DT_NUMERIC]                18,2
23.     MonetaryAmount2                 numeric [DT_NUMERIC]                18,2
24.     AuthorizeInfoQualifier          string [DT_STR]                     2   
25.     AuthorizeInfo                   string [DT_STR]                     10      
26.     SecurityInfoQualifier           string [DT_STR]                     2
27.     SecurityInfo                    string [DT_STR]                     2
28.     InterchangeSenderIdQualifier    string [DT_STR]                     2
29.     InterchangeSenderId             string [DT_STR]                     15
30.     InterchangeReceiverIdQualifier  string [DT_STR]                     2
31.     InterchangeReceiverId           string [DT_STR]                     15
32.     InterchangeStandardsId          string [DT_STR]                     1
33.     InterchangeVersionId            string [DT_STR]                     5
34.     InterchangeControlNumber        string [DT_STR]                     9
35.     Acknowledge                     string [DT_STR]                     1
36.     TestIndicator                   string [DT_STR]                     1
37.     SubElementSeparator             string [DT_STR]                     1
38.     FunctionalId                    string [DT_STR]                     2
39.     ApplicationSenderCode           string [DT_STR]                     12
40.     ApplicationReceiverCode         string [DT_STR]                     12
41.     Responsbility                   string [DT_STR]                     2
42.     Version                         string [DT_STR]                     12
43.     TransactionIdCode               string [DT_STR]                     3
44.     TransactionSetControlNumber     string [DT_STR]                     9
45.     Counter                         four-byte signed integer [DT_I4]
46.     SetId                           unique identifier [DT_GUID]

Script Component Code:

VB.NET code that can be used in SSIS 2005 and above.

Public Class ScriptMain
    Inherits UserComponent

    Public Enum Segments
        ISA
        GS
        ST
        DEP
        AMT
        QTY
        BAT
        BPR
        REF
        RMR
        DTM
        SE
        GE
        IEA
    End Enum

    Dim arrLine As String()
    Dim segmentCode As String
    Dim counter As Integer
    Dim lineNumber As Integer = 1
    Dim segmentSequence As Hashtable = New Hashtable()
    Dim setId As Guid = Guid.NewGuid()

    Public Overrides Sub PreExecute()
        MyBase.PreExecute()
    End Sub

    Public Overrides Sub PostExecute()
        MyBase.PostExecute()
    End Sub

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

        arrLine = Row.LineText.Split("*"c)
        segmentCode = SegmentValue(arrLine, 0)
        Row.SegmentCode = segmentCode
        Row.LineNumber = lineNumber
        Row.SetId = setId
        counter = arrLine.Length
        lineNumber += 1

        Select Case (segmentCode.ToUpper())

            Case Segments.ISA.ToString.ToUpper()
                Row.AuthorizeInfoQualifier = SegmentValue(arrLine, 1)
                Row.AuthorizeInfo = SegmentValue(arrLine, 2)
                Row.SecurityInfoQualifier = SegmentValue(arrLine, 3)
                Row.SecurityInfo = SegmentValue(arrLine, 4)
                Row.InterchangeSenderIdQualifier = SegmentValue(arrLine, 5)
                Row.InterchangeSenderId = SegmentValue(arrLine, 6)
                Row.InterchangeReceiverIdQualifier = SegmentValue(arrLine, 7)
                Row.InterchangeReceiverId = SegmentValue(arrLine, 8)
                Row.Date = SegmentValue(arrLine, 9)
                Row.Time = SegmentValue(arrLine, 10)
                Row.InterchangeStandardsId = SegmentValue(arrLine, 11)
                Row.InterchangeVersionId = SegmentValue(arrLine, 12)
                Row.InterchangeControlNumber = SegmentValue(arrLine, 13)
                Row.Acknowledge = SegmentValue(arrLine, 14)
                Row.TestIndicator = SegmentValue(arrLine, 15)
                Row.SubElementSeparator = SegmentValue(arrLine, 16)

            Case Segments.GS.ToString.ToUpper()
                Row.FunctionalId = SegmentValue(arrLine, 1)
                Row.ApplicationSenderCode = SegmentValue(arrLine, 2)
                Row.ApplicationReceiverCode = SegmentValue(arrLine, 3)
                Row.Date = SegmentValue(arrLine, 4)
                Row.Time = SegmentValue(arrLine, 5)
                Row.InterchangeControlNumber = SegmentValue(arrLine, 6)
                Row.Responsbility = SegmentValue(arrLine, 7)
                Row.Version = SegmentValue(arrLine, 8)

            Case Segments.ST.ToString.ToUpper()
                Row.TransactionIdCode = SegmentValue(arrLine, 1)
                Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)

            Case Segments.DEP.ToString.ToUpper()
                Row.ReferenceId = SegmentValue(arrLine, 1)
                Row.Date = SegmentValue(arrLine, 2)
                Row.IdNumberQualifier = SegmentValue(arrLine, 5)
                Row.IdentificationNumber = SegmentValue(arrLine, 6)
                Row.AccountNumberQualifier = SegmentValue(arrLine, 7)
                Row.AccountNumber = SegmentValue(arrLine, 8)

            Case Segments.AMT.ToString.ToUpper()
                Row.AmountQualifierCode = SegmentValue(arrLine, 1)
                Row.MonetaryAmount1 = SegmentValue(arrLine, 2)

            Case Segments.QTY.ToString.ToUpper()
                Row.QuantityQualifier = SegmentValue(arrLine, 1)
                Row.Quantity = Convert.ToDecimal(SegmentValue(arrLine, 2))

            Case Segments.BAT.ToString.ToUpper()
                Row.Date = SegmentValue(arrLine, 1)
                Row.ReferenceId = SegmentValue(arrLine, 3)

            Case Segments.BPR.ToString.ToUpper()
                Row.TransactionHandlingCode = SegmentValue(arrLine, 1)
                Row.MonetaryAmount1 = SegmentValue(arrLine, 2)
                Row.CreditDebitFlag = SegmentValue(arrLine, 3)
                Row.PaymentMethod = SegmentValue(arrLine, 4)
                Row.IdNumberQualifier = SegmentValue(arrLine, 6)
                Row.IdentificationNumber = SegmentValue(arrLine, 7)
                Row.AccountNumber = SegmentValue(arrLine, 9)

            Case Segments.REF.ToString.ToUpper()
                Row.ReferenceIdQualifier = SegmentValue(arrLine, 1)
                Row.ReferenceId = SegmentValue(arrLine, 2)

            Case Segments.RMR.ToString.ToUpper()
                Row.ReferenceIdQualifier = SegmentValue(arrLine, 1)
                Row.ReferenceId = SegmentValue(arrLine, 2)
                Row.MonetaryAmount1 = SegmentValue(arrLine, 4)
                Row.MonetaryAmount2 = SegmentValue(arrLine, 5)

            Case Segments.DTM.ToString.ToUpper()
                Row.DateTimeQualifier = SegmentValue(arrLine, 1)
                Row.Date = SegmentValue(arrLine, 2)
                Row.Time = SegmentValue(arrLine, 3)
                Row.TimeCode = SegmentValue(arrLine, 4)
                Row.PeriodFormatQualifier = SegmentValue(arrLine, 5)
                Row.DateTimePeriod = SegmentValue(arrLine, 6)

            Case Segments.SE.ToString.ToUpper()
                Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
                Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)

            Case Segments.GE.ToString.ToUpper()
                Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
                Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)

            Case Segments.IEA.ToString.ToUpper()
                Row.Counter = Convert.ToInt32(SegmentValue(arrLine, 1))
                Row.TransactionSetControlNumber = SegmentValue(arrLine, 2)
        End Select

    End Sub

    Public Function SegmentValue(ByRef LineArray As String(), ByVal Counter As Integer) As String
        If LineArray.Length > Counter Then
            Return LineArray(Counter).ToString().Trim()
        End If
        Return String.Empty
    End Function

End Class

EDI File Content:

ISA*00* *00* *12*2562379521 *08*YOUR ID *19981223*1056*U*00401*000000017*0*T*>
GS*PD*2562379521*YOUR ID*19981223*1056*000000017*X*004010VICS
ST*852*000000001
DEP*0116960*20110511***01*061000104*DA*1000022220940
AMT*3*13006.05
QTY*41*3
QTY*42*5
BAT*20110511**STAWRRY11051101
AMT*2*9332.33
QTY*42*2
BPR*I*4799*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001002
BPR*I*4533.33*C*CHK*PBC*01*011500010*DA*394001464351
REF*CK*0000001001
BAT*20110511**STAWRRY11051102
AMT*2*1986.99
QTY*42*2
BPR*I*853.97*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736416
RMR*ST*00090183**853.97*12199.61
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110430
BPR*I*1133.02*C*CHK*PBC*01*111000614*DA*708340062
REF*CK*0010736417
RMR*ST*00090184**1133.02*16186.04
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110430
BAT*20110511**STAWRRY11051103
AMT*2*1686.73
QTY*42*1
BPR*I*1686.73*C*CHK*PBC*01*075911742*DA*0100461755
REF*CK*0000002795
RMR*ST*00094075**1686.73*42168.16
REF*BE*01*0123181825*0 
REF*SL*NA*191219012318
DTM*810*20110331
SE*39*000000088
GE*1*88
IEA*1*000000088 

Screenshot #1:

1

Screenshot #2:

2

Screenshot #3:

3

Screenshot #4:

4

Screenshot #5:

5

Screenshot #6:

6

Screenshot #7:

7

Screenshot #8:

8

Screenshot #9:

9

Screenshot #10:

10

Screenshot #11:

11

Screenshot #12:

12

Screenshot #13:

13

Screenshot #14:

14

Screenshot #15:

15

Screenshot #16:

16

Screenshot #17:

17

Screenshot #18:

18

Screenshot #19:

19

Screenshot #20:

20

Screenshot #21:

21

查看更多
时光不老,我们不散
4楼-- · 2019-01-07 22:37

This is why translation engines are available and so widely used. I love the detail of the solution above, but also feel like there's some reinvention of the wheel.

Take a look at Jitterbit. It is open source, and pretty user friendly for data transformations.

You can model your source, model your target and then do a transformation. Based on your models, you'll have much more control over the transformation that what SSIS will give you. Sure there's a learning curve, but you can automate the communication / transformation. There also would be ZERO code to write.

查看更多
登录 后发表回答