SSIS Combine multiple rows into single row

2019-08-14 19:29发布

问题:

I have a flat file that has 6 columns: NoteID, Sequence, FileNumber, EntryDte, NoteType, and NoteText. The NoteText column has 200 characters and if a note is longer than 200 characters then a second row in the file contains the continuation of the note. It looks something like this:

|NoteID |  Sequence |   NoteText              |
 ---------------------------------------------
|1234   |    1      |   start of note text... |

|1234   |    2      |   continue of note....  |

|1234   |    3      |   more continuation of first note... |

|1235   |    1      |   start of new note.... |

How can I in SSIS combine the multiple rows of NoteText into one row so the row would like this:

| NoteID      |  Sequence |   NoteText              |
 ---------------------------------------------------
|1234         |    1      |   start of note text... continue of note... more continuation of first note... |

|1235         |    1      |   start of new note.... |

Greatly appreciate any help?

Update: Changing the SynchronousInputID to None exposed the Output0Buffer and I was able to use it. Below is what I have in place now.

Dim NoteID As String = "-1"
Dim NoteString As String = ""
Dim IsFirstRow As Boolean = True
Dim NoteBlob As Byte()
Dim enc As New System.Text.ASCIIEncoding()
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    If Row.NoteID.ToString() = NoteID Then
        NoteString += Row.NoteHTML
        IsFirstRow = True
    Else
        If IsFirstRow Then
            Output0Buffer.AddRow()
            IsFirstRow = False
        End If
        NoteID = Row.NoteID.ToString()
        NoteString = Row.NoteHTML.ToString()
    End If
    NoteBlob = enc.GetBytes(NoteString)
    Output0Buffer.SingleNoteHTML.AddBlobData(NoteBlob)
    Output0Buffer.ClaimID = Row.ClaimID
    Output0Buffer.UserID = Row.UserID
    Output0Buffer.NoteTypeLookupID = Row.NoteTypeLookupID
    Output0Buffer.DateCreatedUTC = Row.DateCreated
    Output0Buffer.ActivityDateUTC = Row.ActivityDate
    Output0Buffer.IsPublic = Row.IsPublic
End Sub

My problem now is that I had to convert the output column from Wstr(4000) to NText because some of the notes are so long. When it imports into my SQL table, it is just jibberish characters and not the actual notes.

回答1:

In SQL Server Management Studio (using SQL), you could easily combine your NoteText field using stuff function with XML Path to combine your row values to a single column like this:

select distinct 
        noteid,
        min(sequence) over (partition by n.noteid order by n.sequence) as sequence,
        stuff((select ' ' + NoteText
                      from notes n1
                      where n.noteid = n1.noteid
                      for xml path ('')
                      ),1,1,'') as NoteText
from notes n;

You will probably want to look into something along the line that does similar thing in SSIS. Check out this link on how to create a script component in SSIS to do something similar: SSIS Script Component - concat rows

SQL Fiddle Demo