In SSIS, I try to load data from a flat file.
The flat file have fixed width columns, but some column are not present in a row (a column can have a CRLF, which must be a new line) like this
a b c
the first rowok<CRLF>
iu jjrjdd<CRLF>
this is a newline<CRLF>
How I can have exactly the same number of line and exact data in my output?
I setup a flat file connection, of ragged right type.
In this sample, row 1 is correctly retrieve, but for row 2, it didn't recognize CRLF, and put in b column all the 3rd row.
Workaround
- In the flat file connection manager read the whole line as one Column (add only one column of type DT_STR and length 4000)
- Then in the dataflow task add a script component
- Add three output column (a,b,c) of type
DT_STR
- write a script that split each row and put values in columns (if one value is missed then null) (i used vb.net)
Tab delimited columns
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
If Not Row.Column0_IsNull AndAlso
Not String.IsNullOrEmpty(Row.Column0.Trim) Then
Dim str() As String = Row.Column0.Split(CChar(vbTab))
If str.Length >= 3 Then
Row.a = str(0)
Row.b = str(1)
Row.c = str(2)
ElseIf str.Length = 2 Then
Row.a = str(0)
Row.b = str(1)
Row.c_IsNull = True
ElseIf str.Length = 1 Then
Row.a = str(0)
Row.b_IsNull = True
Row.c_IsNull = True
Else
Row.a_IsNull = True
Row.b_IsNull = True
Row.c_IsNull = True
End If
Else
Row.a_IsNull = True
Row.b_IsNull = True
Row.c_IsNull = True
End If
End Sub
Fixed width columns
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
If Not Row.Column0_IsNull AndAlso
Not String.IsNullOrEmpty(Row.Column0.Trim) Then
'Assuming that
'Col a => 0-5
'Col b => 5-15
'Col c => 15-
Dim intlength As Integer = Row.Column0.Length
If intlength <= 5 Then
Row.a = Row.Column0
Row.b_IsNull = True
Row.c_IsNull = True
ElseIf intlength > 5 AndAlso intlength <= 15 Then
Row.a = Row.Column0.Substring(0, 5)
Row.b = Row.Column0.Substring(5, 10)
Row.c_IsNull = True
ElseIf intlength > 15 Then
Row.a = Row.Column0.Substring(0, 5)
Row.b = Row.Column0.Substring(5, 10)
Row.c = Row.Column0.Substring(15)
End If
Else
Row.a_IsNull = True
Row.b_IsNull = True
Row.c_IsNull = True
End If
End Sub
You can also achieve this using derived column transformation