Remove Text from a String in SSIS derived column

2019-03-04 05:05发布

问题:

I am trying to remove some unwanted text values from inbound data. The received data looks like:

string;#0 Day(s), 08  Hours,  15  Minutes

I applied this in a derived column transformation :

SUBSTRING([Modified By], FINDSTRING([Modified By],"#",1) + 1, FINDSTRING([Modified By],"#", 1))

to get only whats after the #

The result is: 0 Day(s) but you see I'm missing the rest, I'm trying to get everything after # => 0 Day(s), 08 Hours, 15 Minutes

回答1:

I Think this expression will work properly .

Write this expression in your derived column expression .

SUBSTRING([Modified By],FINDSTRING([Modified By],"#",1) + 1,len([Modified By]))



回答2:

Given that you have text in the form of

string;#0 Day(s), 08  Hours,  15  Minutes

The desire is to remove everything from the # to the beginning of the string, end points inclusive, I would look at using the RIGHT operator. I want the rightmost characters starting at the position of the # and going to the end.

I find it helpful, especially for debugging, to break these sorts of things up into many steps.

DER Calculate # position

I create a derived column to calculate the position of the #. This is a one based ordinal system so I should get the value 8 based on the above. I add a new column in my Derived Column Component called SharpPosition data type of Int32

FINDSTRING([Modified By],"#",1)

Given a sampling of one, we don't know whether we need to account for NULLs in your source data or whether the # will always exist. Taking an approach like this is generally helpful as it allows you to identify the specific part of the expression that breaks.

DER Resulting string length

Here, I compute the resulting string length so the final operation is easier. Again, simple safety checks to ensure I have the correct values. New column, ResultingLength and the expression is

LEN([Modified By]) - [SharpPosition]

DER Right Modified By

Finally, we have the pieces of the puzzle we need to determine the positions within our original string that we need to modify.

I create a string column called ModInfo, length of 50 and used the following expression

RIGHT([Modified By], [ResultingLength])

You could roll all that up into a single expression but as I keep mentioning, the maintenance on this approach I find easier.

Biml

What's an answer without some code to reproduce the results, eh? If you don't already have the free extension, BIDS Helper installed, go install it.

Now that you have BIDS Helper installed, right click on your project and select Add new Biml file.

In the resulting BimlScript.biml file, paste the following.

Adjust line 3 to point to an instance of SQL Server. Depending on your version, you may also need to update the provider from SQLNCLI11.1 to match your instance.

Once done, right click on the biml file and select Generate SSIS Package. Out pops a shiny new SSIS package that has all the correct components and expressions.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <Connection Name="tempdb" ConnectionString="Data Source=localhost\dev2012;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;" />
    </Connections>
    <Packages>
        <Package Name="so_33020866">
            <Tasks>
                <Dataflow Name="DFT Demo">
                    <Transformations>
                        <OleDbSource ConnectionName="tempdb" Name="OLE_SRC Demo">
                            <DirectInput>SELECT 'string;#0 Day(s), 08  Hours,  15  Minutes' AS [Modified By] UNION ALL SELECT NULL UNION ALL SELECT ''</DirectInput>
                        </OleDbSource>
                        <DerivedColumns Name="DER Calculate # position">
                            <Columns>
                                <Column DataType="Int32" Name="SharpPosition">FINDSTRING([Modified By],"#",1)</Column>
                            </Columns>
                        </DerivedColumns>
                        <DerivedColumns Name="DER Resulting String Length">
                            <Columns>
                                <Column DataType="Int32" Name="ResultingLength">LEN([Modified By]) - [SharpPosition]</Column>
                            </Columns>
                        </DerivedColumns>
                        <DerivedColumns Name="DER Right Modified By">
                            <Columns>
                                <Column DataType="String" Name="ModInfo" Length="50">RIGHT([Modified By], [ResultingLength])</Column>
                            </Columns>
                        </DerivedColumns>
                        <DerivedColumns Name="DER PlaceHolder"></DerivedColumns>
                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Results

You can see I simulated with your source string, a string that has no # and a NULL. None of them fail.



回答3:

I think, this is what you want:

SUBSTRING([Modified By],FINDSTRING([Modified By],"#",1) + 1,LEN([Modified By]) - (FINDSTRING([Modified By],"#",1) + 1) )


标签: ssis