Read char,double,int pattern from string in sql

2019-04-09 14:15发布

问题:

You have a string like

set @string = 'ddd,1.5,1,eee,2.3,0,fff,1.2,ggg,6.123,1'

I would like to know if there is a way to extract the string values and place them in the first row,the double values and place them in the second row and the int values and place them in the third row.The string logic is like this

"string,double,int,string,double,int..."

but there are cases when there is

"string,double,int,string,double,string,double,int"

and I would like in the third row where the int should be to be 1 by default so the table would look something like this.

First Row   Second Row  Third Row
ddd           1.5         1
eee           2.3         0
fff           1.2         1
ggg           6.123       1

I have a code where u can extract all the values from the string and place them in a row but that is not enough.

declare @string as nvarchar(MAX)

set @string = 'aaa,bbb,ccc,ddd,1.5,1,eee,2.3,1,fff,1.2,ggg,6.123,1'
;with tmp(DataItem, Data) 
as (
select LEFT(@string, CHARINDEX(',',@string+',')-1),
    STUFF(@string, 1, CHARINDEX(',',@string+','), '')
union all
select LEFT(Data, CHARINDEX(',',Data+',')-1),
    STUFF(Data, 1, CHARINDEX(',',Data+','), '')
from tmp
where Data > '')
select DataItem from tmp
option (maxrecursion 0)

回答1:

Final version (I hope):

Since sql server 2008 doesn't support order by in the over clause of aggregate functions, I've added another cte to add the row index instead of the sum I've used in the previous version:

;WITH cteAllRows as
(
     SELECT Item, 
            ItemIndex, 
            CASE WHEN ISNUMERIC(Item) = 0 THEN 'String'
            WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) > 0 THEN 'Double'
            WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) = 0 THEN 'Integer'
            END As DataType
     FROM dbo.SplitStrings_Numbers(@string, ',')
), cteAll as
(
    SELECT  Item, 
            DataType, 
            ItemIndex, 
            (
                SELECT COUNT(*)
                FROM cteAllRows tInner
                WHERE tInner.DataType = 'String'
                AND tInner.ItemIndex <= tOuter.ItemIndex
            ) As RowIndex
    FROM cteAllRows tOuter
)

All the rest is the same as the previous version.

Update

The first thing I've done is to change the string split function to a function based on a tally table, so that I can easily add the row number to it. So, if you don't already have a tally table, create one. If you are asking your self what is a tally table and why do you need it, read this article by Jeff Moden:

SELECT TOP 10000 IDENTITY(int,1,1) AS Number
    INTO Tally
    FROM sys.objects s1       
    CROSS JOIN sys.objects s2 
ALTER TABLE Tally ADD CONSTRAINT PK_NumbersTest PRIMARY KEY CLUSTERED (Number)
GO

Then, create string split function based on the tally table (taken from Aaron's article but added the row index column):

CREATE FUNCTION dbo.SplitStrings_Numbers
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN
   (
       SELECT   Item = SUBSTRING(@List, Number, CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number),
                ROW_NUMBER() OVER (ORDER BY Number) As ItemIndex
       FROM dbo.Tally
       WHERE Number <= CONVERT(INT, LEN(@List))
         AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
   );
GO

Now, The trick I've used is very much like the previous one, only now I've added to the first cte a new column I've called RowIndex, that's basically a running total of the count of strings, based on the row index of all rows:

 SELECT Item, 
        CASE WHEN ISNUMERIC(Item) = 0 THEN 'String'
        WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) > 0 THEN 'Double'
        WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) = 0 THEN 'Integer'
        END As DataType,
        SUM(CASE WHEN ISNUMERIC(Item) = 0 THEN 1 END) OVER(ORDER BY ItemIndex) As RowIndex
 FROM dbo.SplitStrings_Numbers(@string, ',')

It gave me this result:

Item       DataType RowIndex
---------- -------- -----------
ddd        String   1
1.5        Double   1
1          Integer  1
eee        String   2
2.3        Double   2
0          Integer  2
fff        String   3
1.2        Double   3
ggg        String   4
6.123      Double   4
1          Integer  4

As you can see, I now have a number for each row, so from now on it's simple:

;WITH cteAll as
(
     SELECT Item, 
            CASE WHEN ISNUMERIC(Item) = 0 THEN 'String'
            WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) > 0 THEN 'Double'
            WHEN ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) = 0 THEN 'Integer'
            END As DataType,
            SUM(CASE WHEN ISNUMERIC(Item) = 0 THEN 1 END) OVER(ORDER BY ItemIndex) As RowIndex
     FROM dbo.SplitStrings_Numbers(@string, ',')
), cteString AS
(
    SELECT Item, RowIndex
    FROM cteAll
    WHERE DataType = 'String'
), cteDouble AS
(
    SELECT Item, RowIndex
    FROM cteAll
    WHERE DataType = 'Double'
), cteInteger AS
(
    SELECT Item, RowIndex
    FROM cteAll
    WHERE DataType = 'Integer'
)

SELECT  T1.Item As [String],
        T2.Item As [Double],
        T3.Item As [Integer]
FROM dbo.Tally 
LEFT JOIN cteString T1 ON T1.RowIndex = Number 
LEFT JOIN cteDouble T2 ON t2.RowIndex = Number 
LEFT JOIN cteInteger T3 ON t3.RowIndex = Number
WHERE COALESCE(T1.Item, T2.Item, T3.Item) IS NOT NULL

That gave me this result:

String     Double     Integer
---------- ---------- ----------
ddd        1.5        1
eee        2.3        0
fff        1.2        NULL
ggg        6.123      1

As you can see, the items are now sorted by the original order in the string. Thanks for the challenge, It's been a while since I've had a decent one :-)

First attempt

Well, first you have to split that string into a table. To do that you should use a user defined function. You can pick the one best suited for you from Aaron Bertrand's Split strings the right way – or the next best way article.

For this demonstration I've chosen to use the SplitStrings_XML.

So first, create the function:

CREATE FUNCTION dbo.SplitStrings_XML
(
   @List       NVARCHAR(MAX),
   @Delimiter  NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
   RETURN 
   (  
      SELECT Item = y.i.value('(./text())[1]', 'nvarchar(4000)')
      FROM 
      ( 
        SELECT x = CONVERT(XML, '<i>' 
          + REPLACE(@List, @Delimiter, '</i><i>') 
          + '</i>').query('.')
      ) AS a CROSS APPLY x.nodes('i') AS y(i)
   );
GO

Now, declare and initialize the variable:

declare @string nvarchar(max) = 'ddd,1.5,1,eee,2.3,0,fff,1.2,ggg,6.123,1'

Then, Create 4 common table expressions - one for all items, one for strings, one for doubles and one for integers. Note the use of the row_number() function - it will be used later to join all the results together:

;WITH AllItems as
(
    SELECT Item, ROW_NUMBER() OVER(ORDER BY (select null)) as rn
    FROM dbo.SplitStrings_XML(@string, ',')
)

, Strings as
(
    SELECT Item as StringItem, ROW_NUMBER() OVER(ORDER BY (select null))  as rn
    FROM dbo.SplitStrings_XML(@string, ',')
    WHERE ISNUMERIC(Item) = 0
), Doubles as 
(
    SELECT Item as DoubleItem, ROW_NUMBER() OVER(ORDER BY (select null))  as rn
    FROM dbo.SplitStrings_XML(@string, ',')
    WHERE ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) > 0
), Integers as
(
    SELECT Item as IntegerItem, ROW_NUMBER() OVER(ORDER BY (select null))  as rn
    FROM dbo.SplitStrings_XML(@string, ',')
    WHERE ISNUMERIC(Item) = 1 AND CHARINDEX('.', Item) = 0 
)

Then, select from joining all these common table expressions. Note the use of the COALESCE built in function to only return rows where at least one value is present:

SELECT StringItem,  DoubleItem, IntegerItem
FROM AllItems A
LEFT JOIN Strings S ON A.rn = S.rn
LEFT JOIN Doubles D ON A.rn = D.rn
LEFT JOIN Integers I ON A.rn = I.rn
WHERE COALESCE(StringItem,  DoubleItem, IntegerItem) IS NOT NULL

Results:

StringItem  DoubleItem  IntegerItem
----------  ----------  -----------
ddd         1.5         1
eee         2.3         0
fff         1.2         1
ggg         6.123       NULL


回答2:

this is your solution, just a bit more completed:

declare @string as nvarchar(MAX)
declare @id int=0
set @string = 'aaa,bbb,ccc,ddd,1.5,1,eee,2.3,1,fff,1.2,ggg,6.123,1'
;with tmp( id,[type],DataItem, Data) 
as (
select 
id=row_number() over(order by @string), 'string',
LEFT(@string, CHARINDEX(',',@string+',')-1),
    STUFF(@string, 1, CHARINDEX(',',@string+','), '')
union all
select 
        case when LEFT(Data, CHARINDEX(',',Data+',')-1) like '%[a-Z]%'  then id+1
             when LEFT(Data, CHARINDEX(',',Data+',')-1) like '%[0-9]%'  then id 
        end,
        case when LEFT(Data, CHARINDEX(',',Data+',')-1) like '%[a-Z]%'  then 'string'
             when LEFT(Data, CHARINDEX(',',Data+',')-1) like '%[0-9]%' and LEFT(Data, CHARINDEX(',',Data+',')-1) not like '%.%' then 'int' 
             when LEFT(Data, CHARINDEX(',',Data+',')-1) like '%[0-9]%' and LEFT(Data, CHARINDEX(',',Data+',')-1) like '%.%' then 'double' 
        end,
        LEFT(Data, CHARINDEX(',',Data+',')-1) as dataItem,
        STUFF(Data, 1, CHARINDEX(',',Data+','), '')
from tmp
where Data > ''
)
select  
        id,
        min(case [type]  when 'string' then DataItem end) as 'String',
        min(case [type]  when 'int' then DataItem end) as 'Int',
        min(case [type]  when 'double' then DataItem end) as 'Double'
from tmp 
group by id
option (maxrecursion 0)