Split sql string into words

2019-02-20 22:43发布

问题:

I want to split string into words like below, the output of all the string should be same:

INPUT:

 1. This is a string
 2. This    is   a   string
 3. This  is a          string
 4. This  is           a string

OUTPUT:

This is a

Means, that I want first three words from the sentence, irrespective of the spaces.

回答1:

Try this:

declare @s1 varchar(3000) ;
declare @xml xml,@str varchar(100),@delimiter varchar(10), @out varchar(max);;
select @delimiter =' '
select @s1 =  'This is a string';
select @s1 = 'This    is   a   string ';
select @s1 = 'This  is a          string ';
select @s1 = 'This  is           a string';

select @xml = cast(('<X>'+replace(@s1,@delimiter ,'</X><X>')+'</X>') as xml)

select top 3 @out = 
    COALESCE(@out + ' ', '') +  C.value('.', 'varchar(100)') 
from @xml.nodes('X') as X(C)
where LEN(C.value('.', 'varchar(10)')) > 0

select @out


回答2:

Now your case contains two steps:

1. Removing additional spaces and converting them to single space. You can use REPLACE() method to this.

SELECT REPLACE(REPLACE(REPLACE("This    is   a   string",' ','<>'),'><',''),'<>',' ')

Process: The innermost REPLACE changes all blanks to a less-than greater-than pair. If there are three spaces between This and is, the innermost REPLACE returns This<><><>is. The middle REPLACE changes all greater-than less-than pairs to the empty string, which removes them. The<><><>is becomes The<>is.

The outer REPLACE changes all less-than greater-than pairs to a single blank. The<>is becomes The is.

Now all the sentences are normalized with one space.

2. Split the words and get the three words.

There are lot of Stackoverflow question which discusses them. I liked the Common Table Expression to split the string : How do I split a string so I can access item x?

Let me know if you require any help in the splitting the words.



回答3:

Create a Tally Table:

SELECT TOP 11000 
    IDENTITY( INT,1,1 ) AS Num
INTO    dbo.Tally
FROM    Master.dbo.SysColumns sc1,
    Master.dbo.SysColumns sc2
GO

Create a Table Valued Function:

CREATE FUNCTION dbo.[fnSetSplit]
   (
     @String VARCHAR(8000),
     @Delimiter CHAR(1)
   )
RETURNS TABLE
AS
RETURN
   ( SELECT Num,
            SUBSTRING(@String, CASE Num
                                 WHEN 1 THEN 1
                                 ELSE Num + 1
                               END,
                      CASE CHARINDEX(@Delimiter, @String,
                                     Num + 1)
                        WHEN 0
                        THEN LEN(@String) - Num + 1
                        ELSE CHARINDEX(@Delimiter,
                                       @String, Num + 1)
                             - Num
                             - CASE WHEN Num > 1 THEN 1
                                    ELSE 0
                               END
                      END) AS String
     FROM   dbo.Tally
     WHERE  Num <= LEN(@String)
            AND ( SUBSTRING(@String, Num, 1) = @Delimiter
                  OR Num = 1 )
   )

Query function:

SELECT TOP 3
        fss.String
FROM    dbo.fnSetSplit('This  is           a string', ' ') fss
WHERE   NOT ( fss.String = '' )

If you need to reconcatenate, look at string concatenation using FOR XML (PATH)



回答4:

SQL Server 2016 (compatibility level 130) allows to use STRING_SPLIT function:

DECLARE @delimiter varchar(10) = ' '

SELECT STRING_AGG(value, @delimiter) 
    FROM  (SELECT TOP 3 value FROM STRING_SPLIT('This is a string', @delimiter) WHERE LEN(value)>0) inq

SELECT STRING_AGG(value, @delimiter) 
    FROM (SELECT TOP 3 value FROM STRING_SPLIT('This  is a          string ', @delimiter) WHERE LEN(value)>0) inq

SELECT STRING_AGG(value, @delimiter) 
    FROM (SELECT TOP 3 value FROM STRING_SPLIT('This  is           a string', @delimiter) WHERE LEN(value)>0) inq

SELECT STRING_AGG(value, @delimiter) 
    FROM (SELECT TOP 3 value FROM STRING_SPLIT('This      is a    string', @delimiter) WHERE LEN(value)>0) inq

Result:

This is a
This is a
This is a
This is a