Replace multiple instance of a character with a si

2019-09-13 00:42发布

问题:

Probably it's a duplicate, but I couldn't find a solution.

Requirement:

I have the below strings:

Heelloo
Heeelloo
Heeeelloo
Heeeeeelloo
Heeeeeeeelloo
.
.
.
Heeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeelloo

Expected output: Hello

What is the best way to achieve this in SQL?

Version I am using :

Microsoft SQL Server 2012 - 10.0.7365.0 (X64) Jul 28 2015 00:39:54 Copyright (c) 
Microsoft Corporation Parallel Data Warehouse (64-bit) on Windows NT 6.2 <X64> 
(Build 9200: )

回答1:

There is a nice trick for removing such duplicates for a single letter:

select replace(replace(replace(col, 'e', '<>'
                              ), '><', ''
                      ), '<>', 'e'
              )

This does require two characters ("<" and ">") that are not in the string (or more specifically, not in the string next to each other). The particular characters are not important.

How does this work?

Heeello
H<><><>llo
H<>llo
Hello


回答2:

Based on T-SQL String Manipulation Tips and Techniques, Part 1 especially part Replacing Multiple Contiguous Spaces With a Single Space and idea of Peter Larsson, a SQL Server MVP:

Then, the solution involves three steps (assuming the token is ~):

  1. Replace in @str each occurrence of ' ' (space) with '~ ' (token plus space).
  2. Replace in the result of the previous step each occurrence of ' ~' (space plus token) with '' (an empty string).
  3. Replace in the result of the previous step each occurrence of '~ ' (token plus space) with ' ' (space).
CREATE TABLE #tab(val NVARCHAR(100));

INSERT INTO #tab
SELECT 'Hello'
UNION ALL SELECT 'Heello'
UNION ALL SELECT 'Heeello'
UNION ALL SELECT 'Heeeello'
UNION ALL SELECT 'Heeeeeello'
UNION ALL SELECT 'Heeeeeeeello'
UNION ALL SELECT 'Heeeeeeeeeello';

-- version for one vowel(it can be enhanced to handle other if needed)
SELECT val,
cleaned = REPLACE(
           REPLACE(
            REPLACE(
            REPLACE(REPLACE(val, REPLICATE('e', 8), '^^')
              , 'e', '~ ')
            , ' ~', '')
          , '~ ', 'e')
          ,'^^','ee')              
FROM #tab;

LiveDemo

Output:

╔════════════════╦═════════╗
║      val       ║ cleaned ║
╠════════════════╬═════════╣
║ Hello          ║ Hello   ║
║ Heello         ║ Hello   ║
║ Heeello        ║ Hello   ║
║ Heeeello       ║ Hello   ║
║ Heeeeeello     ║ Hello   ║
║ Heeeeeeeello   ║ Heello  ║
║ Heeeeeeeeeello ║ Heeello ║
╚════════════════╩═════════╝


回答3:

Try this user defined function:

CREATE FUNCTION TrimDuplicates(@String varchar(max))
RETURNS varchar(max)
AS
BEGIN
    while CHARINDEX('ee',@String)>0 BEGIN SET @String=REPLACE(@String,'ee','e') END
    while CHARINDEX('oo',@String)>0 BEGIN SET @String=REPLACE(@String,'oo','o') END
    RETURN @String
END

Example Usage:

select dbo.TrimDuplicates ('Heeeeeeeelloo')

returns Hello