Is there any better way to remove all the special characters and numbers in a column, not limited to one or two almost to remove all the special characters and numbers.
As of now I I am using this query (Jst Logic part). Let me know is there any better choice as my table contains minimum 5 million records
Declare @name varchar(1000) = '231323Lig%$%$h$%t'
Declare @dumy varchar(1000)
while(PATINDEX('%[0-9]%',@name)<>0)
SET @name = stuff(@name,PATINDEX('%[0-9]%',@name),1,'')
while(PATINDEX('%[A-Z]%',@name)<>0)
begin
SET @dumy = isnull(@dumy,'')+substring(@name,PATINDEX('%[A-Z]%',@name),1)
SET @name = stuff(@name,PATINDEX('%[A-Z]%',@name),1,'')
end
Set @name = @dumy
Select @name 'Clean'
Try this function:
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
Declare @KeepValues as varchar(50)
Set @KeepValues = '%[^a-z]%'
While PatIndex(@KeepValues, @Temp) > 0
Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
Return @Temp
End
Call it like this:
Select dbo.RemoveNonAlphaCharacters('231323Lig%$%$h$%t')
Output:
Clean
Light
Update
If you want to do it for entire table just try like this
Select dbo.RemoveNonAlphaCharacters(ColumnName),OtherColumn1,OtherColumn2
FROM Table1
You can simplify your function to one WHILE loop:
DECLARE @String NVARCHAR(MAX) = '231323Lig%$%$h$%t'
DECLARE @Expression NVARCHAR(32) = '%[^A-Z]%'
WHILE PATINDEX(@Expression, @String) > 0
SET @String = STUFF(@String, PATINDEX(@Expression, @String), 1, '')
RETURN @String
A CLR
function could be faster than the pure T-SQL implementation.
Regex.Replace(str, "[^a-zA-Z]+", "", RegexOptions.Compiled)