Question:
Is it possible to write efficient queries that use the regular expression feature set? I have data in my table that is not in the correct format. For example, the Title column contains "Cable 180┬░ To 90┬░ Serial ATA Cable", and the Id column contains values such as 123234+ in exponential format. Is it possible to query this data using regular expressions in SQL Server 2008?
Answer 1:
You need to make use of the following functions, usually in some combination of the three:
patindex
charindex
substring
In response to your comment above, patindex should not return 0 where the case is found. patindex finds the start location of the specified pattern, so if patindex finds the case, it should return an integer > 0.
EDIT:
Also, len(string) and reverse(string) come in handy on specific occasions.
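As a rough illustration of how these functions combine, here is a minimal T-SQL sketch that pulls the first run of digits out of a string. The sample value and the variable names are made up for the example and are not from the question's table.

```sql
-- Hypothetical sample value, declared inline so the batch is self-contained.
DECLARE @title nvarchar(100) = N'Cable 180 To 90 Serial ATA Cable';

-- PATINDEX returns the 1-based position of the first match, or 0 if there is none.
DECLARE @start int = PATINDEX(N'%[0-9]%', @title);

-- Take everything from the first digit onward, then locate the first non-digit.
-- The appended space guarantees a non-digit exists even if the string ends in digits.
DECLARE @rest nvarchar(101) = SUBSTRING(@title, @start, LEN(@title)) + N' ';
DECLARE @len int = PATINDEX(N'%[^0-9]%', @rest) - 1;

SELECT
    @start                          AS first_digit_position,
    SUBSTRING(@title, @start, @len) AS first_number,
    CHARINDEX(N'Serial', @title)    AS serial_position;  -- CHARINDEX finds a literal substring
```

The same PATINDEX expression can go in a WHERE clause (for example `WHERE PATINDEX(N'%[0-9]%', Title) > 0`) to select only the rows that contain the pattern.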
Answer 2:
With a CLR .NET project published to SQL Server, it is EXTREMELY efficient. Over the past 2 years of using a VB.Net CLR project with our SQL Server 2005, I have found that every T-SQL scalar function I replaced with a .NET version showed dramatically improved performance. I have used it for advanced date manipulation, formatting and parsing, string formatting and parsing, MD5 hash generation, vector lengths, a string JOIN aggregate function, a Split table-valued function, and even bulk loading from serialized datatables via a shared folder (which is amazingly fast).
Since RegEx is not already present in T-SQL, I can only assume it is as efficient as a compiled EXE would be doing the same RegEx, which is to say extremely fast.
I will share a code file from my VB.Net CLR project that allows some RegEx functionality. This code would be part of a .NET CLR DLL that is published to your server.
Function Summary
Regex_IsMatch(data,pattern,options) AS tinyint (0/1 result)
E.g. SELECT dbo.Regex_IsMatch('Darren','[trwq]en$',NULL) -- returns 1 / true
Regex_Group(data,pattern,groupname,options) AS nvarchar(max) (capture group value returned)
E.g. SELECT dbo.Regex_Group('Cable 180+e10 to 120+e3',' (?<n>[0-9]+)\+e[0-9]+','n',NULL) -- returns '180'
Regex_Replace(data,pattern,replacement,options) AS nvarchar(max) (returns modified string)
E.g. SELECT dbo.Regex_Replace('Cable 180+e10 to 120+e3',' (?<n>[0-9]+)\+e(?<e>[0-9]+)',' ${e}:${n}',NULL) -- returns 'Cable 10:180 to 3:120'
Imports System.Data.SqlTypes
Imports System.Text.RegularExpressions
Imports Microsoft.SqlServer.Server

Partial Public Class UserDefinedFunctions

    ''' <summary>
    ''' Returns 1 (true) or 0 (false) if a pattern passed is matched in the data passed.
    ''' Returns NULL if Data is NULL.
    ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
    ''' </summary>
    ''' <param name="data"></param>
    ''' <param name="pattern"></param>
    ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
    ''' <returns></returns>
    ''' <remarks></remarks>
    <Microsoft.SqlServer.Server.SqlFunction()> _
    Public Shared Function Regex_IsMatch(data As SqlChars, pattern As SqlChars, options As SqlString) As SqlByte
        If pattern.IsNull Then
            Throw New Exception("Pattern Parameter in ""Regex_IsMatch"" cannot be NULL")
        End If
        If data.IsNull Then
            Return SqlByte.Null
        Else
            Return CByte(If(Regex.IsMatch(data.Value, pattern.Value, Regex_Options(options)), 1, 0))
        End If
    End Function

    ''' <summary>
    ''' Returns the Value of a RegularExpression Pattern Group by Name or Number.
    ''' Group needs to be captured explicitly. Example Pattern "[a-z](?<m>[0-9][0-9][0-9][0-9])" to capture the numeric portion of an engineering number by the group called "m".
    ''' Returns NULL if The Capture was not successful.
    ''' Returns NULL if Data is NULL.
    ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
    ''' </summary>
    ''' <param name="data"></param>
    ''' <param name="pattern"></param>
    ''' <param name="groupName">Name used in the explicit capture group</param>
    ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
    <Microsoft.SqlServer.Server.SqlFunction()> _
    Public Shared Function Regex_Group(data As SqlChars, pattern As SqlChars, groupName As SqlString, options As SqlString) As SqlChars
        If pattern.IsNull Then
            Throw New Exception("Pattern Parameter in ""Regex_Group"" cannot be NULL")
        End If
        If groupName.IsNull Then
            Throw New Exception("GroupName Parameter in ""Regex_Group"" cannot be NULL")
        End If
        If data.IsNull Then
            Return SqlChars.Null
        Else
            Dim m As Match = Regex.Match(data.Value, pattern.Value, Regex_Options(options))
            If m.Success Then
                Dim g As Group
                ' A numeric group name is treated as a positional group index.
                If IsNumeric(groupName.Value) Then
                    g = m.Groups(CInt(groupName.Value))
                Else
                    g = m.Groups(groupName.Value)
                End If
                If g.Success Then
                    Return New SqlChars(g.Value)
                Else ' group did not return or was not found.
                    Return SqlChars.Null
                End If
            Else ' match failed.
                Return SqlChars.Null
            End If
        End If
    End Function

    ''' <summary>
    ''' Does the Equivalent to Regex.Replace in .NET.
    ''' Replacement String Replacement Markers are done in this format "${test}" = Replaces the capturing group (?<test>...)
    ''' If the replacement pattern is $1 or $2 then it replaces the first or second captured group by position.
    ''' Returns NULL if Data is NULL.
    ''' options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"
    ''' </summary>
    ''' <param name="data"></param>
    ''' <param name="pattern"></param>
    ''' <param name="replacement">Replacement String Replacement Markers are done in this format "${test}" = Replaces the capturing group (?<test>...). If the replacement pattern is $1 or $2 then it replaces the first or second captured group by position.</param>
    ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
    ''' <returns></returns>
    ''' <remarks></remarks>
    <SqlFunction()> _
    Public Shared Function Regex_Replace(data As SqlChars, pattern As SqlChars, replacement As SqlChars, options As SqlString) As SqlChars
        If pattern.IsNull Then
            Throw New Exception("Pattern Parameter in ""Regex_Replace"" cannot be NULL")
        End If
        If replacement.IsNull Then
            Throw New Exception("Replacement Parameter in ""Regex_Replace"" cannot be NULL")
        End If
        If data.IsNull Then
            Return SqlChars.Null
        Else
            Return New SqlChars(Regex.Replace(data.Value, pattern.Value, replacement.Value, Regex_Options(options)))
        End If
    End Function
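
    ''' <summary>
    ''' Buffered list of options by name for speed.
    ''' </summary>
    ' Note: the comparer argument was garbled as "StrComp" in the posted code; a case-insensitive
    ' StringComparer is assumed here. The field is ReadOnly because SQL Server rejects
    ' non-read-only Shared fields in SAFE assemblies.
    Private Shared ReadOnly m_Regex_Buffered_Options As New Generic.Dictionary(Of String, RegexOptions)(StringComparer.OrdinalIgnoreCase)

    ''' <summary>
    ''' Default regex options used when options value is NULL or an Empty String
    ''' </summary>
    Private Shared ReadOnly m_Regex_DefaultOptions As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.ExplicitCapture Or RegexOptions.Multiline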

    ''' <summary>
    ''' Get the regular expressions options to use by a passed string of data.
    ''' Formatted like command line arguments.
    ''' </summary>
    ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
    Private Shared Function Regex_Options(options As SqlString) As RegexOptions
        Return Regex_Options(If(options.IsNull, "", options.Value))
    End Function

    ''' <summary>
    ''' Get the regular expressions options to use by a passed string of data.
    ''' Formatted like command line arguments.
    ''' </summary>
    ''' <param name="options">options example, full or partial names can be used after slashes or hyphens with or without spaces, some are exclusive of each other "/ic /ex -s" = "\ignorecase -explicitcapture/singleline"</param>
    Private Shared Function Regex_Options(options As String) As RegexOptions
        ' An empty options string is considered default options.
        If options Is Nothing OrElse options = "" Then
            Return m_Regex_DefaultOptions
        Else
            Dim out As RegexOptions
            If m_Regex_Buffered_Options.TryGetValue(options, out) Then
                Return out
            Else
                ' Must build options and store them.
                If options Like "*[/\-]n*" Then
                    out = RegexOptions.None
                End If
                If options Like "*[/\-]s*" Then
                    out = out Or RegexOptions.Singleline
                End If
                If options Like "*[/\-]m*" Then
                    out = out Or RegexOptions.Multiline
                End If
                If options Like "*[/\-]co*" Then
                    out = out Or RegexOptions.Compiled
                End If
                If options Like "*[/\-]c[ui]*" Then
                    out = out Or RegexOptions.CultureInvariant
                End If
                If options Like "*[/\-]ecma*" Then
                    out = out Or RegexOptions.ECMAScript
                End If
                ' "e[xc]" also matches "ecma", and ECMAScript cannot be combined with
                ' ExplicitCapture, so skip this flag when ECMAScript was requested.
                If options Like "*[/\-]e[xc]*" AndAlso Not options Like "*[/\-]ecma*" Then
                    out = out Or RegexOptions.ExplicitCapture
                End If
                If options Like "*[/\-]i[c]*" OrElse options Like "*[/\-]ignorec*" Then
                    out = out Or RegexOptions.IgnoreCase
                End If
                If options Like "*[/\-]i[pw]*" OrElse options Like "*[/\-]ignore[pw]*" Then
                    out = out Or RegexOptions.IgnorePatternWhitespace
                End If
                If options Like "*[/\-]r[tl]*" Then
                    out = out Or RegexOptions.RightToLeft
                End If
                ' Store the options for the next call (for speed).
                m_Regex_Buffered_Options(options) = out
                Return out
            End If
        End If
    End Function

End Class
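
Once the DLL is compiled, it still has to be registered with SQL Server before the dbo.Regex_* calls in the function summary will work. The following is only a sketch of that step: the assembly name RegexClr and the file path are hypothetical, and the class name in EXTERNAL NAME may need a namespace prefix (a VB.Net project usually adds its root namespace, e.g. [MyProject.UserDefinedFunctions]). PERMISSION_SET = SAFE is assumed to be sufficient; if SQL Server rejects the assembly (for instance because of a non-read-only Shared field), the code or the permission set would have to change.

```sql
-- CLR integration is disabled by default and must be enabled once per instance.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- Assumption: hypothetical assembly name and path to the compiled DLL.
CREATE ASSEMBLY RegexClr
FROM 'C:\clr\RegexClr.dll'
WITH PERMISSION_SET = SAFE;
GO

-- SqlChars maps to nvarchar(max), SqlString to nvarchar(4000), SqlByte to tinyint.
CREATE FUNCTION dbo.Regex_IsMatch
    (@data nvarchar(max), @pattern nvarchar(max), @options nvarchar(4000))
RETURNS tinyint
AS EXTERNAL NAME RegexClr.[UserDefinedFunctions].Regex_IsMatch;
GO

CREATE FUNCTION dbo.Regex_Group
    (@data nvarchar(max), @pattern nvarchar(max), @groupName nvarchar(4000), @options nvarchar(4000))
RETURNS nvarchar(max)
AS EXTERNAL NAME RegexClr.[UserDefinedFunctions].Regex_Group;
GO

CREATE FUNCTION dbo.Regex_Replace
    (@data nvarchar(max), @pattern nvarchar(max), @replacement nvarchar(max), @options nvarchar(4000))
RETURNS nvarchar(max)
AS EXTERNAL NAME RegexClr.[UserDefinedFunctions].Regex_Replace;
GO

-- Quick smoke test using the example from the function summary.
SELECT dbo.Regex_IsMatch(N'Darren', N'[trwq]en$', NULL) AS is_match;
```

If you publish from Visual Studio's database project instead, the deploy step should generate equivalent statements for you, so the manual script is mainly useful for deploying to servers without Visual Studio.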