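After some additional work, we decided not to use the method from Art's answer (even though it works).
We needed a more robust way to validate and extract substrings, so I went the CLR route and used regular expressions (thanks to Pondlife for pointing me in the right direction).
The approach I took is as follows:
First, I compiled the following CLR assembly (converted to VB from a C# example):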
Imports System.Data
Imports System.Data.SqlClient
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server
Imports System.Text.RegularExpressions
Imports System.Text
Partial Public Class UserDefinedFunctions

    ' Options shared by RegexMatch, RegexSelectAll and RegexSelectOne.
    ' IgnorePatternWhitespace makes the engine skip unescaped whitespace in a pattern.
    Public Shared ReadOnly Options As RegexOptions = RegexOptions.IgnorePatternWhitespace Or RegexOptions.Multiline

    ' Returns True if the input contains a match for the pattern.
    <SqlFunction()> _
    Public Shared Function RegexMatch(ByVal input As SqlChars, ByVal pattern As SqlString) As SqlBoolean
        Dim regex As New Regex(pattern.Value, Options)
        Return regex.IsMatch(New String(input.Value))
    End Function

    ' Replaces every match of the pattern in the expression.
    ' Note: unlike the other functions, this one builds the Regex without Options.
    <SqlFunction()> _
    Public Shared Function RegexReplace(ByVal expression As SqlString, ByVal pattern As SqlString, ByVal replace As SqlString) As SqlString
        If expression.IsNull OrElse pattern.IsNull OrElse replace.IsNull Then
            Return SqlString.Null
        End If
        Dim r As New Regex(pattern.ToString())
        Return New SqlString(r.Replace(expression.ToString(), replace.ToString()))
    End Function

    ' Returns all matching strings, separated by the 3rd parameter.
    <SqlFunction()> _
    Public Shared Function RegexSelectAll(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal matchDelimiter As SqlString) As SqlString
        Dim regex As New Regex(pattern.Value, Options)
        Dim results As Match = regex.Match(New String(input.Value))
        Dim sb As New StringBuilder()
        While results.Success
            sb.Append(results.Value)
            results = results.NextMatch()
            ' Append the delimiter only if another match follows.
            If results.Success Then
                sb.Append(matchDelimiter.Value)
            End If
        End While
        Return New SqlString(sb.ToString())
    End Function

    ' Returns a single matching string, or an empty string if there is no such match.
    ' matchIndex is the zero-based index of the results: 0 for the 1st match, 1 for the 2nd match, etc.
    <SqlFunction()> _
    Public Shared Function RegexSelectOne(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal matchIndex As SqlInt32) As SqlString
        Dim regex As New Regex(pattern.Value, Options)
        Dim results As Match = regex.Match(New String(input.Value))
        Dim resultStr As String = ""
        Dim index As Integer = 0
        While results.Success
            If index = matchIndex Then
                resultStr = results.Value.ToString()
            End If
            results = results.NextMatch()
            index += 1
        End While
        Return New SqlString(resultStr)
    End Function

End Class
I installed this CLR assembly as follows:
EXEC sp_configure 'clr enabled', 1
GO
RECONFIGURE
GO

USE [db_Utility]
GO

CREATE ASSEMBLY SQL_CLR_RegExp
FROM 'D:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Binn\SQL_CLR_RegExp.dll'
WITH PERMISSION_SET = SAFE
GO

-- =============================================
-- Returns 1 if input matches pattern, otherwise 0
-- VB function: RegexMatch(ByVal input As SqlChars, ByVal pattern As SqlString) As SqlBoolean
-- =============================================
CREATE FUNCTION [dbo].[RegexMatch]
(
    @input [nvarchar](MAX),
    @pattern [nvarchar](MAX)
)
RETURNS [bit]
WITH EXECUTE AS CALLER
AS EXTERNAL NAME
    [SQL_CLR_RegExp].[SQL_CLR_RegExp.UserDefinedFunctions].[RegexMatch]
GO

-- =============================================
-- Returns the expression with every match of pattern replaced by the replacement string
-- VB function: RegexReplace(ByVal expression As SqlString, ByVal pattern As SqlString, ByVal replace As SqlString) As SqlString
-- =============================================
CREATE FUNCTION [dbo].[RegexReplace]
(
    @expression [nvarchar](MAX),
    @pattern [nvarchar](MAX),
    @replace [nvarchar](MAX)
)
RETURNS [nvarchar](MAX)
WITH EXECUTE AS CALLER
AS EXTERNAL NAME
    [SQL_CLR_RegExp].[SQL_CLR_RegExp.UserDefinedFunctions].[RegexReplace]
GO

-- =============================================
-- Returns all matches as one string, separated by matchDelimiter
-- VB function: RegexSelectAll(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal matchDelimiter As SqlString) As SqlString
-- =============================================
CREATE FUNCTION [dbo].[RegexSelectAll]
(
    @input [nvarchar](MAX),
    @pattern [nvarchar](MAX),
    @matchDelimiter [nvarchar](MAX)
)
RETURNS [nvarchar](MAX)
WITH EXECUTE AS CALLER
AS EXTERNAL NAME
    [SQL_CLR_RegExp].[SQL_CLR_RegExp.UserDefinedFunctions].[RegexSelectAll]
GO

-- =============================================
-- Returns the match at the zero-based position matchIndex
-- VB function: RegexSelectOne(ByVal input As SqlChars, ByVal pattern As SqlString, ByVal matchIndex As SqlInt32) As SqlString
-- =============================================
CREATE FUNCTION [dbo].[RegexSelectOne]
(
    @input [nvarchar](MAX),
    @pattern [nvarchar](MAX),
    @matchIndex [int]
)
RETURNS [nvarchar](MAX)
WITH EXECUTE AS CALLER
AS EXTERNAL NAME
    [SQL_CLR_RegExp].[SQL_CLR_RegExp.UserDefinedFunctions].[RegexSelectOne]
GO
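
Once the functions are registered, a quick smoke test along these lines can confirm that they behave as intended. This is only a sketch: the sample strings are made up for illustration and are not part of the original setup, and it assumes the four functions were created in the current database exactly as shown above.

```sql
-- Illustrative smoke test; the sample inputs are assumptions, not from the original setup.
SELECT
    dbo.RegexMatch(N'abc123', N'^[a-z]+[0-9]+$')      AS IsMatch,      -- expected: 1
    dbo.RegexReplace(N'a1b22c', N'[0-9]+', N'-')      AS Replaced,     -- expected: a-b-c
    dbo.RegexSelectAll(N'a1b22c333', N'[0-9]+', N',') AS AllMatches,   -- expected: 1,22,333
    dbo.RegexSelectOne(N'a1b22c333', N'[0-9]+', 1)    AS SecondMatch;  -- expected: 22

-- The shared Options include RegexOptions.IgnorePatternWhitespace, so an unescaped
-- space in a pattern is skipped by the regex engine; use \s instead.
SELECT
    dbo.RegexMatch(N'a b', N'a b')  AS LiteralSpace,  -- expected: 0 (pattern is read as 'ab')
    dbo.RegexMatch(N'a b', N'a\sb') AS EscapedSpace;  -- expected: 1
```

The second query shows why patterns passed to these functions should spell out whitespace as \s. In the same mode, an unescaped # starts a comment inside the pattern, so it needs escaping too.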
I then wrote the following wrapper function to simplify usage:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:      <Jordon Pilling>
-- Create date: <30/01/2013>
-- Description: <Calls RegexSelectOne with start and end text and cleans the result>
-- =============================================
CREATE FUNCTION [dbo].[RegexSelectOneWithScrub]
(
    @Haystack VARCHAR(MAX),
    @StartNeedle VARCHAR(MAX),
    @EndNeedle VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
    DECLARE @ReturnStr VARCHAR(MAX)

    --#### Extract text from HayStack using Start and End Needles
    SET @ReturnStr = dbo.RegexSelectOne(@Haystack, REPLACE(@StartNeedle, ' ','\s') + '((.|\n)+?)' + REPLACE(@EndNeedle, ' ','\s'), 0)

    --#### Remove the Needles
    SET @ReturnStr = REPLACE(@ReturnStr, @StartNeedle, '')
    SET @ReturnStr = REPLACE(@ReturnStr, @EndNeedle, '')

    --#### Trim White Space
    SET @ReturnStr = LTRIM(RTRIM(@ReturnStr))

    --#### Trim Line Breaks and Carriage Returns
    SET @ReturnStr = dbo.SuperTrim(@ReturnStr)

    RETURN @ReturnStr
END
GO
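
To make the wrapper's behaviour concrete, the sketch below traces the pattern it builds and the clean-up steps for one input. The sample values are borrowed from the usage example further down, and the variable names @Pattern and @Match are introduced here purely for illustration. Two things are worth keeping in mind: the needles are not regex-escaped, so characters such as . or ( inside a needle are treated as regex syntax, and dbo.SuperTrim is a separate helper that is not shown in this answer, so the trace stops before that step.

```sql
-- Illustrative trace of RegexSelectOneWithScrub; @Pattern and @Match are hypothetical names.
DECLARE @Haystack    VARCHAR(MAX) = 'HelpDesk Call Reference F0012345, Call Update, 40111';
DECLARE @StartNeedle VARCHAR(MAX) = 'HelpDesk Call Reference';
DECLARE @EndNeedle   VARCHAR(MAX) = ',';

-- Step 1: spaces become \s because the CLR functions skip unescaped pattern whitespace.
DECLARE @Pattern VARCHAR(MAX) =
    REPLACE(@StartNeedle, ' ', '\s') + '((.|\n)+?)' + REPLACE(@EndNeedle, ' ', '\s');
SELECT @Pattern AS Pattern;   -- HelpDesk\sCall\sReference((.|\n)+?),

-- Step 2: the lazy quantifier +? stops at the first occurrence of the end needle.
DECLARE @Match VARCHAR(MAX) = dbo.RegexSelectOne(@Haystack, @Pattern, 0);
SELECT @Match AS RawMatch;    -- HelpDesk Call Reference F0012345,

-- Step 3: strip both needles, then trim the surrounding spaces.
SELECT LTRIM(RTRIM(REPLACE(REPLACE(@Match, @StartNeedle, ''), @EndNeedle, ''))) AS Cleaned;  -- F0012345
```

Because (.|\n) also matches line breaks, the extracted text may span several lines when the end needle appears on a later line.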
This allows usage like the following:
DECLARE @Subject VARCHAR(250) = 'HelpDesk Call Reference F0012345, Call Update, 40111'
DECLARE @Ref VARCHAR(250) = NULL

IF dbo.RegexMatch(@Subject, '^HelpDesk\sCall\sReference\sF[0-9]{7},\s(Call\sResolved|Call\sUpdate|New\scall\slogged),(|\s+)([0-9]+|unknown)$') = 1
    SET @Ref = ISNULL(dbo.RegexSelectOneWithScrub(@Subject, 'HelpDesk Call Reference', ','), 'Invalid (#1)')
ELSE
    SET @Ref = 'Invalid (#2)'

SELECT @Ref
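This is much faster when used for multiple searches, and much more powerful when dealing with large amounts of text that have varying start and end phrases.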