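For example, I have a small function that returns the strings found between two other strings (think of text between single quotes, double quotes, or even simple HTML tags).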

        Dim exp As String = String.Format("{0}(.*?){1}", beginMarker, endMarker)

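Now, if I pass "<b>" for beginMarker and "</b>" for endMarker and do not specify RegexOptions.IgnoreCase, it correctly returns the lowercase <b></b> matches. However, as soon as I specify IgnoreCase, it never returns a match (given the same input). Here is a sample function (remove RegexOptions.IgnoreCase and it works). Also, whether or not I escape the markers that are passed in does not seem to change the output; the only difference is IgnoreCase: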

My question is: what am I missing? (I used a simple example, since I am not actually parsing HTML with attributes.)

Input: beginMarker = "<b>"
Input: endMarker = "</b>"
Input: searchText = "<b>this is a test</b>"
Input: includeMarkers (does not matter, True or False)

Public Shared Function GetStringInBetween(beginMarker As String, endMarker As String, searchText As String, includeMarkers As Boolean) As List(Of String)
    beginMarker = RegularExpressions.Regex.Escape(beginMarker)
    endMarker = RegularExpressions.Regex.Escape(endMarker)
    Dim exp As String = String.Format("{0}(.*?){1}", beginMarker, endMarker)
    Dim regEx As New RegularExpressions.Regex(exp)
    Dim returnList As New List(Of String)

    For Each m As Match In regEx.Matches(searchText, 0, RegexOptions.IgnoreCase)
        If includeMarkers = True Then
            returnList.Add(m.Value)
        Else
            returnList.Add(m.Value.TrimStart(beginMarker.ToCharArray).TrimEnd(endMarker.ToCharArray))
        End If
    Next

    Return returnList
End Function
1 Answer

I wouldn't use .NET class names as variable names, as that can get confusing.

This works. I replaced the Trim calls with Substring so that the markers are removed regardless of case:

Imports System.Text.RegularExpressions

Module Module1

    Public Function GetStringInBetween(beginMarker As String, endMarker As String, searchText As String, includeMarkers As Boolean) As List(Of String)
        Dim exp As String = String.Format("{0}(.*?){1}", Regex.Escape(beginMarker), Regex.Escape(endMarker))
        Dim returnList As New List(Of String)

        For Each m As Match In Regex.Matches(searchText, exp, RegexOptions.IgnoreCase)
            If includeMarkers Then
                returnList.Add(m.Value)
            Else
                ' return the portion of the matched string without the markers
                returnList.Add(m.Value.Substring(beginMarker.Length, m.Value.Length - beginMarker.Length - endMarker.Length))
            End If
        Next

        Return returnList

    End Function

    Sub Main()
        ' include a \ to confirm the regex escaping 
        ' outputs: "hello, again"
        Console.WriteLine(String.Join(", ", GetStringInBetween("<x>", "</\x>", "<X>hello</\x> world <x>again</\x>", False).ToArray))
        Console.ReadLine()
    End Sub

End Module
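
As an alternative to the Substring arithmetic above, the inner text can also be read from the capture group that the pattern already defines. This is only a minimal sketch, not the answerer's code; the module name CaptureGroupSketch and the function GetInnerText are hypothetical names used for illustration.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module CaptureGroupSketch

    ' Hypothetical variant of GetStringInBetween that always returns the inner text.
    Public Function GetInnerText(beginMarker As String, endMarker As String, searchText As String) As List(Of String)
        Dim exp As String = String.Format("{0}(.*?){1}", Regex.Escape(beginMarker), Regex.Escape(endMarker))
        Dim returnList As New List(Of String)

        For Each m As Match In Regex.Matches(searchText, exp, RegexOptions.IgnoreCase)
            ' Groups(0) is the whole match; Groups(1) is the first parenthesised group, (.*?)
            returnList.Add(m.Groups(1).Value)
        Next

        Return returnList
    End Function

    Sub Main()
        ' outputs: "hello, again"
        Console.WriteLine(String.Join(", ", GetInnerText("<x>", "</\x>", "<X>hello</\x> world <x>again</\x>").ToArray()))
    End Sub

End Module
```

Reading the group avoids any dependence on the marker lengths, at the cost of tying the function to the shape of the pattern.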

EDIT: Oh yes, use Option Strict On too. And there is no overload of Regex.Matches that takes (String, Int32, String) as parameters.
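
If a Regex instance is preferred over the shared method, one way around the missing overload is to pass the options to the constructor and then call the one-argument instance Matches. The sketch below shows that; OptionsSketch is a hypothetical module name. The second call is an assumption about what happened in the question: with Option Strict Off, the three-argument call probably binds to the shared Matches(String, String, RegexOptions) overload, with the 0 converted to the pattern "0", which matches nothing in the sample input.

```vbnet
Option Strict On
Imports System
Imports System.Text.RegularExpressions

Module OptionsSketch

    Sub Main()
        Dim searchText As String = "<B>this is a test</B>"
        Dim exp As String = String.Format("{0}(.*?){1}", Regex.Escape("<b>"), Regex.Escape("</b>"))

        ' Options given to the constructor apply to the instance Matches(String) call.
        Dim matcher As New Regex(exp, RegexOptions.IgnoreCase)
        Console.WriteLine(matcher.Matches(searchText).Count) ' prints 1

        ' Assumed equivalent of the question's call: the text "0" used as the pattern.
        Console.WriteLine(Regex.Matches(searchText, "0", RegexOptions.IgnoreCase).Count) ' prints 0
    End Sub

End Module
```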

answered 2012-07-23T18:15:17.287