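I have a highlighting algorithm that takes a string and adds highlight codes around the matches in it. The problem shows up with something like "Find tæst" as the string to search in and "taest" as the string to find. Because the length of the search word differs from the length of the text it matches, I cannot reliably find the end of the match. In my case IndexOf does report the match, but since the combined æ counts as a single character, I cannot detect where the match ends. I don't think IndexOf is going to work for me here. Something that returns both the match index and the match length would do the job, but I don't know what else to use.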

    ' cycle through search words and replace them in the text
    For intWord = LBound(m_arrSearchWords) To UBound(m_arrSearchWords)

       If m_arrSearchWords(intWord).Length > 0 Then

          ' replace instances of the word with the word surrounded by bold codes

          ' find starting position
          intPos = strText.IndexOf(m_arrSearchWords(intWord), System.StringComparison.CurrentCultureIgnoreCase)
          Do While intPos <> -1

             strText = strText.Substring(0, (intPos - 1) - 0 + 1) & cstrHighlightCodeOn & strText.Substring(intPos, m_arrSearchWords(intWord).Length) & cstrHighlightCodeOff & strText.Substring(intPos + m_arrSearchWords(intWord).Length)
             intPos = strText.IndexOf(m_arrSearchWords(intWord), intPos + m_arrSearchWords(intWord).Length + cstrHighlightCodeOn.Length + cstrHighlightCodeOff.Length, System.StringComparison.CurrentCultureIgnoreCase)

          Loop

       End If

    Next intWord

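The Substring call fails because the length runs past the end of the string. I put in a fix for strings that end with the search word (not shown above), but longer strings are highlighted incorrectly, and I need to fix those.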
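
A minimal sketch of the mismatch, in case it helps to reproduce it. The module and variable names are made up for illustration, and whether IndexOf reports a match at all depends on the current culture and on the .NET runtime in use, so no output is claimed here:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module LengthMismatchDemo
    Sub Main()
        ' "Find tæst", with the ligature built from its code point (U+00E6)
        ' so the source file encoding does not matter.
        Dim haystack As String = "Find t" & ChrW(&HE6) & "st"
        Dim word As String = "taest"

        ' Culture-sensitive search, as in the loop above.
        Dim pos As Integer = haystack.IndexOf(word, StringComparison.CurrentCultureIgnoreCase)

        Console.WriteLine("match index: " & pos)
        Console.WriteLine("search word length: " & word.Length)

        If pos >= 0 Then
            ' When this is smaller than word.Length, Substring(pos, word.Length) throws.
            Console.WriteLine("characters available from match: " & (haystack.Length - pos))
        End If
    End Sub
End Module
```

When the match is found, the search word is five characters long but only four characters remain in the text from the match position, which is the case where Substring runs off the end.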

2 Answers

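While it would be nice if IndexOf returned the match length, it turns out you can do the comparison yourself to work it out. I run a second set of comparisons over candidate lengths to find the largest match. I start with the length of the search word, which should be the largest, and work backward to find the length. As soon as I find a length that matches, I use it. If I don't find one, I work upward, increasing the length. This works whether the text being matched is longer or shorter than the search word. It costs at least one extra comparison in the normal case and, in the worst case, a number of extra comparisons that depends on the length of the search word. Maybe if I had the implementation of IndexOf I could improve on this, but at least it works.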

    ' cycle through search words and replace them in the text
    For intWord = LBound(m_arrSearchWords) To UBound(m_arrSearchWords)

       If m_arrSearchWords(intWord).Length > 0 Then

          ' find starting position
          intPos = strText.IndexOf(m_arrSearchWords(intWord), System.StringComparison.CurrentCultureIgnoreCase)
          Do While intPos <> -1

             intOrigLength = m_arrSearchWords(intWord).Length

             ' if there isn't enough of the text left to add the search word length to
             If strText.Length < ((intPos + intOrigLength - 1) - 0 + 1) Then

                ' use shorter length
                intOrigLength = ((strText.Length - 1) - intPos + 1)

             End If

             ' find largest match
             For intLength = intOrigLength To 1 Step -1

                If m_arrSearchWords(intWord).Equals(strText.Substring(intPos, intLength), StringComparison.CurrentCultureIgnoreCase) Then

                   ' if match found, highlight it
                   strText = strText.Substring(0, (intPos - 1) - 0 + 1) & cstrHighlightCodeOn & strText.Substring(intPos, intLength) & cstrHighlightCodeOff & strText.Substring(intPos + intLength)

                   ' find next
                   intPos = strText.IndexOf(m_arrSearchWords(intWord), intPos + intLength + cstrHighlightCodeOn.Length + cstrHighlightCodeOff.Length, System.StringComparison.CurrentCultureIgnoreCase)

                   ' exit search for largest match
                   Exit For

                End If

             Next

             ' if we didn't find it by searching smaller - search larger
             If intLength = 0 Then

                For intLength = intOrigLength + 1 To ((strText.Length - 1) - intPos + 1)

                   If m_arrSearchWords(intWord).Equals(strText.Substring(intPos, intLength), StringComparison.CurrentCultureIgnoreCase) Then

                      ' if match found, highlight it
                      strText = strText.Substring(0, (intPos - 1) - 0 + 1) & cstrHighlightCodeOn & strText.Substring(intPos, intLength) & cstrHighlightCodeOff & strText.Substring(intPos + intLength)

                      ' find next
                      intPos = strText.IndexOf(m_arrSearchWords(intWord), intPos + intLength + cstrHighlightCodeOn.Length + cstrHighlightCodeOff.Length, System.StringComparison.CurrentCultureIgnoreCase)

                      ' exit search for largest match
                      Exit For

                   End If

                Next

             End If

          Loop

       End If

    Next intWord
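
The same idea can be written more compactly. The following is only a sketch under some assumptions: `FindMatchLength` and `HighlightAll` are hypothetical names that do not appear in the code above, it looks for the shortest matching length rather than starting from the search word's length, and it searches the original text while building the result in a `StringBuilder`, which is a different structure from the in-place loop above. Whether "taest" matches "tæst" still depends on the current culture and runtime.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text

Module HighlightSketch

    ' Hypothetical helper: length of the shortest run of text starting at pos
    ' that compares equal to word (culture-sensitive, ignoring case), or -1.
    Function FindMatchLength(text As String, pos As Integer, word As String,
                             ci As CompareInfo) As Integer
        For candidate As Integer = 1 To text.Length - pos
            If ci.Compare(text, pos, candidate, word, 0, word.Length,
                          CompareOptions.IgnoreCase) = 0 Then
                Return candidate
            End If
        Next
        Return -1
    End Function

    ' Hypothetical wrapper: wraps every match of word in codeOn / codeOff.
    Function HighlightAll(text As String, word As String,
                          codeOn As String, codeOff As String) As String
        If String.IsNullOrEmpty(text) OrElse String.IsNullOrEmpty(word) Then Return text

        Dim ci As CompareInfo = CultureInfo.CurrentCulture.CompareInfo
        Dim result As New StringBuilder()
        Dim last As Integer = 0   ' first character not yet copied to result
        Dim pos As Integer = ci.IndexOf(text, word, 0, CompareOptions.IgnoreCase)

        Do While pos <> -1
            Dim matchLength As Integer = FindMatchLength(text, pos, word, ci)
            Dim nextStart As Integer = pos + 1

            If matchLength > 0 Then
                ' Copy the unmatched text, then the match wrapped in highlight codes.
                result.Append(text, last, pos - last)
                result.Append(codeOn).Append(text, pos, matchLength).Append(codeOff)
                last = pos + matchLength
                nextStart = last
            End If

            If nextStart >= text.Length Then Exit Do
            pos = ci.IndexOf(text, word, nextStart, CompareOptions.IgnoreCase)
        Loop

        ' Copy whatever follows the last match.
        result.Append(text, last, text.Length - last)
        Return result.ToString()
    End Function

End Module
```

A call would look like `HighlightAll(strText, m_arrSearchWords(intWord), cstrHighlightCodeOn, cstrHighlightCodeOff)`. Because the search always runs over the original text, the offsets never need adjusting for the inserted highlight codes, and the loop always moves forward even if no matching length is found at a position.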
answered 2013-11-12T00:25:23.217

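If I understand correctly, you are looking for a function that returns the "matched string". In other words, when you look for s1 inside s2, you want to know exactly which part of s2 matched (the indices of the first and last matching characters in s2). That lets you highlight the match without modifying the string (upper/lower case, ligatures, and so on).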

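I don't have VB.net, and unfortunately VBA doesn't have quite the same search functions as VB.net. So please bear in mind that the following code correctly identifies the start and end of the match, but it has only been tested with upper/lower case matching. I hope it helps you solve your problem.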

    Option Compare Text
    Option Explicit

    Function startEndIndex(bigString, smallString)
        ' function that returns start, end index
        ' of the match
        ' it keeps shortening the bigString until no match is found
        ' this is how it takes care of mismatches in number of characters
        ' because of a match between "similar" strings
        Dim i1, i2
        Dim shorterString

        i2 = 0

        ' first see if there is a match at all:
        i1 = InStr(1, bigString, smallString, vbTextCompare)

        If i1 > 0 Then
            ' largest value that i2 can have is end of string:
            i2 = Len(bigString)

            ' can make it shorter - but no shorter than twice the length of the search string
            If i2 > i1 + 2 * Len(smallString) Then i2 = i1 + 2 * Len(smallString)
            shorterString = Mid(bigString, i1, i2 - i1)

            ' keep making the string shorter until there is no match:
            While InStr(1, shorterString, smallString, vbTextCompare) > 0
                i2 = i2 - 1
                shorterString = Mid(bigString, i1, i2 - i1)
            Wend

        End If

        ' return the values as an array (both are 0 when there is no match);
        ' i2 is the index of the last matched character
        startEndIndex = Array(i1, i2)

    End Function


    Sub test()
        ' a simple test routine to see that things work:
        Dim a
        Dim longString: longString = "This is a very long TaesT of a complicated string"
        a = startEndIndex(longString, "very long taest")
        If a(0) = 0 And a(1) = 0 Then
            MsgBox "no match found"
        Else
            Dim highlightString As String
            highlightString = Left(longString, a(0) - 1) & "*" & Mid(longString, a(0), a(1) - a(0) + 1) & _
                "*" & Mid(longString, a(1) + 1)
            MsgBox "start at " & a(0) & " and end at " & a(1) & vbCrLf & _
                "string matched is '" & Mid(longString, a(0), a(1) - a(0) + 1) & "'" & vbCrLf & _
                "with highlighting: " & highlightString
        End If
    End Sub
answered 2013-11-11T23:23:29.093