
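So, I've only just started learning regular expressions, and I'm finding the learning curve fairly steep. Stack Overflow has been a great help while I experiment, though. There is a particular Word macro I'd like to write, but I haven't figured out how. I want to find pairs of words in a document that are within about 10 words of each other and italicize just those words. If the words are more than 10 words apart, or appear in the opposite order, the macro should leave them alone.

I've been using the following regular expression: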

\bPanama\W+(?:\w+\W+){0,10}?Canal\b

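However, it only lets me operate on the entire matched string as a whole, including the arbitrary words in between. Also, the .Replace method only lets me replace the match with a different string; it doesn't let me change the formatting.

Does anyone more experienced know how to make this work? Is it even possible?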


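EDIT: Here's what I have so far. I have two problems. First, I don't know how to pick out only the words "Panama" and "Canal" from a regex match and change just those words (not the words in between). Second, I don't know how to replace a regex match with different formatting rather than a different text string, probably just because I'm unfamiliar with Word macros.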

Sub RegText()
Dim re As regExp
Dim para As Paragraph
Dim rng As Range
Set re = New regExp
re.Pattern = "\bPanama\W+(?:\w+\W+){0,10}?Canal\b"
re.IgnoreCase = True
re.Global = True
For Each para In ActiveDocument.Paragraphs
  Set rng = para.Range
  rng.MoveEnd unit:=wdCharacter, Count:=-1
  Text$ = rng.Text + "Modified"
  rng.Text = re.Replace(rng.Text, Text$)
Next para
End Sub

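OK, thanks to Tim Williams's help below, I arrived at the following solution. It's clumsy in some ways and by no means pure regex, but it gets the job done. If anyone has a better solution or ideas on how to approach this, I'd be glad to hear them. Again, forcing the change through find-and-replace is a bit embarrassing, but at least it works...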

Sub RegText()
Dim re As regExp
Dim para As Paragraph
Dim rng As Range
Dim txt As String
Dim allmatches As MatchCollection, m As match
Set re = New regExp
re.pattern = "\bPanama\W+(?:\w+\W+){0,13}?Canal\b"
re.IgnoreCase = True
re.Global = True
For Each para In ActiveDocument.Paragraphs

  txt = para.Range.Text

  'any match?
  If re.Test(txt) Then
    'get all matches
    Set allmatches = re.Execute(txt)
    'look at each match and highlight the corresponding range
    For Each m In allmatches
        Debug.Print m.Value, m.FirstIndex, m.Length
        Set rng = para.Range
        rng.Collapse wdCollapseStart
        rng.MoveStart wdCharacter, m.FirstIndex
        rng.MoveEnd wdCharacter, m.Length
        rng.Font.ColorIndex = wdOrange
    Next m
  End If

Next para

Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Italic = True
With Selection.Find
    .Text = "Panama"
    .Replacement.Text = "Panama"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.Italic = True
With Selection.Find
    .Text = "Canal"
    .Replacement.Text = "Canal"
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll

Selection.Find.ClearFormatting
Selection.Find.Font.ColorIndex = wdOrange
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Font.ColorIndex = wdBlack
With Selection.Find
    .Text = ""
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
End Sub

2 Answers


I'm a long way from being a decent Word programmer, but this might get you started.

EDIT: Updated to include a parameterized version.

Sub Tester()

    HighlightIfClose ActiveDocument, "panama", "canal", wdBrightGreen
    HighlightIfClose ActiveDocument, "red", "socks", wdRed

End Sub


Sub HighlightIfClose(doc As Document, word1 As String, _
                     word2 As String, clrIndex As WdColorIndex)
    Dim re As RegExp
    Dim para As Paragraph
    Dim rng As Range
    Dim txt As String
    Dim allmatches As MatchCollection, m As match

    Set re = New RegExp
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0,10}?" _
                 & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs

      txt = para.Range.Text

      'any match?
      If re.Test(txt) Then
        'get all matches
        Set allmatches = re.Execute(txt)
        'look at each match and highlight the first and last word of it
        For Each m In allmatches
            Debug.Print m.Value, m.FirstIndex, m.Length
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex
            rng.MoveEnd wdCharacter, Len(word1)
            rng.HighlightColorIndex = clrIndex
            Set rng = para.Range
            rng.Collapse wdCollapseStart
            rng.MoveStart wdCharacter, m.FirstIndex + (m.Length - Len(word2))
            rng.MoveEnd wdCharacter, Len(word2)
            rng.HighlightColorIndex = clrIndex
        Next m
      End If

    Next para

End Sub
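
The offset arithmetic in the loop above can be checked outside Word. What follows is only a minimal sketch, written in Python rather than VBA so that it runs without Word; `keyword_spans` and `max_gap` are hypothetical names, not part of the macro. It returns the character spans of the first and last word of each match, mirroring `m.FirstIndex` and `m.FirstIndex + (m.Length - Len(word2))`. Unlike the VBA version, it escapes the two words with `re.escape` so that regex metacharacters in them are treated literally.

```python
import re

def keyword_spans(text, word1, word2, max_gap=10):
    """Return ((start, end), (start, end)) spans of word1 and word2 per match."""
    pattern = re.compile(
        r"\b" + re.escape(word1)
        + r"\W+(?:\w+\W+){0," + str(max_gap) + r"}?"
        + re.escape(word2) + r"\b",
        re.IGNORECASE,
    )
    spans = []
    for m in pattern.finditer(text):
        # word1 sits at the very start of the match, word2 at the very end
        first = (m.start(), m.start() + len(word1))
        last = (m.end() - len(word2), m.end())
        spans.append((first, last))
    return spans

text = "The Panama project built a famous canal long ago."
for first, last in keyword_spans(text, "panama", "canal"):
    print(text[first[0]:first[1]], text[last[0]:last[1]])
# prints: Panama canal
```

In Word, each span would still have to be turned into a Range relative to the start of the paragraph, as the macro does with MoveStart and MoveEnd.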
answered 2012-07-06T23:11:14.777

If you're only doing two words at a time, this worked for me, along the lines of what you've been trying.

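foo ([a-zA-Z0-9]+? ){0,10}bar

Explanation: it grabs word 1 (foo) and the space after it, then matches any word made of alphanumeric characters ([a-zA-Z0-9]+?) followed by a space ( ), between 0 and 10 times ({0,10}), and finally word 2 (bar). Without the space after foo, the pattern could not match "foo this that bar", because the repeated group has to start with an alphanumeric character.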

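This does not match across periods (I don't know whether you want that); if you do, just add . after 0-9 in the character class.

So your (pseudocode) syntax would be something like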

$matches = preg_match_all(); // Your function to get regex matches in an array

foreach (those matches) {
    replace(KEY_WORD, <i>KEY_WORD</i>);
}
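
The pseudocode above could be made concrete roughly as follows. This is a sketch in Python, not the PHP-style calls of the pseudocode; `italicize_pair` and `max_gap` are hypothetical names, and the pattern is written with a literal space after the first word, which is an assumption about the intended pattern. Capture groups keep the words in between untouched, so only the two keywords are wrapped in `<i>` tags.

```python
import re

def italicize_pair(text, word1, word2, max_gap=10):
    """Wrap word1 and word2 in <i> tags when at most max_gap words separate them."""
    pattern = re.compile(
        "(" + re.escape(word1) + ")"                      # group 1: first keyword
        + "( (?:[a-zA-Z0-9]+ ){0," + str(max_gap) + "})"  # group 2: words in between
        + "(" + re.escape(word2) + ")"                    # group 3: second keyword
    )
    return pattern.sub(r"<i>\1</i>\2<i>\3</i>", text)

print(italicize_pair("foo this that bar blah", "foo", "bar"))
# prints: <i>foo</i> this that <i>bar</i> blah
```

Like the pattern it is based on, this has no word boundaries, so it would also fire on a word that merely ends in foo or starts with bar; adding `\b` on both sides is one way to tighten it.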

Hope it helps. Tests are below, showing what it does and does not match.


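Worked:

foo this that bar blah

foo economic order war bar

Did not work:

foo economic order. war bar

global foo order has existed for centuries and over that time people have developed different and complex trade relationships to deal with situations such as agriculture and bar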
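
The four test sentences can also be checked mechanically. This is a minimal sketch in Python, assuming the pattern is written with a space after `foo` and using English renderings of the sentences; the long one is a paraphrase, and what matters is only that more than 10 words separate `foo` from `bar`.

```python
import re

PATTERN = re.compile(r"foo ([a-zA-Z0-9]+? ){0,10}bar")

samples = [
    "foo this that bar blah",
    "foo economic order war bar",
    "foo economic order. war bar",  # the period breaks the word-plus-space run
    "global foo order has existed for centuries and over that time people "
    "have developed different and complex trade relationships to deal with "
    "situations such as agriculture and bar",  # more than 10 words apart
]

results = [bool(PATTERN.search(s)) for s in samples]
for sample, matched in zip(samples, results):
    print("match   " if matched else "no match", "|", sample[:40])
```

The first two sentences match and the last two do not, which agrees with the "Worked" and "Did not work" groups.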

answered 2012-07-06T05:34:24.630