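In VB.NET (VSTO), I have designed and built an algorithm that breaks any text into sentences. Language specialists use it to determine the readability of a text (I am automating the readability analysis as well). I don't rely on wildcard Find because it is too limited. The algorithm has to handle quoted text (even when it is part of a larger sentence), lists, and headings.
So far, so good. The algorithm works as expected.
I thought it would be useful to show the result of the "break text into sentences" algorithm by highlighting the various pieces of text in the document in color. While doing so I ran into inconsistent behavior in Word (Office 2016 Professional 32-bit on Win10). Before I engineer a workaround, I wanted to share it and see whether anyone can offer more insight. Am I missing something?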
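Outside a table, I can set a range to any text and then set the .HighlightColorIndex property; the color changes and is visible in the Word editor.
Inside a table cell, it works the same way as long as the text is not followed by a paragraph mark (vbCr) (I do not include the vbCr in my range). When the text is followed by a paragraph mark, .HighlightColorIndex does change (confirmed in the debugger), but the color does not visibly change in the Word editor. It only works if I include the paragraph mark in my range. Outside a table that is not required.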
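To make the behavior described above easier to reproduce and compare, a small diagnostic helper along these lines might be useful. This is only a sketch: the module name `HighlightDiagnostics` and the sub `HighlightAndReport` are hypothetical and not part of the original code, and it assumes a reference to the Word interop assembly. It applies the highlight, then logs whether the range lies inside a table, whether its text ends with a paragraph mark, and what Word reports for the property afterwards.

```vbnet
Imports System.Diagnostics
Imports Word = Microsoft.Office.Interop.Word

Module HighlightDiagnostics
    ' Hypothetical helper (not from the original post): applies a highlight
    ' color to the range and logs what Word reports back.
    Sub HighlightAndReport(rng As Word.Range, color As Word.WdColorIndex)
        ' Range.Information returns an Object; wdWithInTable yields True/False.
        Dim inTable As Boolean = CBool(rng.Information(Word.WdInformation.wdWithInTable))

        ' Range.Text can be Nothing for an empty range.
        Dim rangeText As String = If(rng.Text, String.Empty)
        Dim endsWithCr As Boolean = rangeText.EndsWith(vbCr)

        rng.HighlightColorIndex = color

        ' Read the property back so the log shows the stored value,
        ' independent of what the editor actually renders.
        Debug.WriteLine(String.Format(
            "Start={0}, End={1}, InTable={2}, EndsWithCr={3}, HighlightAfterSet={4}",
            rng.Start, rng.End, inTable, endsWithCr, rng.HighlightColorIndex))
    End Sub
End Module
```

Calling it twice on the same sentence in a table cell, once with the paragraph mark excluded and once with it included, gives two log lines that can be compared against what is visible in the editor.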
The basic code flow (partly pseudocode), with some additional comments for clarity:
For Each para As Paragraph In selectedRange.Paragraphs
    ' Identify a sentence by looping over para.Range.Text,
    ' looking at punctuation marks, quotes, abbreviations, false positives, etc.
    Do
        ... complicated logic ...

        ' If a sentence is identified, find it so we have the right range.
        ' This find is actually in a Sub FindSentence(rng, sentence),
        ' shown here for readability.
        ' The sub includes some complicated logic to overcome the Find text length limit.
        ' para.Range returns a new Range object on every call, so keep one
        ' reference and run both the search and the result check on it.
        Dim rng As Range = para.Range
        If rng.Find.Execute(FindText:=sentence.Text) Then
            ' This code is actually in Sub Highlight(rng, sentence),
            ' shown here in the main code for readability.
            ' The debugger shows that the properties are changed.
            ' If the range is in a table, the highlight is not shown in Word
            ' unless the found range includes the vbCr.
            ' The exact same logic works fine when the text is not in a table.
            ' The table behavior is problematic because a table cell
            ' can contain multiple sentences and only the last one would have the vbCr.
            Select Case sentence.Type
                Case EtSentenceType.Normal
                    rng.HighlightColorIndex = WdColorIndex.wdGray25
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Question
                    rng.HighlightColorIndex = WdColorIndex.wdBrightGreen
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Exclamation
                    rng.HighlightColorIndex = WdColorIndex.wdRed
                    rng.Font.ColorIndex = WdColorIndex.wdWhite
                Case EtSentenceType.SingleQuote, EtSentenceType.DoubleQuote
                    rng.HighlightColorIndex = WdColorIndex.wdTurquoise
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.Heading
                    rng.HighlightColorIndex = WdColorIndex.wdYellow
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
                Case EtSentenceType.List
                    rng.HighlightColorIndex = WdColorIndex.wdPink
                    rng.Font.ColorIndex = WdColorIndex.wdAuto
            End Select
        End If
    Loop ' the exit condition is part of the omitted logic
Next
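The comments above mention that FindSentence works around the Find text length limit, but the post does not show how. One possible approach, sketched here purely as an assumption and not as the author's actual logic, is to search for only the first 255 characters (the limit Word's Find accepts) and then extend the hit to the full sentence length and verify the text. The module name `FindSketch`, the constant `MaxFindLength`, and the function signature are all hypothetical. The sketch also assumes the sentence contains no Find special characters such as `^`, and that each character occupies exactly one range position, which may not hold around fields or hidden text.

```vbnet
Imports Word = Microsoft.Office.Interop.Word

Module FindSketch
    ' Word's Find rejects search strings longer than 255 characters.
    Private Const MaxFindLength As Integer = 255

    ' Hypothetical helper: returns the range covering sentenceText inside
    ' searchIn, or Nothing when the sentence cannot be located.
    Function FindSentence(searchIn As Word.Range, sentenceText As String) As Word.Range
        ' Work on a copy so the caller's range is left untouched.
        Dim hit As Word.Range = searchIn.Duplicate

        ' Search only for the leading part when the sentence is too long.
        Dim probe As String = sentenceText
        If probe.Length > MaxFindLength Then
            probe = probe.Substring(0, MaxFindLength)
        End If

        With hit.Find
            .ClearFormatting()
            .Text = probe
            .Forward = True
            .Wrap = Word.WdFindWrap.wdFindStop
            .MatchCase = True
            .MatchWildcards = False
        End With

        If Not hit.Find.Execute() Then Return Nothing

        If sentenceText.Length > probe.Length Then
            ' Extend the hit to the full sentence and confirm the text matches.
            hit.SetRange(hit.Start, hit.Start + sentenceText.Length)
            If hit.Text <> sentenceText Then Return Nothing
        End If

        Return hit
    End Function
End Module
```

Returning a separate Range (rather than mutating the paragraph's range) keeps the highlighting step independent of the search, so the same range can be passed on to a sub such as Highlight(rng, sentence).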