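This problem would have been simple, but one extra requirement is giving me a headache. I don't need every highlighted "word" in a Word file; I need the highlighted "phrases". I have written the following code: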

using System;
using System.Collections.Generic;
using System.Windows.Forms;
using Word = Microsoft.Office.Interop.Word;

private void button1_Click(object sender, EventArgs e)
{
    try
    {
        Word.ApplicationClass wordObject = new Word.ApplicationClass();
        wordObject.Visible = false;
        object file = "D:\\mywordfile.docx";
        object nullobject = System.Reflection.Missing.Value;
        Word.Document thisDoc = wordObject.Documents.Open(ref file, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject);
        List<string> wordHighlights = new List<string>();

        //Let myRange be some Range which has my text under consideration

        int prevStart = 0;
        int prevEnd = 0;
        int thisStart = 0;
        int thisEnd = 0;
        string tempStr = "";
        foreach (Word.Range cellWordRange in myRange.Words)
        {
            if (cellWordRange.HighlightColorIndex == Word.WdColorIndex.wdNoHighlight)
            {
                continue;
            }
            else
            {
                thisStart = cellWordRange.Start;
                thisEnd = cellWordRange.End;
                string cellWordText = cellWordRange.Text.Trim();
                if (cellWordText.Length >= 1)   // valid word length, non-whitespace
                {
                    if (thisStart == prevEnd)    // If this word is contiguously highlighted with previous highlighted word
                    {
                        tempStr = String.Concat(tempStr, " "+cellWordText);  // Concatenate with previous contiguously highlighted word
                    }
                    else
                    {
                        if (tempStr.Length > 0)    // If some string has been concatenated in previous iterations
                        {
                            wordHighlights.Add(tempStr);
                        }
                        tempStr = "";
                        tempStr = cellWordText;
                    }
                }
                prevStart = thisStart;
                prevEnd = thisEnd;
            }
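        }

        // Add the last phrase; the loop only stores a phrase when the next one begins
        if (tempStr.Length > 0)
        {
            wordHighlights.Add(tempStr);
        }
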
        foreach (string highlightedString in wordHighlights)
        {
            MessageBox.Show(highlightedString);
        }
    }
    catch (Exception j)
    {
        MessageBox.Show(j.Message);
    }
}

Now consider the following text:

Le thé vert a un rôle dans la diminution du cholestérol, la burning des graisses, la prevention du diabète et les AVC, et conjurer la démence.

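Now suppose someone highlights "du cholestérol". My code obviously picks up two words, "du" and "cholestérol". How can I make a contiguously highlighted region come back as a single item? That is, "du cholestérol" should be returned as one entry in the List. Is there some logic that scans the document character by character, marking the point where highlighting starts as the start of a selection and the point where it ends as the end of that selection?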

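PS: If a library in any other language offers the required functionality, please let me know, since the scenario is not language-specific. I just need to get the desired result somehow.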

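EDIT: I modified the code to use Start and End, as Oliver Hanappi suggested. The problem remains that if two such highlighted phrases are separated by only a single space, the program treats them as one phrase, simply because it reads Words rather than spaces. Perhaps the check if (thisStart == prevEnd) needs some changes?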
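
The character-scan idea can be sketched independently of the Word API. The sketch below assumes a per-character highlight flag has already been obtained somehow (for example by walking Range.Characters, which can be slow on large documents). HighlightScanner, ExtractPhrases, and the bool[] input are hypothetical names used for illustration only; they are not part of the interop library.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class HighlightScanner
{
    // Groups runs of consecutive highlighted characters into phrases.
    // Assumption: highlighted[i] says whether text[i] is highlighted.
    public static List<string> ExtractPhrases(string text, bool[] highlighted)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (highlighted == null) throw new ArgumentNullException(nameof(highlighted));
        if (text.Length != highlighted.Length)
            throw new ArgumentException("text and highlighted must have the same length");

        var phrases = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < text.Length; i++)
        {
            if (highlighted[i])
            {
                current.Append(text[i]);   // still inside a highlighted run
            }
            else
            {
                Flush(current, phrases);   // an unhighlighted character ends the run
            }
        }
        Flush(current, phrases);           // do not lose a run that reaches the end
        return phrases;
    }

    // Stores the pending run (trimmed) if it contains any non-whitespace text.
    private static void Flush(StringBuilder current, List<string> phrases)
    {
        string phrase = current.ToString().Trim();
        if (phrase.Length > 0) phrases.Add(phrase);
        current.Clear();
    }
}
```

Because an unhighlighted space breaks the run, two phrases separated only by that space come back as separate entries, which is the case the Start/End comparison on Words misses.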

4 Answers

You can do this more efficiently with Find, which searches faster and selects all of the matching contiguous text. See the reference here: http://msdn.microsoft.com/en-us/library/office/bb258967%28v=office.12%29.aspx

Here is an example in VBA that prints every occurrence of highlighted text:

Sub TestFind()
  Dim myRange As Range
  Set myRange = ActiveDocument.Content    ' search entire document

  With myRange.Find
    .Highlight = True
    Do While .Execute = True     ' loop while highlighted text is found
      Debug.Print myRange.Text   ' myRange is changed to contain the found text
    Loop
  End With
End Sub

Hope this helps you understand it better.

answered 2013-03-20T14:12:20.070

You can look at the Start and End properties of the ranges and check whether the end of the first range equals the start of the second.

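Alternatively, you can move the range by one word (see WdUnits.wdWord) and then check whether the start and end of the moved range equal the start and end of the second word.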
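
A minimal sketch of the Start/End comparison, using a hypothetical WordSpan class in place of Word.Range so that it runs without Word installed. It assumes the spans passed in are only the highlighted ones, in document order. WordSpan, SpanMerger, and MergeContiguous are illustrative names, not interop types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a highlighted Word.Range: character offsets plus text.
public sealed class WordSpan
{
    public int Start { get; }
    public int End { get; }
    public string Text { get; }

    public WordSpan(int start, int end, string text)
    {
        Start = start;
        End = end;
        Text = text;
    }
}

public static class SpanMerger
{
    // Joins spans whose Start equals the previous span's End into one phrase.
    public static List<string> MergeContiguous(IEnumerable<WordSpan> highlightedSpans)
    {
        var phrases = new List<string>();
        string current = null;   // phrase being built, or null if none yet
        int prevEnd = -1;        // End of the last span in the current run

        foreach (WordSpan span in highlightedSpans)
        {
            string word = span.Text.Trim();
            if (word.Length == 0)
            {
                // Whitespace-only span: extend the run only if it touches it.
                if (span.Start == prevEnd) prevEnd = span.End;
                continue;
            }

            if (current != null && span.Start == prevEnd)
            {
                current += " " + word;                       // contiguous: same phrase
            }
            else
            {
                if (current != null) phrases.Add(current);   // gap: close the old phrase
                current = word;
            }
            prevEnd = span.End;
        }

        if (current != null) phrases.Add(current);           // keep the final phrase
        return phrases;
    }
}
```

One caveat: ranges taken from Range.Words typically include their trailing whitespace, so two phrases separated only by an unhighlighted space can still look contiguous under this check.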

answered 2013-03-20T13:29:31.633

grahamj42's answer is fine; I have translated it to C#. To find matches in the whole document, use:

Word.Range content = thisDoc.Content;

But keep in mind that this covers only the main story range. To match text elsewhere, for example in footnotes, you need to use:

Word.StoryRanges stories = null;
stories = thisDoc.StoryRanges;
Word.Range footnoteRange = stories[Word.WdStoryType.wdFootnotesStory];

My code:

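// Requires: using System.Runtime.InteropServices; (for Marshal)
// range is the Word.Range to search, e.g. thisDoc.Content
Word.Find find = null;
Word.Range duplicate = null;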
try
{
    duplicate = range.Duplicate;
    find = duplicate.Find;
    find.Highlight = 1;

    object str = "";
    object missing = System.Type.Missing;
    object objTrue = true;
    object replace = Word.WdReplace.wdReplaceNone;

    bool result = find.Execute(ref str, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue, ref str, ref replace, ref missing, ref missing, ref missing, ref missing);
    while (result)
    {
        // code to store range text
        // use duplicate.Text property
        result = find.Execute(ref str, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, ref objTrue, ref str, ref replace, ref missing, ref missing, ref missing, ref missing);
    }
}
finally
{
    if (find != null) Marshal.ReleaseComObject(find);
    if (duplicate != null) Marshal.ReleaseComObject(duplicate);
}

answered 2013-04-04T14:06:20.100

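I started with Oliver's logic and things seemed fine, but testing showed that this approach does not account for spaces, so highlighted phrases separated only by a space were not split apart. I then took the VB approach provided by grahamj42, put it in a class library, and referenced that library from my C# Windows Forms project.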

My C# Windows Forms project:

using Word = Microsoft.Office.Interop.Word;

Then I changed the try block to:

Word.ApplicationClass wordObject = new Word.ApplicationClass();
wordObject.Visible = false;
object file = "D:\\mywordfile.docx";
object nullobject = System.Reflection.Missing.Value;
Word.Document thisDoc = wordObject.Documents.Open(ref file, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject, ref nullobject);

List<string> wordHighlights = new List<string>();


// Let myRange be some Range that has already been selected programmatically here


WordMacroClasses.Highlighting macroObj = new WordMacroClasses.Highlighting();
List<string> hiWords = macroObj.HighlightRange(myRange, myRange.End);
foreach (string hitext in hiWords)
{
    wordHighlights.Add(hitext);
}

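Here is the Range.Find code in the VB class library. It simply takes a Range and its end position (passed as myRange.End in the call above) and returns a List(Of String):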

Public Class Highlighting
    Public Function HighlightRange(ByVal myRange As Microsoft.Office.Interop.Word.Range, ByVal rangeLimit As Integer) As List(Of String)

        Dim Highlights As New List(Of String)

        With myRange.Find
            .Highlight = True
            Do While .Execute = True     ' loop while highlighted text is found
                ' Find keeps searching past the original range, so stop at the limit
                If myRange.Start >= rangeLimit Then Exit Do
                Highlights.Add(myRange.Text)
            Loop
        End With
        Return Highlights
    End Function
End Class

answered 2013-03-22T10:44:26.483