27

如何使用 VBA 获取 word 文档中所有标题的列表?

4

7 回答 7

22

您的意思是像这样的 createOutline函数(实际上将所有标题从源 Word 文档复制到新的 Word 文档中):

(我相信 asrHeadings =函数是这个程序的关键,应该允许你检索你想要的东西)_docSource.GetCrossReferenceItems(wdRefTypeHeading)

Public Sub CreateOutline()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the main body of the document, not the headers/footer.        
    Set rng = docOutline.Content
    ' GetCrossReferenceItems(wdRefTypeHeading) returns an array with references to all headings in the document
    astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    GetLevel = (intDiff / 2) + 1
End Function

@kol 于 2018 年 3 月 6 日更新

虽然astrHeadings是一个数组(IsArray返回TrueTypeName返回String()) ,type mismatch但当我尝试在 VBScript(Windows 10 Pro 1709 16299.248 上的 v5.8.16384)中访问其元素时出现错误。这一定是 VBScript 特有的问题,因为如果我在 Word 的 VBA 编辑器中运行相同的代码,我就可以访问这些元素。我最终迭代了 TOC 的行,因为它甚至可以在 VBScript 中工作:

For Each Paragraph In Doc.TablesOfContents(1).Range.Paragraphs
  WScript.Echo Paragraph.Range.Text
Next
于 2008-11-08T15:30:04.307 回答
18

获取标题列表的最简单方法是遍历文档中的段落,例如:

 Sub ReadPara()

    Dim DocPara As Paragraph

    For Each DocPara In ActiveDocument.Paragraphs

     If Left(DocPara.Range.Style, Len("Heading")) = "Heading" Then

       Debug.Print DocPara.Range.Text

     End If

    Next


End Sub

顺便说一句,我发现删除段落范围的最后一个字符是个好主意。否则,如果将字符串发送到消息框或文档,Word 会显示一个额外的控制字符。例如:

Left(DocPara.Range.Text, len(DocPara.Range.Text)-1)
于 2008-11-09T20:07:54.170 回答
2

这个宏对我来说效果很好(Word 2010)。我稍微扩展了功能:现在它提示用户输入最低级别,并在该级别以下抑制子标题。

Public Sub CreateOutline()
' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim intLevel As Integer
    Dim intItem As Integer
    Dim minLevel As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    minLevel = 1  'levels above this value won't be copied.
    minLevel = CInt(InputBox("This macro will generate a new document that contains only the headers from the existing document. What is the lowest level heading you want?", "2"))

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footer.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        If intLevel <= minLevel Then

            ' Add the text to the document.
            rng.InsertAfter strText & vbNewLine

            ' Set the style of the selected range and
            ' then collapse the range for the next entry.
            rng.Style = "Heading " & intLevel
            rng.Collapse wdCollapseEnd
        End If
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.

    Dim strTemp As String
    Dim strOriginal As String
    Dim intDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    intDiff = Len(strOriginal) - Len(strTemp)
    GetLevel = (intDiff / 2) + 1
End Function
于 2012-08-07T04:19:04.873 回答
1

提取所有标题的最快方法(到 LEVEL5)。

Sub EXTRACT_HDNGS()
Dim WDApp As Word.Application    'WORD APP
Dim WDDoc As Word.Document       'WORD DOC

Set WDApp = Word.Application
Set WDDoc = WDApp.ActiveDocument

For Head_n = 1 To 5
Head = ("Heading " & Head_n)
WDApp.Selection.HomeKey wdStory, wdMove

    Do
       With WDApp.selection
      .MoveStart Unit:=wdLine, Count:=1    
      .Collapse Direction:=wdCollapseEnd
       End with
        With WDApp.Selection.Find
          .ClearFormatting:          .text = "":     
          .MatchWildcards = False:   .Forward = True
          .Style = WDDoc.Styles(Head)
         If .Execute = False Then GoTo Level_exit
            .ClearFormatting
        End With

       Heading_txt = RemoveSpecialChar(WDApp.Selection.Range.text, 1):              Debug.Print Heading_txt
       Heading_lvl = WDApp.Selection.Range.ListFormat.ListLevelNumber:              Debug.Print Heading_lvl
       Heading_lne = WDDoc.Range(0, WDApp.Selection.Range.End).Paragraphs.Count:    Debug.Print Heading_lne
       Heading_pge = WDApp.Selection.Information(wdActiveEndPageNumber):            Debug.Print Heading_pge

       If Wdapp.Selection.Style = "Heading 1" Then GoTo Level_exit
       Wdapp.Selection.Collapse Direction:=wdCollapseStart
   Loop
Level_exit:
Next Head_n

End Sub
于 2014-01-27T14:10:52.507 回答
1

在 Wikis 对 VonC 答案的评论之后,这里是对我有用的代码。它使函数更快。

Public Sub CopyHeadingsInNewDoc()
    Dim docOutline As Word.Document
    Dim docSource As Word.Document
    Dim rng As Word.Range

    Dim astrHeadings As Variant
    Dim strText As String
    Dim longLevel As Integer
    Dim longItem As Integer

    Set docSource = ActiveDocument
    Set docOutline = Documents.Add

    ' Content returns only the
    ' main body of the document, not
    ' the headers and footer.
    Set rng = docOutline.Content
    astrHeadings = _
     docSource.GetCrossReferenceItems(wdRefTypeHeading)

    For intItem = LBound(astrHeadings) To UBound(astrHeadings)
        ' Get the text and the level.
        strText = Trim$(astrHeadings(intItem))
        intLevel = GetLevel(CStr(astrHeadings(intItem)))

        ' Add the text to the document.
        rng.InsertAfter strText & vbNewLine

        ' Set the style of the selected range and
        ' then collapse the range for the next entry.
        rng.Style = "Heading " & intLevel
        rng.Collapse wdCollapseEnd
    Next intItem
End Sub

Private Function GetLevel(strItem As String) As Integer
    ' Return the heading level of a header from the
    ' array returned by Word.

    ' The number of leading spaces indicates the
    ' outline level (2 spaces per level: H1 has
    ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces.

    Dim strTemp As String
    Dim strOriginal As String
    Dim longDiff As Integer

    ' Get rid of all trailing spaces.
    strOriginal = RTrim$(strItem)

    ' Trim leading spaces, and then compare with
    ' the original.
    strTemp = LTrim$(strOriginal)

    ' Subtract to find the number of
    ' leading spaces in the original string.
    longDiff = Len(strOriginal) - Len(strTemp)
    GetLevel = (longDiff / 2) + 1
End Function
于 2015-02-06T10:42:05.530 回答
1

为什么要重新发明轮子这么多次?!?

“所有标题列表”只是文档的标准 Word 索引!

这是我在向文档添加索引时录制宏得到的:

Sub Macro1()
    ActiveDocument.TablesOfContents.Add Range:=Selection.Range, _
        RightAlignPageNumbers:=True, _
        UseHeadingStyles:=True, _
        UpperHeadingLevel:=1, _
        LowerHeadingLevel:=5, _
        IncludePageNumbers:=True, _
        AddedStyles:="", _
        UseHyperlinks:=True, _
        HidePageNumbersInWeb:=True, _
        UseOutlineLevels:=True
End Sub
于 2017-01-13T08:23:46.953 回答
0

您还可以在文档中创建目录并复制它。这将 para ref 从标题中分离出来,如果您需要在另一个上下文中呈现它,这很方便。如果您不希望文档中包含 ToC,只需在 Copy n Paste 之后将其删除。JK。

于 2012-02-28T16:47:44.140 回答