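XWPFWordExtractor does not provide a method for extracting footnotes separately, the way WordExtractor does.
But XWPFDocument provides XWPFDocument.getFootnotes, which returns a java.util.List<XWPFFootnote>. So one can get the individual footnotes from that List.
Example: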
import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.usermodel.*;

public class WordExtractFootnotes {

    public static void main(String[] args) throws Exception {

        // HWPF - binary *.doc format: the extractor returns the footnote texts directly
        WordExtractor extractor = new WordExtractor(new FileInputStream("WordWithFootnotes.doc"));
        String[] hwpfFootnotes = extractor.getFootnoteText();
        for (String footnote : hwpfFootnotes) {
            System.out.println("[" + footnote + "]");
        }
        extractor.close();

        System.out.println();

        // XWPF - Office Open XML *.docx format: walk the footnotes of the document
        XWPFDocument document = new XWPFDocument(new FileInputStream("WordWithFootnotes.docx"));
        List<XWPFFootnote> xwpfFootnotes = document.getFootnotes();
        for (XWPFFootnote footnote : xwpfFootnotes) {
            StringBuilder footnoteText = new StringBuilder();
            footnoteText.append("[").append(footnote.getId()).append(":");
            // A footnote may contain several paragraphs; join them with line breaks
            boolean first = true;
            for (XWPFParagraph paragraph : footnote.getParagraphs()) {
                if (!first) footnoteText.append("\n");
                first = false;
                footnoteText.append(paragraph.getText());
            }
            footnoteText.append("]");
            System.out.println(footnoteText);
        }
        document.close();
    }
}
The footnotes with id -1 and 0 must be ignored, since they are for internal use only and are not referenced in the document.
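Skipping those internal footnotes could look roughly like the sketch below. To keep it runnable without POI on the classpath, it uses a hypothetical Note class as a stand-in for XWPFFootnote, and it assumes the id is a BigInteger, which is what getId() returns in recent POI versions as far as I know. The names Note, FootnoteFilter and userNotes are made up for illustration.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for XWPFFootnote: an id plus the joined paragraph text.
final class Note {
    final BigInteger id;
    final String text;

    Note(long id, String text) {
        this.id = BigInteger.valueOf(id);
        this.text = text;
    }
}

class FootnoteFilter {

    // Keeps only notes with an id greater than 0, which drops the
    // internal entries with id -1 and 0 (and any other non-positive id).
    static List<Note> userNotes(List<Note> all) {
        List<Note> result = new ArrayList<>();
        for (Note note : all) {
            if (note.id.signum() > 0) {
                result.add(note);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Note> all = Arrays.asList(
            new Note(-1, ""),               // internal, to be skipped
            new Note(0, ""),                // internal, to be skipped
            new Note(1, "First footnote"),
            new Note(2, "Second footnote"));
        for (Note note : userNotes(all)) {
            System.out.println("[" + note.id + ":" + note.text + "]");
        }
    }
}
```

In the XWPF loop above, the same check would presumably be a single line at the top of the loop body, something like if (footnote.getId().signum() <= 0) continue; but I have not tested that against every POI version.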