How to read data from an MS Word document using a Perl module that works on Linux

2 Answers

Text::Extract::Word looks like it could be a good starting point. From the synopsis:

# object-based interface
use Text::Extract::Word;
my $file = Text::Extract::Word->new("test1.doc");
my $text = $file->get_text();
my $body = $file->get_body();
my $footnotes = $file->get_footnotes();
my $headers = $file->get_headers();
my $annotations = $file->get_annotations();
my $bookmarks = $file->get_bookmarks();

# specify :raw if you don't want the text cleaned
my $raw = $file->get_text(':raw');

# legacy interface
use Text::Extract::Word qw(get_all_text);
my $text = get_all_text("test1.doc");
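
In a real script the file may be missing, unreadable, or not something the module can parse, so it can help to wrap the call. The following is only a minimal sketch: `extract_word_text` is a hypothetical helper name, not part of Text::Extract::Word, and it uses nothing beyond the `new` and `get_text` calls shown in the synopsis above. As far as I know the module targets the older binary .doc format rather than .docx, but treat that as an assumption to verify against its documentation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Extract::Word): returns the
# document text, or undef plus a warning when extraction is not possible.
sub extract_word_text {
    my ($path) = @_;

    # Reject missing or unreadable files before touching the module.
    unless (defined $path && -f $path && -r _) {
        warn "Cannot read file: " . (defined $path ? $path : '(undef)') . "\n";
        return;
    }

    # Load the module at run time so that a missing installation shows up
    # as a warning here instead of a compile-time failure.
    my $text;
    my $ok = eval {
        require Text::Extract::Word;
        $text = Text::Extract::Word->new($path)->get_text();
        1;
    };
    unless ($ok) {
        warn "Extraction failed for $path: $@";
        return;
    }
    return $text;
}

# Usage: perl extract.pl test1.doc
my $path = shift @ARGV;
if (defined $path) {
    my $text = extract_word_text($path);
    print $text if defined $text;
}
```

Returning undef rather than dying lets a caller loop over many documents and skip the ones that fail.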
Answered 2013-03-25T13:43:06.577

I use OLE for Word, Excel, and Outlook:

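use strict;
use warnings;
use Win32::OLE;    # Windows only; needs an installed copy of Word

my $docfile = "C:\\something.doc";

# Reuse a running Word instance if there is one, otherwise start a new one.
my $Word = Win32::OLE->GetActiveObject('Word.Application');
unless ($Word) {
    $Word = Win32::OLE->new('Word.Application', sub { $_[0]->Quit; })
        or die "Could not start Word\n";
}
$Word->{visible} = 1;

my $File = $Word->Documents->Open($docfile)
    or die "Could not open $docfile\n";
$File->PrintOut();    # prints the document on Word's active printer
$File->Close();
$Word->Quit();
undef $File;
undef $Word;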
Answered 2013-03-25T13:48:34.010