What I need to do is read a Word file and, based on the font properties of the text, add a tag ahead of each piece of text to mark it as a header or a paragraph. However, I need to do this using Perl. Is it possible? Any help will be appreciated. Thanks!
3 Answers
4
@Nikita, this will give you a detailed view of how it can be done:
#!/usr/bin/perl
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Win32::OLE::Const 'Microsoft Word';
#$Win32::OLE::CP = CP_UTF8;
binmode STDOUT, ':encoding(UTF-8)';
# OPEN FILE SPECIFIED AS FIRST ARGUMENT
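my $fname = $ARGV[0] or die "Usage: $0 <word-file>\n";
# Convert the Cygwin path to an absolute Windows path (quoted, so spaces survive)
my $fnameFullPath = `cygpath.exe -wa "$fname"`;
$fnameFullPath =~ s/\\/\\\\/g;
$fnameFullPath =~ s/\s*$//;
unless (-e $fnameFullPath) { print "Error: file does not exist\n"; exit 1; }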
# STARTING OLE
my $Word = Win32::OLE->GetActiveObject('Word.Application')
|| Win32::OLE->new('Word.Application','Quit')
or die Win32::OLE->LastError();
$Word->{'Visible'} = 0;
my $doc = $Word->Documents->Open($fnameFullPath)
    or die Win32::OLE->LastError();
my $paragraphs = $doc->Paragraphs();
my $enumerate = Win32::OLE::Enum->new($paragraphs);
# PROCESSING PARAGRAPHS
while (defined(my $paragraph = $enumerate->Next())) {
    my $text = $paragraph->{Range}->{Text};
    # Read the font from this paragraph's own Range. $Word->Selection stays
    # at the cursor and would report the same font for every paragraph.
    my $font = $paragraph->{Range}->{Font};
    if ($font->{Size} == 18) {
        print "Text: ", $text, "\n";
        print "Font Bold: ", $font->{Bold}, "\n";
        print "Font Italic: ", $font->{Italic}, "\n";
        print "Font Name: ", $font->{Name}, "\n";
        print "Font Size: ", $font->{Size}, "\n";
        print "=========\n";
    }
}
# CLOSING OLE
$doc->Close(0);    # 0 = wdDoNotSaveChanges
$Word->Quit;
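The output looked like this:
Text: This is a doc file containing different fonts and sizes, the document also contains a header and footer (Font: TNR, Size: 18)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========
Text: This is a Perl example (Font TNR, Size: 12)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========
Text: This is a Python example (Font: Courier New, Size: 10)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========
Note that all three entries report Times New Roman, size 18, although the second and third paragraphs say they use other fonts. That is what happens when the font is read from $Word->Selection, which does not move as the enumeration advances; reading it from $paragraph->{Range}->{Font} reports each paragraph's own font.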
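The question asks for a tag ahead of the text rather than a font dump, so here is a minimal sketch of that last step in plain Perl, with no Word required. Everything in it is an assumption for illustration: the names classify_font and tag_text, the [HEADER] / [PARA] tag format, and the size thresholds are hypothetical, so adjust them to your documents. It also assumes Word reports 9999999 (wdUndefined) when a range mixes several sizes or weights, and treats such paragraphs as body text instead of guessing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: Word returns 9999999 (wdUndefined) for a property that
# has more than one value inside the range.
use constant WD_UNDEFINED => 9999999;

# Hypothetical rule: 16pt and up is a header, and so is bold text of 14pt and up.
sub classify_font {
    my (%font) = @_;
    my $size = $font{size} // 0;
    my $bold = $font{bold} // 0;
    return 'PARA' if $size == WD_UNDEFINED;    # mixed sizes: do not guess
    # OLE reports true as -1, so test for "not zero" rather than "== 1"
    my $is_bold = ($bold != 0 && $bold != WD_UNDEFINED);
    return 'HEADER' if $size >= 16;
    return 'HEADER' if $is_bold && $size >= 14;
    return 'PARA';
}

# Put the tag ahead of the text, dropping the paragraph mark at the end.
sub tag_text {
    my ($text, %font) = @_;
    $text =~ s/[\r\n]+\z//;
    return '[' . classify_font(%font) . '] ' . $text;
}

print tag_text("Introduction\r", size => 18, bold => 0), "\n";
print tag_text("Body text\r",    size => 12, bold => 0), "\n";
```

Inside the paragraph loop this would be called as tag_text($text, size => $font->{Size}, bold => $font->{Bold}), with $font taken from the paragraph's own Range.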
answered 2013-03-21T13:55:16.387
3
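I need more information to help you identify the words you need to process. In my example, I simply search for the text 'Some' (it appears in my *.docx file).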
#!/usr/bin/perl
use Modern::Perl;
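use File::Spec;
use Win32::OLE qw(in with);
use Win32::OLE::Variant;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;    # die on OLE errors
print "Starting Word\n";
my $Word = Win32::OLE->GetActiveObject('Word.Application') ||
    Win32::OLE->new('Word.Application');
$Word->{'Visible'} = 1;
$Word->{DisplayAlerts} = 0;
# Word resolves relative paths against its own working directory,
# so pass an absolute path
my $File = $Word->Documents->Open( File::Spec->rel2abs("fonts.docx") )
    or die Win32::OLE->LastError;
# Move to the start of the document, then search for the text
$Word->Selection->HomeKey(wdStory);
$Word->Selection->Find->{'Text'} = 'Some';
$Word->Selection->Find->Execute() or die "Text not found\n";
# After a successful Find, the Selection covers the match
say "Font size: [", $Word->Selection->Font->Size(), "]";
say "Font name: [", $Word->Selection->Font->Name(), "]";
$File->Close(0);    # 0 = wdDoNotSaveChanges
$Word->Quit;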
answered 2013-03-21T13:57:45.523
0
Try using OLE automation; the Win32::OLE module is helpful here. This approach requires a deeper understanding of the Word OLE API.
answered 2013-03-21T12:34:30.753