0

What I need to do is read a Word file and, based on the font properties of each piece of text, add a tag in front of it to mark it as a header or a paragraph. However, I need to do this in Perl. Is that possible? Any help will be appreciated. Thanks!

4

3 Answers

4

@Nikita, this will give you a detailed view of how it can be done:

#!/usr/bin/perl
use strict;
use warnings;
use Win32::OLE;
use Win32::OLE::Enum;
use Win32::OLE::Const 'Microsoft Word';
#$Win32::OLE::CP = CP_UTF8;
binmode STDOUT, ':encoding(UTF-8)';

# OPEN FILE SPECIFIED AS FIRST ARGUMENT
my $fname=$ARGV[0]; 
my $fnameFullPath = `cygpath.exe -wa $fname`;
$fnameFullPath =~ s/\\/\\\\/g;
$fnameFullPath =~ s/\s*$//;
unless (-e $fnameFullPath) { print "Error: File does not exist\n"; exit 1;}

# STARTING OLE
my $Word = Win32::OLE->GetActiveObject('Word.Application')
    || Win32::OLE->new('Word.Application','Quit')
    or die Win32::OLE->LastError();

$Word->{'Visible'} = 0;
my $doc = $Word->Documents->Open($fnameFullPath);
my $paragraphs = $doc->Paragraphs() ;
my $enumerate = Win32::OLE::Enum->new($paragraphs);

# PROCESSING PARAGRAPHS
while(defined(my $paragraph = $enumerate->Next())) {

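    my $text = $paragraph->{Range}->{Text};
    # NOTE: this is the font at the current selection (the cursor), not the
    # font of $paragraph; $paragraph->{Range}->{Font} gives per-paragraph values
    my $sel = $Word->Selection;
    my $font = $sel->Font;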

    if ($font->{Size} == 18){
        print "Text: ", $text, "\n";
        print "Font Bold: ", $font->{Bold}, "\n";
        print "Font Italic: ", $font->{Italic}, "\n";
        print "Font Name: ", $font->{Name}, "\n";
        print "Font Size: ", $font->{Size}, "\n";
        print "=========\n";
    }
}

# CLOSING OLE
$Word->ActiveDocument->Close ;
$Word->Quit;

The output will look like this:

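Text: This is a doc file containing different fonts and sizes, the document also contains headers and footers (Font: TNR, size: 18)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========
Text: This is a Perl example (font TNR, size: 12)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========
Text: This is a Python example (Font: Courier New, size: 10)
Font Bold: 0
Font Italic: 0
Font Name: Times New Roman
Font Size: 18
=========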
answered 2013-03-21T13:55:16.387
3

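I would need more information to help you identify the words you need to process. In my example, I simply search for the text "Some", which appears in my *.docx file.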

#!/usr/bin/perl

use Modern::Perl;
use File::Spec;
use Win32::OLE qw(in with);
use Win32::OLE::Variant;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;    # croak on any OLE error

print "Starting Word\n";

# Attach to a running Word instance, or start a new one
my $Word = Win32::OLE->GetActiveObject('Word.Application')
    || Win32::OLE->new('Word.Application');
$Word->{'Visible'}     = 1;
$Word->{DisplayAlerts} = 0;

# Word resolves relative paths against its own working directory,
# so pass an absolute path
my $File = $Word->Documents->Open( File::Spec->rel2abs('./fonts.docx') )
    or die Win32::OLE->LastError;

$Word->Selection->HomeKey(wdStory);

$Word->Selection->Find->{'Text'} = 'Some';

$Word->Selection->Find->Execute();

say "Font size: [", $Word->Selection->Font->Size(), "]";
say "Font name: [", $Word->Selection->Font->Name(), "]";

$Word->Quit;
answered 2013-03-21T13:57:45.523
0

Try OLE automation; the Win32::OLE module is helpful for this. This approach requires a deeper knowledge of the Word OLE API.

answered 2013-03-21T12:34:30.753