1

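I am using the program below to sort email messages and eventually print them. Some messages may contain attachments or HTML code that do not print well. Is there a simple way to strip the attachments and the HTML markup from a message without losing the text that the HTML formats?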

#!/usr/bin/perl
use warnings;
use strict;
use Mail::Box::Manager;

open(MYFILE, '>>', 'data.txt')
    or die "data.txt: Unable to open: $!\n";
binmode(MYFILE, ':encoding(UTF-8)');


my $file = shift || $ENV{MAIL};
my $mgr = Mail::Box::Manager->new(
    access          => 'r',
);

my $folder = $mgr->open( folder => $file )
or die "$file: Unable to open: $!\n";

for my $msg ( sort { $a->timestamp <=> $b->timestamp } $folder->messages)
{
    my $to          = join( ', ', map { $_->format } $msg->to );
    my $from        = join( ', ', map { $_->format } $msg->from );
    my $date        = localtime( $msg->timestamp );
    my $subject     = $msg->subject;
    my $body        = $msg->decoded->string;

    # Strip all quoted text
    $body =~ s/^>.*$//msg;

    print MYFILE <<"";
From: $from
To: $to
Date: $date
Subject: $subject
\n
$body

}

4 Answers

3

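Mail::Message::isMultipart tells you whether a given message is multipart, which is the case for any message that carries attachments. Mail::Message::parts gives you a list of the message's parts.

So: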

if ( $msg->isMultipart ) {
    foreach my $part ( $msg->parts ) {
        if ( $part->contentType eq 'text/html' ) {
           # deal with html here.
        }
        elsif ( $part->contentType eq 'text/plain' ) {
           # deal with text here.
        }
        else {
           # well?
        }
    }
}
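
One way to fill in the branches above, as a minimal sketch rather than a definitive implementation: keep only the text/plain parts and skip everything else. The helper name plain_text_of is hypothetical (it is not part of Mail::Box). The sketch assumes that attachments never declare themselves as text/plain; a text attachment would be included too.

```perl
use strict;
use warnings;
use Mail::Message;

# Hypothetical helper (not part of Mail::Box): return the decoded text/plain
# parts of a message joined together, skipping HTML parts and attachments.
sub plain_text_of {
    my ($msg) = @_;

    # parts('RECURSE') descends into nested multiparts; for a message that
    # is not multipart it returns the message itself.
    my @plain = grep { lc( $_->contentType ) eq 'text/plain' }
                $msg->parts('RECURSE');

    return join "\n", map { $_->decoded->string } @plain;
}
```

In the question's loop, `my $body = plain_text_of($msg);` could stand in for `$msg->decoded->string`.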
answered 2008-12-16T12:43:18.883
1

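Stripping HTML is covered in FAQ #9 (or the first item in perldoc -q html). In short, the relevant modules are HTML::Parser and HTML::FormatText.

As for attachments, email with attachments is sent as MIME. From this example you can see that the format is quite simple, so you could fairly easily come up with a solution yourself, or check the MIME modules on CPAN.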
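
A minimal sketch of that approach, assuming HTML::TreeBuilder (which is built on HTML::Parser) and HTML::FormatText are installed from CPAN. The helper name html_to_text and the margin values are illustrative choices, not something the FAQ prescribes.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: render an HTML string as plain text, keeping the
# words and paragraph breaks but dropping the tags.
sub html_to_text {
    my ($html) = @_;

    my $tree      = HTML::TreeBuilder->new_from_content($html);
    my $formatter = HTML::FormatText->new( leftmargin => 0, rightmargin => 72 );
    my $text      = $formatter->format($tree);

    $tree->delete;    # release the parse tree explicitly
    return $text;
}
```

The returned string can be printed in place of the raw HTML body of a message.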

answered 2008-12-16T12:39:13.510
0

It looks like someone has already solved this problem on the linuxquestions forum.

From the forum:

# Mail::POP3Client call: fetch the headers and body of POP3 message $i
$body = $connection->HeadAndBody($i);
# Parse the message with MIME::Parser; the result is a MIME entity
$msg = $parser->parse_data($body);
# Count the parts to tell a multipart MIME message from plain text
$num_parts = $msg->parts;
if ($num_parts == 0) {
    # No parts, so this is plain text: fetch the body via POP3Client
    $message = $connection->Body($i);
    # Escape or drop characters before storing the text in MySQL
    $message =~ s/</&lt;/g;
    $message =~ s/>/&gt;/g;
    $message =~ s/'//g;
}
else {
    # Multipart MIME: take the first part (usually the plain text) as a string
    $message = $msg->parts(0)->bodyhandle->as_string;
}
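
The forum code assumes the first part is always the plain text, which does not hold when parts are nested (for example multipart/alternative inside multipart/mixed). A hedged variation, assuming MIME::Parser from the MIME-tools distribution, walks every entity and returns the first text/plain body it finds. The helper name first_plain_text is hypothetical.

```perl
use strict;
use warnings;
use MIME::Parser;

# Hypothetical helper: parse a raw message and return the body of the first
# text/plain entity, or nothing if the message has no such part.
sub first_plain_text {
    my ($raw) = @_;

    my $parser = MIME::Parser->new;
    $parser->output_to_core(1);    # keep decoded bodies in memory
    my $entity = $parser->parse_data($raw);

    # parts_DFS lists the entity itself and all nested parts, depth first.
    for my $part ( $entity->parts_DFS ) {
        next unless $part->effective_type eq 'text/plain';
        my $body = $part->bodyhandle or next;    # multipart containers have none
        return $body->as_string;
    }
    return;
}
```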
answered 2008-12-16T12:46:59.780
0

There is a complete example in the Mail-Box-2.117 Perl distribution:

http://cpansearch.perl.org/src/MARKOV/Mail-Box-2.117/examples/strip-attachments.pl

answered 2014-11-26T23:33:16.887