I am using a Perl script to convert HTML email to plain text.
My current code (for multipart messages) looks like this:
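use strict;
use warnings;
use MIME::Parser;
use MIME::Entity;
use HTML::TreeBuilder;
use HTML::FormatText;

my $parser = MIME::Parser->new;
my $entity = $parser->parse(\*STDIN) or die "parse failed\n";
for my $part ($entity->parts()) {
    if ($part->mime_type eq 'text/html') {
        my $bh = $part->bodyhandle;
        my $tree = HTML::TreeBuilder->new();
        # Without an argument utf8_mode only reads the flag; pass 1 to enable it
        # (this assumes the HTML body is raw UTF-8).
        $tree->utf8_mode(1);
        $tree->parse($bh->as_string);
        $tree->eof;    # signal end of input so the tree is complete
        my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
        my $txt = $formatter->format($tree);
        # Top => 0 marks this as a sub-part, so no MIME-Version header is added
        my $txtEntity = MIME::Entity->build(Data     => $txt,
                                            Type     => "text/plain",
                                            Encoding => "8bit",
                                            Top      => 0
        );
        # Inserts the new part at index 0; the HTML part stays in the message
        $entity->add_part($txtEntity, 0);
    }
}
$entity->print(\*STDOUT);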
It works, but it only adds the plain-text part alongside the existing parts instead of replacing the HTML part.
So I came up with this:
my $head = $entity->head;
my $txtEntity = MIME::Entity->build(Data     => $txt,
                                    Type     => "text/plain",
                                    Encoding => "8bit",
                                    From     => $head->get('From',0),
                                    To       => $head->get('To',0),
                                    Subject  => $head->get('Subject',0),
                                    Cc       => $head->get('Cc',0)
);
$txtEntity->print(\*STDOUT);
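But this may drop some of the email's header fields, since only From, To, Subject and Cc are copied. Is there a function that completely replaces the HTML body with plain text?
Thanks!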
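
One direction that might be worth trying, offered only as a sketch: build a new parts list and hand it back with `$entity->parts(\@new_parts)`, so the top-level header is never rebuilt. The helper name `replace_html_parts` and the demo message below are made up for illustration. The sketch assumes the HTML parts are direct children of the message and that their bodies are UTF-8.

```perl
use strict;
use warnings;
use MIME::Entity;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper: swap every direct text/html child of $entity for a
# text/plain part, leaving the top-level header and other parts untouched.
sub replace_html_parts {
    my ($entity) = @_;
    my @new_parts;
    for my $part ($entity->parts) {
        if ($part->mime_type eq 'text/html') {
            my $tree = HTML::TreeBuilder->new;
            $tree->utf8_mode(1);    # assumption: the body is raw UTF-8
            $tree->parse($part->bodyhandle->as_string);
            $tree->eof;
            my $txt = HTML::FormatText->new(leftmargin => 0, rightmargin => 72)
                                      ->format($tree);
            $tree->delete;
            # Top => 0: this is a sub-part, so no MIME-Version header
            push @new_parts, MIME::Entity->build(
                Type     => 'text/plain',
                Encoding => '8bit',
                Data     => $txt,
                Top      => 0,
            );
        }
        else {
            push @new_parts, $part;    # keep every other part as it is
        }
    }
    $entity->parts(\@new_parts);       # replace the parts list in place
    return $entity;
}

# Small in-memory demo message (made-up addresses and subject)
my $msg = MIME::Entity->build(
    Type    => 'multipart/mixed',
    From    => 'alice@example.com',
    To      => 'bob@example.com',
    Subject => 'demo',
);
$msg->attach(Type => 'text/html', Encoding => '8bit',
             Data => '<p>Hello <b>world</b></p>');

replace_html_parts($msg);
$msg->print(\*STDOUT);
```

In the script above, the entity would come from `$parser->parse(\*STDIN)` instead of the demo message. Nested multipart containers and single-part HTML messages are not handled here, and a multipart/alternative message that already carries a text/plain part would end up with two of them.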