
I have the Enron email dataset as a folder containing emails in the form of text files, and I want to extract the "body" part of these emails.

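The problem is that fields such as the sender's and recipient's addresses are labeled with headers like To: and From:, but the body does not start with any header. It simply begins after all the other fields have been specified.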

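Now, one text file can contain many bodies (in the case of an email thread/conversation). I want to extract the bodies from these files. Can the JavaMail API be used for this, and if so, how? This is just an offline dataset stored as text files on my hard drive, not on the internet.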

The files look like this:

Message-ID: <16159836.1075855377439.JavaMail.evans@thyme>
Date: Fri, 7 Dec 2001 10:06:42 -0800 (PST)
From: heather.dunton@enron.com
To: k..allen@enron.com
Subject: RE: West Position
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-From: Dunton, Heather </O=ENRON/OU=NA/CN=RECIPIENTS/CN=HDUNTON>
X-To: Allen, Phillip K. </O=ENRON/OU=NA/CN=RECIPIENTS/CN=Pallen>
X-cc: 
X-bcc: 
X-Folder: \Phillip_Allen_Jan2002_1\Allen, Phillip K.\Inbox
X-Origin: Allen-P
X-FileName: pallen (Non-Privileged).pst

 
Please let me know if you still need Curve Shift.

Thanks,
Heather
 -----Original Message-----
From: 	Allen, Phillip K.  
Sent:	Friday, December 07, 2001 5:14 AM
To:	Dunton, Heather
Subject:	RE: West Position

Heather,

Did you attach the file to this email?

 -----Original Message-----
From: 	Dunton, Heather  
Sent:	Wednesday, December 05, 2001 1:43 PM
To:	Allen, Phillip K.; Belden, Tim
Subject:	FW: West Position

Attached is the Delta position for 1/16, 1/30, 6/19, 7/13, 9/21


 -----Original Message-----
From: 	Allen, Phillip K.  
Sent:	Wednesday, December 05, 2001 6:41 AM
To:	Dunton, Heather
Subject:	RE: West Position

Heather,

This is exactly what we need.  Would it possible to add the prior day for each of the dates below to the pivot table.  In order to validate the curve shift on the dates below we also need the prior days ending positions.

Thank you,

Phillip Allen

 -----Original Message-----
From: 	Dunton, Heather  
Sent:	Tuesday, December 04, 2001 3:12 PM
To:	Belden, Tim; Allen, Phillip K.
Cc:	Driscoll, Michael M.
Subject:	West Position


Attached is the Delta position for 1/18, 1/31, 6/20, 7/16, 9/24



 << File: west_delta_pos.xls >> 

Let me know if you have any questions.


Heather


2 Answers


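Please provide a sample file, preferably the most complex one you have. The job would be to programmatically open each file, parse its contents, and extract the body of the email. Where do you want to store the results, and which operating system are you running?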
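
For the "open, parse, extract" step, here is a minimal sketch in plain Java with no extra libraries. It relies on the rule that the header block of a message ends at the first empty line, so everything after that line is treated as the body. The class name `EnronBodyExtractor` and both method names are made up for illustration, and reading the file as ISO-8859-1 is an assumption (the sample declares us-ascii, which it covers).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any library.
class EnronBodyExtractor {

    /** Returns everything after the first blank line, which ends the header block. */
    static String extractBody(String rawMessage) {
        // Normalize line endings so the search works for CRLF and LF files alike.
        String text = rawMessage.replace("\r\n", "\n");
        int headerEnd = text.indexOf("\n\n");
        if (headerEnd < 0) {
            return ""; // no blank line found: headers only, no body
        }
        return text.substring(headerEnd + 2);
    }

    /** Reads one message file from disk and returns its body. */
    static String extractBody(Path file) throws IOException {
        // Assumption: ISO-8859-1 decoding, which never fails on unexpected bytes.
        byte[] bytes = Files.readAllBytes(file);
        return extractBody(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

To cover the whole dataset, the `Path` overload could be called for every regular file found with `Files.walk` under the dataset folder. The returned body still contains any quoted earlier messages of the thread.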

answered 2014-10-08T15:16:54.867

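If each file is a single message in MIME format, you can use the JavaMail MimeMessage constructor that takes an InputStream. You can then use the JavaMail API to extract the content of the message. See the JavaMail FAQ, javadocs, website, specification, and so on.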
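
Note that for a text/plain message like the sample in the question, the extracted content is the whole body as one string, quoted replies included, so splitting a thread into its individual messages is a separate step. Below is a rough sketch of that step in plain Java. It assumes every quoted message is introduced by an `-----Original Message-----` line followed by From:/Sent:/To:/Subject: lines and a blank line, as in the sample; other mail clients quote differently. `ThreadSplitter` and `extractBodies` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JavaMail.
class ThreadSplitter {

    // Marker line as seen in the sample, allowing surrounding spaces or tabs.
    private static final Pattern MARKER =
            Pattern.compile("(?m)^[ \\t]*-----Original Message-----[ \\t]*$");

    /** Returns the bodies in the thread, newest first. */
    static List<String> extractBodies(String body) {
        String text = body.replace("\r\n", "\n");
        String[] segments = MARKER.split(text);
        List<String> bodies = new ArrayList<>();
        for (int i = 0; i < segments.length; i++) {
            String segment = segments[i].trim();
            if (i > 0) {
                // Quoted segments open with From:/Sent:/To:/Subject: lines;
                // the text after the first blank line is the quoted body.
                int blank = segment.indexOf("\n\n");
                segment = blank < 0 ? "" : segment.substring(blank + 2).trim();
            }
            if (!segment.isEmpty()) {
                bodies.add(segment);
            }
        }
        return bodies;
    }
}
```

Applied to the body of the sample file, this would yield one entry per message in the conversation, with the most recent reply first.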

answered 2014-10-09T06:46:40.273