I am parsing emails with Zend_Mail and, strangely, some of the content gets truncated for no apparent reason, which leaves the email parts malformed.

For example,

Content-Disposition: attachment; filename="file.sdv"

DQogICAgICBTT05FO0xBTkRJTkdTREE7U0FMR1NEQVRPIDtOQVNKIDtSRURTS0FQICAgICAgICAg
ICAgIDsgRklTS0VTTEFHO1BSRVNFUlYgICA7ICBUSUxTVEFORDsgU1TYUlJFTFNFOyAgS1ZBTElU
RVQ7T01TVFlQRSAgO01JTlNURVBSSVM7ICAgICBWRVJESTsgICBLVkFOVFVNOyAgUlVORFZFS1Qg
IA0KLS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS07LS0tLS0tLS0tLS0tLS0t
LS0tLS07LS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS0tLTstLS0tLS0t
LS0tOy0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS0tLTstLS0tLS0tLS0t
ICANCiAgICAgICAgIDA7MjAxMC4wOS4wODsyMDEwLjA5LjA4O05vcnNrO0dhcm4gICAgICAgICAg
ICAgICAgOyAgICAgIDEwMjE7RkVSU0sgICAgIDsgICAgICAgMjEwOyAgIDQwMjA5OTk7ICAgICAg
ICAyMDtFZ2Vub3ZlcnQ7ICAgICAgICAgIDsgICAzMDcyLDE2OyAgICAgICAyMTE7ICAgICAyNTMs
MiAgDQogICAgICAgICAwOzIwMTAuMDkuMDg7MjAxMC4wOS4wODtOb3JzaztHYXJuICAgICAgICAg

is truncated to

Content-Disposition: attachment; filename="file.sdv"

DQogICAgICBTT05FO0xBTkRJTkdTREE7U0FMR1NEQVRPIDtOQVNKIDtSRURTS0FQICAgICAgICAg
ICAgIDsgRklTS0VTTEFHO1BSRVNFUlYgICA7ICBUSUxTVEFORDsgU1TYUlJFTFNFOyAgS1ZBTElU
RVQ7T01TVFlQRSAgO01JTlNURVBSSVM7ICAgICBWRVJESTsgICBLVkFOVFVNOyAgUlVORFZFS1Qg
IA0KLS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS07LS0tLS0tLS0tLS0tLS0t
LS0tLS07LS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS0tLTstLS0tLS0t
LS

A var_dump of each line shows this:

string(78) "DQogICAgICBTT05FO0xBTkRJTkdTREE7U0FMR1NEQVRPIDtOQVNKIDtSRURTS0FQICAgICAgICAg
"
string(78) "ICAgIDsgRklTS0VTTEFHO1BSRVNFUlYgICA7ICBUSUxTVEFORDsgU1TYUlJFTFNFOyAgS1ZBTElU
"
string(78) "RVQ7T01TVFlQRSAgO01JTlNURVBSSVM7ICAgICBWRVJESTsgICBLVkFOVFVNOyAgUlVORFZFS1Qg
"
string(78) "IA0KLS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS07LS0tLS0tLS0tLS0tLS0t
"
string(78) "LS0tLS07LS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS0tLTstLS0tLS0t
"
string(5) "LS)
"
string(17) "TAG5 OK Success
"    

Or, in another email:

DQogICAgICBTT05FO0xBTkRJTkdTREE7U0FMR1NEQVRPIDtOQVNKIDtSRURTS0FQICAgICAgICAg
ICAgIDsgRklTS0VTTEFHO1BSRVNFUlYgICA7ICBUSUxTVEFORDsgU1TYUlJFTFNFOyAgS1ZBTElU
RVQ7T01TVFlQRSAgO01JTlNURVBSSVM7ICAgICBWRVJESTsgICBLVkFOVFVNOyAgUlVORFZFS1Qg
IA0KLS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS07LS0tLS0tLS0tLS0tLS0t
LS0tLS07LS0tLS0tLS0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS0tLTstLS0tLS0t
LS0tOy0tLS0tLS0tLTstLS0tLS0tLS0tO

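I have no idea why it stops there. The transfer should only stop at the end of a line. This is the line that reads the string from the IMAP server: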

$line = @fgets($this->_socket);

The encoded text contains a similar string, but in different emails it is cut off at different points:

----------;----------;----------;-----;--------------------;----------;----------;--

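I tried passing a size to fgets(), with no result. I also enabled and disabled the "auto_detect_line_endings" php.ini setting, again with no result.

Although the error does not seem to be in the library, I also opened a bug report with ZF.

Do you see anything odd about this encoded string?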

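Update

Further investigation shows that the email is truncated after 584 characters. I still don't know why. I have also sent the question to Google. See here.

Headers of the faulty email: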

Delivered-To: email@removed.com
Received: by 10.216.3.208 with SMTP id 58cs248812weh;
    Fri, 20 Nov 2009 05:14:14 -0800 (PST)
Received: by 10.204.153.217 with SMTP id l25mr1285471bkw.108.1258722853863;
    Fri, 20 Nov 2009 05:14:13 -0800 (PST)
Return-Path: <>
Received: from MTX4.mbn1.net (mtx4.mbn1.net [213.188.129.252])
    by mx.google.com with SMTP id 2si1800716bwz.60.2009.11.20.05.14.12;
    Fri, 20 Nov 2009 05:14:13 -0800 (PST)
Received-SPF: pass (google.com: best guess record for domain of MTX4.mbn1.net designates         213.188.129.252 as permitted sender) client-ip=213.188.129.252;
Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of MTX4.mbn1.net designates 213.188.129.252 as permitted sender) smtp.mail=
Resent-From: <email@removed.com>
Content-Type: multipart/mixed; boundary="===============1703099044=="
MIME-Version: 1.0
From: <email@removed.com>
To: <email@removed.com>
CC:
Subject: some subject
Message-ID: <FLYNDRElQ080Gxw8Zw500000f46email@removed.com>
X-OriginalArrivalTime: 20 Nov 2009 13:14:08.0121 (UTC) FILETIME=[5792C690:01CA69E3]
Date: Fri, 20 Nov 2009 14:14:08 +0100
X-STA-Metric: 0 (engine=030)
X-STA-NotSpam: tlf: vedlagt skip:__ 40 fil cc:2**0
X-STA-Spam: header:MIME-Version: charset:us-ascii header:Subject:1 to:2**0 header:From:1
X-BTI-AntiSpam: score:0,sta:0/030,dnsbl:passed,sw:off,bsn:38/passed,spf:off,bsctr:passed/1,dk:off,pbmf:none,ipr:0/3,trusted:no,ts:no,bs:no,ubl:passed
X-Auto-Response-Suppress: DR, RN, NRN, OOF, AutoReply
Resent-Message-Id: <19740416124736.CF5804B33EF632B0email@removed.com>
Resent-Date: Fri, 20 Nov 2009 14:14:11 +0100 (CET)

--===============1703099044==
Content-Type: application/octet-stream
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="file.sdv"

DQpHUlVQUEVOQVZOICAgICAgICAgIDtLSthQRTtQUk9EQU5MO1BBS0tFTlI7TU9UVEFLTkFWTiAg
ICAgICAgICAgICAgICAgICAgO1NPTjtMQU5ESU5HU0RBO1NBTEdTREFUTyA7TkFTSiA7UkVEU0tB
UCAgIDtGSVNLRVNMQUcgO1BSRVNFUlYgICA7VElMU1RBTkQ7U1TYUlJFTFM7S1ZBTElURVQ7TUlO
U1RFUFJJUzsgICAgICAgIFZFUkRJOyAgICAgS1ZBTlRVTTsgICAgUlVORFZFS1QgICAgDQotLS0t
LS0tLS0tLS0tLS0tLS0tLTstLS0tLTstLS0tLS0tOy0tLS0tLS07LS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tOy0tLTstLS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS07LS0tLS0tLS0tLTst
LS0tLS0tLS0tOy0tLS0tLS0tLS07LS0tLS0tLS07LS0tLS0tLS07LS0tLS0tLS07LS0tLS0tLS0t
LTstLS0tLS0tLS0tLS0tOy0tLS0tLS0tLS0tLTstLS0tLS0tLS0tLS0gICAgDQpMb3JlbnR6ZW4g
....

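For those who are interested in the answer rather than the (former) bounty, here is another clue.

Gmail returns a value that is too small in response to RFC822.SIZE, which can cause the message to be truncated. (The value is off by one byte per header line, apparently because it does not count both characters of the CR/LF.)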
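As a rough illustration of that discrepancy, the sketch below compares the real octet count of a raw message with the count obtained when each header line ending is counted as one byte instead of two. The helper name `sizeShortfall` and the sample message are made up for illustration, and the one-byte-per-header-line rule is only the observation stated above, not documented server behaviour.

```php
<?php
// Hypothetical helper (not part of Zend_Mail): estimates how far a reported
// size would fall short if each header line ending were counted as 1 byte.
function sizeShortfall(string $rawMessage): array
{
    // The header block ends at the first empty line (CRLF CRLF).
    $headerEnd = strpos($rawMessage, "\r\n\r\n");
    $headers = $headerEnd === false
        ? $rawMessage
        : substr($rawMessage, 0, $headerEnd + 2);

    $actual = strlen($rawMessage);
    $headerLines = substr_count($headers, "\r\n");

    return [
        'actual'   => $actual,
        // Assumption taken from the observation above: 1 byte lost per header line.
        'reported' => $actual - $headerLines,
        'missing'  => $headerLines,
    ];
}

// Made-up sample message with three header lines.
$raw = "From: a@example.com\r\nTo: b@example.com\r\nSubject: test\r\n\r\nBody line\r\n";
print_r(sizeShortfall($raw));
```

A client that trusts the smaller number as the length of the message would stop reading that many bytes early, which would look like a payload cut off in the middle of a line.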

5 Answers

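I think you are looking in the wrong place.

The IMAP server gives you a truncated mail message and then returns its status line, TAG5 OK Success.

I don't see how your socket handling (or PHP's) could make a few kB of the stream disappear and then magically bring the stream back in sync just in time for this status line.

So either the message itself is truncated (have you verified the message content by some other means?) or the IMAP server is simply broken.

The first things I would do:

  • Find a quiet enough environment for your project, where you can verify the socket interaction by attaching strace -f -s 10240 -p <pid> to the Apache process (assuming a Linux/Apache environment)
  • and/or: use tcpdump, ethereal (now Wireshark), or an equivalent to inspect what actually goes over the wire

My guess is that you will see exactly the same truncated string on the wire. That means you can turn your attention to the IMAP server.

Making sure you are looking in the right place can save a lot of time.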
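If a wire trace is not practical, a similar check can be approximated in PHP by comparing the literal size the server announces (`{N}` at the end of the FETCH line) with the number of bytes that arrive before the tagged status line. This is only a sketch: `inspectLiteral` is a hypothetical helper, the sample response is invented, and it assumes no payload line starts with the tag (true for base64, not for arbitrary text).

```php
<?php
// Hypothetical diagnostic: $stream is any readable stream positioned at the
// start of a FETCH response; $tag is the command tag, e.g. "TAG5".
function inspectLiteral($stream, string $tag): array
{
    $first = fgets($stream);
    // An IMAP literal is announced as {N} at the end of a line.
    if ($first === false || !preg_match('/\{(\d+)\}\r?\n$/', $first, $m)) {
        throw new RuntimeException('no literal announced in: ' . var_export($first, true));
    }
    $declared = (int) $m[1];

    $received = '';
    while (($line = fgets($stream)) !== false) {
        if (strpos($line, $tag . ' ') === 0) {
            break; // tagged status line: the response is over
        }
        $received .= $line;
    }
    // Strip the ")" and line ending that close the FETCH response.
    $payload = preg_replace('/\)\r?\n$/', '', $received);

    return ['declared' => $declared, 'received' => strlen($payload)];
}

// Invented response shaped like the var_dump output in the question.
$stream = fopen('php://memory', 'r+');
fwrite($stream, "* 1 FETCH (BODY[2] {12}\r\nLS0tLS07\r\nLS)\r\nTAG5 OK Success\r\n");
rewind($stream);
print_r(inspectLiteral($stream, 'TAG5'));
```

If the two numbers agree while the payload is still visibly cut short, the server announced the short length itself, which would point away from the PHP socket code.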

answered 2011-03-25T21:17:58.840

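1: Try removing the @ to get more details.

2: Try using fread (http://www.php.net/manual/en/function.fread.php) instead of fgets.

This may well be related to the IMAP server, because I see TAG5 OK Success as the response even though it should not be there yet.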
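Both suggestions can be combined in a small sketch: read a known number of bytes with fread in a loop, without the @ operator, and fail loudly if the stream ends early. `readExactly` is a hypothetical helper and the in-memory stream only stands in for the real socket.

```php
<?php
// Hypothetical helper: read exactly $length bytes from $stream, or throw.
function readExactly($stream, int $length): string
{
    $data = '';
    while (strlen($data) < $length) {
        // fread may return fewer bytes than requested on network streams,
        // so keep asking for the remainder. No @, so warnings stay visible.
        $chunk = fread($stream, $length - strlen($data));
        if ($chunk === false || $chunk === '') {
            throw new RuntimeException(sprintf(
                'stream ended after %d of %d bytes',
                strlen($data),
                $length
            ));
        }
        $data .= $chunk;
    }
    return $data;
}

// Stand-in for the socket: two payload lines followed by the closing ")".
$stream = fopen('php://memory', 'r+');
fwrite($stream, "LS0tLS07\r\nLS)\r\n");
rewind($stream);
var_dump(readExactly($stream, 12));
```

On a real socket an empty read can also mean a timeout rather than the end of the data, so the exception message here is deliberately vague about the cause.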

answered 2011-03-25T12:40:26.393

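Have you tried issuing another fgets to see whether you get the rest of the data? You may be retrieving a multipart email that needs more than one request.

In any case, you are using a function that was designed for file access and is merely usable over the network. Usually that works fine, but depending on the network there can be problems. For example, you can use file_get_contents to retrieve a web page, but if the page issues a redirect it can fail, whereas curl tends to be more successful.

If you really want to read from a network socket, you should try socket_read. Like curl, it was designed with networking in mind.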

answered 2011-03-20T00:51:18.250

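I don't know Zend and have forgotten my PHP, but I have worked with MIME and HTTP before (in C++).

I suggest you start by looking for a way to add a Content-Length header entry. It gives the "message decoder/loader" a hint about how much content (message payload) to expect. (I am not sure whether IMAP does this.)

In the code above, I would try to persuade fgets to read a specific, expected amount of data from the network. It may be that the data is buffered or has not yet been sent over the network (asynchronous communication), and that fgets only reads the internal buffer and therefore stops before the whole message has been read.

  • To see whether that is the case, send a small message that stays below the "584 breakpoint".
  • Do some network tracing to see whether all of the data really arrives. (You may need to set something up locally for this.)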
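The buffering hypothesis can also be probed from PHP by recording why each read stopped. The sketch below wraps fgets and reports whether the line ended with a newline, together with the timeout and end-of-file flags from stream_get_meta_data. `readLineWithStatus` is a hypothetical name, and the in-memory stream only stands in for the IMAP socket.

```php
<?php
// Hypothetical wrapper around fgets(): returns the line plus the state of the
// stream, so a short line can be told apart from a timeout or end of stream.
function readLineWithStatus($stream): array
{
    $line = fgets($stream); // no @, so warnings stay visible
    $meta = stream_get_meta_data($stream);

    return [
        'line'      => $line === false ? null : $line,
        // A complete line ends with "\n"; anything else was cut short.
        'complete'  => $line !== false && substr($line, -1) === "\n",
        'timed_out' => $meta['timed_out'],
        'eof'       => $meta['eof'],
    ];
}

// Stand-in data: one full line followed by a fragment without a line ending.
$stream = fopen('php://memory', 'r+');
fwrite($stream, "LS0tLS07\r\nLS");
rewind($stream);
var_dump(readLineWithStatus($stream));
var_dump(readLineWithStatus($stream));
```

In the question's var_dump every line, including the final "LS)" one, ends with a line break, which suggests fgets itself returned complete lines and the data was already short when it arrived.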

The code you are referring to is here.

answered 2011-03-20T01:21:58.633

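Most likely one of your server's hardware components is faulty, so you may want to replace the machine entirely or just swap the RAM modules or disk drives. I have some experience with web- and mail-related coding, and I can confirm that base64-encoded strings are very safe. After all, base64 is just a simple binary-to-text mapping.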
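Whether a given base64 body is intact can be checked directly. Padded MIME base64 always has a length that is a multiple of 4 once the line breaks are removed, while the truncated example in the question has five 76-character lines plus "LS", that is 382 characters. The sketch below applies that test and decodes the usable prefix; `inspectBase64` is a hypothetical helper and the sample strings are shortened stand-ins for the real payload.

```php
<?php
// Hypothetical check: does a base64 body look complete, and what does the
// intact part decode to?
function inspectBase64(string $body): array
{
    // Drop the line wrapping that MIME inserts every 76 characters.
    $compact = preg_replace('/\s+/', '', $body);
    $length = strlen($compact);

    // Only whole groups of 4 characters are decoded here.
    $usable = substr($compact, 0, $length - $length % 4);

    return [
        'length'         => $length,
        'whole_quanta'   => $length % 4 === 0,
        // Strict mode returns false on characters outside the alphabet.
        'decoded_prefix' => base64_decode($usable, true),
    ];
}

// Shortened stand-in for the truncated payload: one group pair, then "LS".
print_r(inspectBase64("LS0tLS07\r\nLS"));
```

For this sample the intact part decodes to "-----;", the start of the dashed separator row quoted in the question, and the dangling "LS" shows up as a length that is not a multiple of 4.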

answered 2011-03-25T10:38:32.417