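Gmail's IMAP extension X-GM-RAW lets me run a search as long as the query string is ASCII. If the query contains UTF-8 characters, the IMAP server returns an error response.
How should I encode a UTF-8 input string so that an X-GM-RAW search works? I don't want to lose the flexibility of searching specific fields such as "subject" or "rfc822msgid".
Thanks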
Specify CHARSET UTF-8 and send the UTF-8 search term as a literal. For example, to search for 你好, which is 6 bytes long when encoded as UTF-8:
A SEARCH CHARSET UTF-8 X-GM-RAW {6}
+ go ahead
你好
* SEARCH 15
A OK SEARCH completed (Success)
In this example, you actually send the 6 bytes of UTF-8 encoded 你好 on the third line.
This works for any SEARCH key that accepts a string, including SUBJECT and HEADER MESSAGE-ID.
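As a rough sketch of the exchange above, the following Python builds the two chunks a client would put on the wire. The helper name build_utf8_search is hypothetical and not part of any library; the code only constructs bytes and does not talk to a server. The point it illustrates is that the number in braces is the byte length of the UTF-8 encoding, not the character count.

```python
def build_utf8_search(tag, query):
    """Return (command_line, literal_line) for a CHARSET UTF-8 X-GM-RAW search.

    Hypothetical helper for illustration only.
    """
    payload = query.encode("utf-8")
    # The literal length counts bytes, not characters: 你好 is 2 characters but 6 bytes.
    command_line = (
        f"{tag} SEARCH CHARSET UTF-8 X-GM-RAW {{{len(payload)}}}\r\n"
    ).encode("ascii")
    # This part is sent only after the server answers with a "+" continuation.
    literal_line = payload + b"\r\n"
    return command_line, literal_line


command_line, literal_line = build_utf8_search("A", "你好")
print(command_line)
print(literal_line)
```

Since Gmail search operators are part of the raw query string, a field-scoped query such as subject:你好 would presumably be passed the same way, with the length covering the whole string.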
IMAP isn't 8-bit clean, so it uses several different encodings to represent 8-bit data.
For things like folder and label names, IMAP4 uses modified UTF-7. Conveniently, ASCII data encoded in modified UTF-7 encodes as itself (except "&", which is written as "&-"), so normally nothing special needs to be done.
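To make the folder-name encoding concrete, here is a minimal sketch of a modified UTF-7 encoder following the rules in RFC 3501 section 5.1.3. The function name encode_modified_utf7 is my own, and this is an illustration rather than a replacement for a tested library implementation.

```python
import base64


def encode_modified_utf7(text):
    """Encode a mailbox name as IMAP modified UTF-7 (illustrative sketch)."""
    out = []
    pending = []  # run of characters that must be base64-encoded

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            # Modified base64: no "=" padding, and "," instead of "/".
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=").replace("/", ",")
            out.append("&" + b64 + "-")
            pending.clear()

    for ch in text:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # Printable ASCII stands for itself, except "&" which becomes "&-".
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


print(encode_modified_utf7("INBOX"))
print(encode_modified_utf7("Entwürfe"))
```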
For message headers (including subjects), non-ASCII text is encoded as MIME encoded-words.
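For illustration, Python's standard email.header module can produce and read this header encoding. The two wrapper names below are hypothetical; the sketch only shows what a header value looks like before and after encoding.

```python
from email.header import Header, decode_header, make_header


def encode_header_value(text):
    """Encode text as a MIME encoded-word using UTF-8 (illustrative wrapper)."""
    return Header(text, "utf-8").encode()


def decode_header_value(raw):
    """Decode a header value that may contain MIME encoded-words."""
    return str(make_header(decode_header(raw)))


encoded = encode_header_value("你好")
print(encoded)
print(decode_header_value(encoded))
```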
And finally, attachments are generally encoded as either Base64 or Quoted-Printable.
My best guess is that Gmail uses modified UTF-7 for its X-GM-RAW queries. The best reference implementation of modified UTF-7 I've found is in the IMAPClient Python library.
Hope this helps!