Hi there!

I want to match all inline encodings in an email subject and build the subject string in UTF-8.

Some examples:

[Listname | Topic123] =?utf-8?Q?encodedtext?=
=?iso-8859-1?q?this=20is=20some=20text?=
Klartext-Betreff
[Listname | Topic123] =?utf-8?Q?encodedtext?= =?iso-8859-1?q?this=20is=20some=20text?=
=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=

I also received a mail that uses two different encodings (the last example above).

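In an email, the subject can also be split across multiple lines, where every line except the first starts with at least one whitespace character.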
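The folding rule described above can be undone before any decoding happens. This is a minimal sketch, assuming the raw subject value is already available as a string; `unfold_header` is a hypothetical helper name, not something from the question.

```php
<?php
// Hypothetical helper: undo header folding in a raw subject value.
function unfold_header(string $raw): string
{
    // Remove every line break that is directly followed by a space or tab.
    // The whitespace itself is kept, so the words stay separated.
    return preg_replace('/\r?\n(?=[ \t])/', '', $raw);
}

echo unfold_header("first line\r\n    second line");
```

A line break that is not followed by whitespace is left alone, since it would start a new header rather than continue the subject.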

So I am looking for a regular expression that parses:

Part+

where Part is one of:

  • text with whitespace
  • =?charset?encoding?encoded-text?=

I think it would look something like this:

ENC = (=\?)([A-Za-z0-9-]*)(\?)([A-Za-z0-9-]*)(\?)([Any Character]*)(\?=)
Part = ENC, or any run of characters that does not match ENC
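The Part+ grammar sketched above could be turned into a small tokenizer along these lines. This is only a sketch: `tokenize_subject` and the array keys are hypothetical names, and the pattern assumes that the encoded text contains neither `?` nor whitespace.

```php
<?php
// Hypothetical helper: split a subject into plain-text and encoded-word parts.
function tokenize_subject(string $subject): array
{
    // charset ? encoding ? encoded-text, wrapped in =? ... ?=
    $pattern = '/=\?([A-Za-z0-9-]+)\?([A-Za-z0-9-]+)\?([^?\s]*)\?=/';
    preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $parts = [];
    $offset = 0;
    foreach ($matches as $m) {
        [$whole, $start] = $m[0];
        if ($start > $offset) {
            // Everything between two encoded words is a plain-text part.
            $parts[] = ['type' => 'text', 'text' => substr($subject, $offset, $start - $offset)];
        }
        $parts[] = [
            'type'     => 'encoded',
            'charset'  => $m[1][0],
            'encoding' => strtoupper($m[2][0]),
            'text'     => $m[3][0],
        ];
        $offset = $start + strlen($whole);
    }
    if ($offset < strlen($subject)) {
        // Trailing plain text after the last encoded word.
        $parts[] = ['type' => 'text', 'text' => substr($subject, $offset)];
    }
    return $parts;
}

print_r(tokenize_subject('[Listname | Topic123] =?utf-8?Q?encodedtext?='));
```

A subject without any encoded word, such as `Klartext-Betreff`, comes back as a single text part.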

1 Answer

function decode ($string, $source_enc, $dest_enc)
{
    // Split on encoded words. Because of PREG_SPLIT_DELIM_CAPTURE the result
    // repeats in groups of four: plain text, charset, encoding, encoded text.
    $parts = preg_split (
        '/=\?([^?]+)\?([^?]+)\?([^?]+)\?=/', 
        $string, 
        -1, PREG_SPLIT_DELIM_CAPTURE);

    $result = "";

    for ($i = 0; $i < count ($parts); $i++)
    {
        $part = $parts [$i];

        if ($i % 4 == 0)
            // Plain text outside any encoded word.
            $result .= iconv ($source_enc, $dest_enc, $part);
        else
        {
            $charset = $parts [$i++];
            $encoding = strtoupper ($parts [$i++]);
            $text = $parts [$i];

            if ($encoding == 'Q')
                // In Q encoding an underscore stands for a space.
                $text = quoted_printable_decode (str_replace ('_', ' ', $text));
            else if ($encoding == 'B')
                $text = base64_decode ($text);

            $result .= iconv ($charset, $dest_enc, $text);
        }
    }

    return $result;
}

echo (decode ("=?utf-8?Q?encodedtext?= =?iso-8859-1?q?this=20is=20some=20text?=
=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=", 
    "ISO-8859-1", "ISO-8859-1"));

My output is:

encodedtext this is some text If you can read this yo u understand the example.
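
The gap in "yo u" comes from the whitespace between two adjacent encoded words being copied into the result. RFC 2047 says that whitespace which only separates two encoded words should be ignored. A possible variant that follows this rule is sketched below; `decode_subject` is a hypothetical name, the function only handles the B and Q encodings, and it assumes any plain text is already in the target encoding.

```php
<?php
// Hypothetical helper, a sketch rather than a complete RFC 2047 decoder.
function decode_subject(string $subject, string $dest_enc = 'UTF-8'): string
{
    $word = '=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=';

    // Drop whitespace (including folded line breaks) that sits between
    // two encoded words; whitespace next to plain text is kept.
    $subject = preg_replace('/(' . $word . ')\s+(?=' . $word . ')/', '$1', $subject);

    // Decode each encoded word and convert it to the target charset.
    return preg_replace_callback(
        '/=\?([^?\s]+)\?([BbQq])\?([^?\s]*)\?=/',
        function (array $m) use ($dest_enc): string {
            $text = strtoupper($m[2]) === 'B'
                ? base64_decode($m[3])
                // In Q encoding an underscore stands for a space.
                : quoted_printable_decode(str_replace('_', ' ', $m[3]));
            return iconv($m[1], $dest_enc, $text);
        },
        $subject
    );
}

echo decode_subject("=?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=\n"
    . "    =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=");
// prints: If you can read this you understand the example.
```

PHP also ships `iconv_mime_decode` and `mb_decode_mimeheader`, which may be worth trying before maintaining a hand-written decoder.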
answered 2013-03-04T04:48:58.173