I'm trying to decode an email subject header.
This is what I'm doing (the regex is there to insert a space between the two = signs):
import re
import email.header

header = '=?iso-8859-1?B?TU9UT1IubmwgbmlldXdzYnJpZWYgPiBOaWV1d2UgdmVya29vcHRvcHBl?==?iso-8859-1?B?ciBTdXp1a2kg?='
header = re.sub(r"(==)(?!$)", u"\0= =", header)
email.header.decode_header(header)
But this raises a HeaderParseError:
HeaderParseError Traceback (most recent call last)
/home/leon/<ipython console> in <module>()
/usr/lib/python2.7/email/header.pyc in decode_header(header)
106 # now we throw the lower level exception away but
107 # when/if we get exception chaining, we'll preserve it.
--> 108 raise HeaderParseError
109 if dec is None:
110 dec = encoded
Interestingly, if I copy the output of re.sub() to the clipboard and run the following:
email.header.decode_header('=?iso-8859-1?B?TU9UT1IubmwgbmlldXdzYnJpZWYgPiBOaWV1d2UgdmVya29vcHRvcHBl?= =?iso-8859-1?B?ciBTdXp1a2kg?=')
it works!
So I guess something is wrong with the encoding of the re.sub() output, but I don't know how to fix it.
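One way to narrow this down is to compare the repr() of the re.sub() result with the pasted literal. The sketch below assumes the difference comes from the "\0" in the non-raw replacement string, which Python reads as a NUL character rather than as a reference to the match; that character would presumably get lost when going through the clipboard. The names `broken` and `fixed` are only for illustration.

```python
import re
import email.header

header = ('=?iso-8859-1?B?TU9UT1IubmwgbmlldXdzYnJpZWYgPiBOaWV1d2UgdmVya29vcHRvcHBl?='
          '=?iso-8859-1?B?ciBTdXp1a2kg?=')

# Replacement from the question: in a non-raw string "\0" is a NUL character,
# so every match of "==" becomes NUL + "= =".
broken = re.sub(r"(==)(?!$)", u"\0= =", header)
print(repr(broken))   # the repr makes the hidden NUL visible as \x00

# Assumed intent: just put a space between the two "=" signs.
fixed = re.sub(r"(==)(?!$)", "= =", header)
print(repr(fixed))

# The version without the NUL is the same string as the pasted literal.
print(email.header.decode_header(fixed))
```

If the repr() of `broken` really shows a `\x00` before the `= =`, that would explain why the pasted string behaves differently from the one produced in code.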