
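The httplib.HTTPMessage and email.message.Message classes[1] both implement methods for RFC822 header parsing. Unfortunately, they have different implementations[2] and they do not provide the same level of functionality.

One example that is bugging me: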

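  • httplib.HTTPMessage is missing the get_filename method present in email.Message, which lets you easily retrieve the filename from a Content-disposition: attachment; filename="fghi.xyz" header;

  • httplib.HTTPMessage has the getparam, getplist and parseplist methods but, AFAIK, they are not and cannot be used outside of content-type header parsing;

  • email.Message has a generic get_param method to parse any RFC822 header with parameters, such as content-disposition or content-type.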
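
For reference, here is a minimal sketch of the email.message.Message behaviour described in the list above. The header value is the one quoted in the first bullet; the variable names are only illustrative.

```python
from email.message import Message

# Build a message carrying the Content-Disposition header quoted above.
msg = Message()
msg['Content-Disposition'] = 'attachment; filename="fghi.xyz"'

# get_filename() reads the "filename" parameter of Content-Disposition.
filename = msg.get_filename()

# get_param() is the generic form: any parameter of any parameterised header.
same_filename = msg.get_param('filename', header='content-disposition')
```

Both calls return the unquoted file name, which is exactly what httplib.HTTPMessage offers no direct way to obtain.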

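So I would like to have the get_filename or get_param methods of email.message.Message in httplib.HTTPMessage but, of course, I can't patch httplib.HTTPMessage itself since it lives in the standard library... :-q

And finally, here comes the decorating part... :-)

I succeeded in creating a monkeypatch_http_message function that decorates an httplib.HTTPMessage with the parsing methods I was missing: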

def monkeypatch_http_message(obj):
    from email import utils
    from email.message import (
        _parseparam,
        _unquotevalue,
    )
    cls = obj.__class__

    # methods **copied** from email.message.Message source code
    def _get_params_preserve(self, failobj, header): ...
    def get_params(self, failobj=None, header='content-type', 
                   unquote=True): ...
    def get_param(self, param, failobj=None, header='content-type', 
                  unquote=True): ...
    def get_filename(self, failobj=None): ...

    # monkeypatching httplib.Message
    cls._get_params_preserve = _get_params_preserve
    cls.get_params = get_params
    cls.get_param = get_param
    cls.get_filename = get_filename

Now I can do this:

import mechanize
from some.module import monkeypatch_http_message
browser = mechanize.Browser()

# in that form, browser.retrieve returns a temporary filename 
# and an httplib.HTTPMessage instance
(tmp_filename, headers) = browser.retrieve(someurl) 

# monkeypatch the httplib.HTTPMessage instance
monkeypatch_http_message(headers)

# yeah... my original filename, finally
filename = headers.get_filename()

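The problem here is that I copied the code of the decorating methods from the source class, which is something I would like to avoid.

So I tried decorating by referencing the source methods instead: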

def monkeypatch_http_message(obj):
    from email import utils
    from email.message import (
        _parseparam,
        _unquotevalue,
        Message    # XXX added
    )
    cls = obj.__class__

    # monkeypatching httplib.Message
    cls._get_params_preserve = Message._get_params_preserve
    cls.get_params = Message.get_params
    cls.get_param = Message.get_param
    cls.get_filename = Message.get_filename

But that gives me:

Traceback (most recent call last):
  File "client.py", line 224, in <module>
    filename = headers.get_filename()
TypeError: unbound method get_filename() must be called with Message instance as first argument (got nothing instead)

I'm scratching my head now... How can I decorate my class without copying the source methods?

Any suggestions? :-)

Regards,

George Martin


  1. In Python 2.6. I can't use 2.7 or 3.x in production.

  2. httplib.HTTPMessage inherits from mimetools.Message and rfc822.Message, while email.Message has its own implementation.

2 Answers

In Python 3.x, unbound methods are gone, so in this case you would just get the function object, and your second example would work:

>>> class C():
...   def demo(): pass
... 
>>> C.demo
<function demo at 0x1fed6d8>

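In Python 2.x, you can either access the underlying function through the unbound method, or retrieve it directly from the class dictionary (thus bypassing the normal lookup process that turns it into an unbound method):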

>>> class C():
...   def demo(): pass
... 
>>> C.demo.im_func                  # Retrieve it from the unbound method
<function demo at 0x7f463486d5f0>
>>> C.__dict__["demo"]              # Retrieve it directly from the class dict
<function demo at 0x7f463486d5f0>

The latter approach has the advantage of being forward compatible with Python 3.x.
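
As a rough illustration of the class-dictionary approach, the sketch below borrows a method from one class and attaches it to an unrelated one. Source, Target and describe are hypothetical names made up for this example; they are not part of httplib or email.

```python
class Source(object):
    def describe(self):
        # Reports the class of whatever instance the method ends up bound to.
        return "called on a %s instance" % type(self).__name__


class Target(object):
    pass


# Looking the method up in the class dict yields the plain function,
# not an unbound method tied to Source.
plain_function = Source.__dict__["describe"]

# Attaching the plain function to Target lets it bind to Target instances.
setattr(Target, "describe", plain_function)

message = Target().describe()
```

Because the lookup never goes through attribute access on Source, the same lines should behave identically on Python 2.x and 3.x.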

answered 2011-02-24T21:50:03.587

@ncoghlan: I can't put indented code in a comment, so here it is again:

def monkeypatch_http_message(obj):
    import httplib
    assert isinstance(obj, httplib.HTTPMessage)
    cls = obj.__class__

    from email import utils
    from email.message import (_parseparam, _unquotevalue, Message)
    funcnames = ('_get_params_preserve', 'get_params', 'get_param', 'get_filename')
    for funcname in funcnames:
        cls.__dict__[funcname] = Message.__dict__[funcname]
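
One caveat about the loop above: writing into cls.__dict__ only works when the class dictionary is a real dict, as it is for classic Python 2 classes. New-style classes expose a read-only proxy instead, so a setattr-based variant is more portable. The sketch below is written against Python 3; borrow_methods and DictHeaders are hypothetical names introduced for illustration, and the example assumes the borrowed Message methods only need get() and __contains__ on the target, which is a private implementation detail of email.message that may change between versions.

```python
from email.message import Message


def borrow_methods(target_cls, source_cls, names):
    """Attach the plain functions behind source_cls methods to target_cls."""
    for name in names:
        # setattr works even where target_cls.__dict__ is a read-only proxy.
        setattr(target_cls, name, source_cls.__dict__[name])


class DictHeaders(object):
    """Hypothetical stand-in for a header container such as HTTPMessage."""

    def __init__(self, headers):
        # Store header names lower-cased for case-insensitive lookup.
        self._headers = dict((k.lower(), v) for k, v in headers.items())

    def __contains__(self, name):
        return name.lower() in self._headers

    def get(self, name, failobj=None):
        return self._headers.get(name.lower(), failobj)


borrow_methods(
    DictHeaders,
    Message,
    ('_get_params_preserve', 'get_params', 'get_param', 'get_filename'),
)

headers = DictHeaders(
    {'Content-Disposition': 'attachment; filename="fghi.xyz"'}
)
filename = headers.get_filename()
```

The borrowed methods run against the DictHeaders instance, so no code from email.message has to be copied.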

Thanks! :-)

answered 2011-02-25T09:06:32.493