How can I force the urllib2/requests modules to use a relative path instead of the full/absolute URL?

When I send a request with urllib2/requests, I can see in my proxy that it is resolved to:

GET https://xxxx/path/to/something HTTP/1.1

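Unfortunately, the server I am sending it to does not understand this request and answers with a strange 302. I know this form is in the RFC, but it simply does not work here, and I am trying to work around it in Python code. I do not have access to the server.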

A relative path works fine:

GET /path/to/something HTTP/1.1
Host: xxxx

So how can I force requests/urllib2 to use a plain relative path instead of the absolute URL?

2 Answers

The following code might suit your case:

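try:
    from urlparse import urljoin  # Python 2
except ImportError:
    from urllib.parse import urljoin  # Python 3
import requests

class RelativeSession(requests.Session):
    """Session that resolves every request URL against a fixed base URL."""

    def __init__(self, base_url):
        super(RelativeSession, self).__init__()
        self.__base_url = base_url

    def request(self, method, url, **kwargs):
        # Join the (possibly relative) URL onto the base before delegating.
        url = urljoin(self.__base_url, url)
        return super(RelativeSession, self).request(method, url, **kwargs)

session = RelativeSession('http://server.net')
response = session.get('/rel/url')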
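
To see what this session actually requests, it may help to look at how `urljoin` resolves different inputs. The snippet below is a small sketch; the base URL with an `/api/` prefix and the `other.example` host are made up for illustration and are not part of the answer above.

```python
from urllib.parse import urljoin  # on Python 2: from urlparse import urljoin

base = "http://server.net/api/"  # hypothetical base URL

# A leading slash replaces the whole path of the base.
rooted = urljoin(base, "/rel/url")

# No leading slash: resolved relative to the base's last path segment.
nested = urljoin(base, "items")

# An absolute URL ignores the base entirely.
foreign = urljoin(base, "http://other.example/x")

print(rooted)   # http://server.net/rel/url
print(nested)   # http://server.net/api/items
print(foreign)  # http://other.example/x
```

Note that the result is always a full URL, so this approach only shortens what the caller types. It does not by itself change the request line that goes out on the wire.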
answered 2015-02-05T06:51:24.683

I think there is a bit of confusion here. As per RFC 2616, only an absolute path or an absolute URI is allowed in the HTTP request line. There is simply no such thing as a relative HTTP request, since HTTP is basically stateless.

In your question you are talking about a proxy, and the RFC states clearly that:

The absoluteURI form is REQUIRED when the request is being made to a proxy.

As such, AFAIK, your proxy is not HTTP/1.1 compliant. Is this a commercial product or an in-house development?
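
To make the two request-line forms concrete, here is a minimal sketch that builds them by hand from a URL. The helper names `request_line` and `host_header` are made up for this illustration; they are not part of requests or urllib2, and the code only formats strings, it sends nothing.

```python
from urllib.parse import urlsplit


def request_line(method, url, via_proxy=False):
    """Format an HTTP/1.1 request line for the given URL."""
    parts = urlsplit(url)
    if via_proxy:
        # absoluteURI form, as required when talking to a proxy
        target = url
    else:
        # absolute-path form, used when talking to the origin server
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return "{} {} HTTP/1.1".format(method, target)


def host_header(url):
    """Format the Host header that accompanies the absolute-path form."""
    return "Host: " + urlsplit(url).netloc


url = "https://xxxx/path/to/something"
print(request_line("GET", url, via_proxy=True))
print(request_line("GET", url))
print(host_header(url))
```

The first printed line corresponds to what the question shows in the proxy, and the last two correspond to the form the server reportedly accepts.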

By the way, HTTP 302 is a redirect. Are you sure the resource hasn't simply moved to another location?


Anyway, looking at the source code of requests (requests/models.py L276), I'm afraid there doesn't seem to be any easy way to force the use of an absolute path.

My best bet would be to change the PreparedRequest object before it is sent, as described in Advanced Usage / Prepared Requests.
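
A rough sketch of that prepared-request flow is shown below. It only demonstrates where a request can be inspected and adjusted before sending. Whether this is enough to change the request line seen by your proxy is something I have not verified. The host `example.invalid`, the helper `prepare_get`, and the `X-Example` header are placeholders made up for this sketch.

```python
import requests


def prepare_get(url):
    """Build a PreparedRequest for a GET without sending it."""
    session = requests.Session()
    request = requests.Request("GET", url)
    # prepare_request() merges session-level settings (headers, cookies, ...)
    prepared = session.prepare_request(request)
    return session, prepared


# example.invalid is a placeholder host, not taken from the question.
session, prepared = prepare_get("https://example.invalid/path/to/something")

print(prepared.url)       # the full URL stored on the prepared request
print(prepared.path_url)  # the path (plus query string, if any)

# The prepared request can be modified here, before anything is sent.
prepared.headers["X-Example"] = "1"  # hypothetical header, for illustration

# Sending is left commented out so the sketch performs no network I/O:
# response = session.send(prepared)
```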

answered 2013-06-11T14:38:34.907