python - Getting the value of the Location header with Python urllib2

Question

When I use urllib2 and list the headers, I don't see the 'Location' header.

In [19]: p = urllib2.urlopen('http://www.example.com')


In [21]: p.headers.items()
Out[21]: 
[('transfer-encoding', 'chunked'),
 ('vary', 'Accept-Encoding'),
 ('server', 'Apache/2.2.3 (CentOS)'),
 ('last-modified', 'Wed, 09 Feb 2011 17:13:15 GMT'),
 ('connection', 'close'),
 ('date', 'Fri, 25 May 2012 03:00:02 GMT'),
 ('content-type', 'text/html; charset=UTF-8')]

If I use telnet and send a GET request, the header is there:

telnet www.example.com 80
Trying 192.0.43.10...
Connected to www.example.com.
Escape character is '^]'.
GET / HTTP/1.0  
Host:www.example.com

HTTP/1.0 302 Found
Location: http://www.iana.org/domains/example/
Server: BigIP
Connection: close
Content-Length: 0

So, using urllib2, how do I get the value of the 'Location' header?

score 6 · Accepted Answer

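This is because urllib2 follows the Location header by default, so the final response will not have one. If you disable redirect following, you can see the Location header on the 301 and 302 responses themselves. See: How to prevent Python's urllib(2) from following redirects

Borrowing from there: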

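import urllib2

class NoRedirection(urllib2.HTTPErrorProcessor):
  # Hand every response back unchanged so 3xx responses are not followed.
  def http_response(self, request, response):
    return response
  https_response = http_response

opener = urllib2.build_opener(NoRedirection)
# The 302 response itself is returned, so its Location header is available.
location = opener.open('http://www.example.com').info().getheader('Location')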
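
For readers on Python 3, where urllib2 became urllib.request, the same idea can be sketched roughly as follows. This is an illustration under stated assumptions, not part of the original answer: the local test server, the RedirectHandler class, the get_location helper and the TARGET value are all hypothetical and exist only so the example runs without network access.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical redirect target, mirroring the URL seen in the telnet session.
TARGET = "http://www.iana.org/domains/example/"


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical local server that always answers 302 with a Location header."""

    def do_GET(self):
        self.send_response(302)
        self.send_header("Location", TARGET)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


class NoRedirection(urllib.request.HTTPErrorProcessor):
    """Return every response unchanged so 3xx responses are not followed."""

    def http_response(self, request, response):
        return response

    https_response = http_response


def get_location(url):
    """Hypothetical helper: fetch url without following redirects."""
    # ProxyHandler({}) disables environment proxies; assumed here only so the
    # request to the local demo server is never routed through a proxy.
    opener = urllib.request.build_opener(
        NoRedirection, urllib.request.ProxyHandler({})
    )
    response = opener.open(url)
    try:
        return response.status, response.headers.get("Location")
    finally:
        response.close()


# Start the demo server on a free port and query it once.
server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status, location = get_location("http://127.0.0.1:%d/" % server.server_port)
server.shutdown()
server.server_close()
print(status, location)
```

Because NoRedirection replaces the default HTTPErrorProcessor, the redirect handler is never invoked, and the caller sees the 3xx status and its headers directly. The trade-off is that 4xx and 5xx responses are also returned as ordinary responses instead of raising HTTPError.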

score 3 · Accepted Answer

Use the geturl method of the file-like object returned by urlopen:

>>> f = urllib2.urlopen('http://www.example.com')
>>> f.geturl()
'http://www.iana.org/domains/example/'
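
One caveat worth illustrating: geturl() reports the URL of the resource that was finally retrieved, so with a chain of redirects it reflects the last hop, not the first Location header. The sketch below assumes Python 3 (urllib.request instead of urllib2) and uses a hypothetical local server with a made-up chain /start -> /middle -> /final; none of these names come from the original answer.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical redirect chain used only for this demo.
REDIRECTS = {"/start": "/middle", "/middle": "/final"}


class ChainHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: redirects along REDIRECTS, else answers 200."""

    def do_GET(self):
        if self.path in REDIRECTS:
            self.send_response(302)
            self.send_header("Location", REDIRECTS[self.path])
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), ChainHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

# ProxyHandler({}) disables environment proxies for the local demo request.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
response = opener.open(base + "/start")  # redirects are followed by default
final_url = response.geturl()            # URL of the last hop in the chain
response.close()
server.shutdown()
server.server_close()
print(final_url)
```

Here the first Location header is /middle, but final_url ends in /final. When only a single redirect is involved, as in the question, geturl() and the Location header point at the same place; when the intermediate value matters, disabling redirects is the more direct route.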
