python - How can I fetch a URL with urllib2 without declaring the URL scheme?

Question

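This may be a silly question, but can I fetch a URL with urllib2 without declaring a URL scheme such as http or https?

To clarify: instead of writing 'http://blahblah.com', I would like to write just 'blahblah.com'. Is that possible?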

score 0 · Accepted Answer

import urllib2

def open_url_with_default_protocol(url, *args, **kwargs):
    # Use the HTTP scheme by default if none is given;
    # pass all other arguments through to urllib2.urlopen.
    default_scheme = 'http://'

    # splittype returns (None, url) when the URL has no "scheme:" prefix
    scheme, address = urllib2.splittype(url)

    if not scheme:
        # Prepend the default scheme to the bare address
        url = default_scheme + url

    return urllib2.urlopen(url, *args, **kwargs)

So you can do this:

>>> open_url_with_default_protocol('http://google.com')
<addinfourl at 4496800872 whose fp = <socket._fileobject object at 0x10bd92b50>>
>>> open_url_with_default_protocol('google.com')
<addinfourl at 4331750464 whose fp = <socket._fileobject object at 0x1027960d0>>

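Note that this function will still fail if you pass it a URL of the form '//google.com', because it assumes that a URL without a scheme also has no leading double slash.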
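One way to cover the '//google.com' case is to normalize the URL string before opening it. The following is only a sketch, not part of the original answer: it assumes Python 3 (where urllib2 no longer exists and its pieces live in urllib.request and urllib.parse), and the helper name with_default_scheme is hypothetical. It treats something as a scheme only when it is followed by '://', so a bare 'host:port' is not mistaken for one; schemes without '//' (such as mailto:) are outside what it handles.

```python
import re

# A scheme here means: a letter, then letters/digits/+/./-, followed by "://".
# This is a deliberate simplification for http-like URLs.
_SCHEME_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*://')


def with_default_scheme(url, default_scheme='http'):
    """Return url unchanged if it starts with 'scheme://', else add one.

    Hypothetical helper for illustration only.
    """
    if url.startswith('//'):
        # Scheme-relative URL: only the "scheme:" part is missing.
        return default_scheme + ':' + url
    if _SCHEME_RE.match(url):
        # Already has an explicit scheme; leave it alone.
        return url
    # Bare host (optionally with port, path, or query).
    return default_scheme + '://' + url
```

The result can then be passed to urllib.request.urlopen in Python 3, or to urllib2.urlopen in Python 2, since the helper itself only manipulates strings.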
