1
import http.client
import urllib.parse

def unshorten_url(url):
    parsed = urllib.parse.urlparse(url)
    h = http.client.HTTPConnection(parsed.netloc)
    resource = parsed.path
    if parsed.query != "":
        resource += "?" + parsed.query
    h.request('HEAD', resource )
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return unshorten_url(response.getheader('Location')) # changed to process chains of short urls
    else:
        return url

unshorten_url("http://data.europa.eu/esco/occupation/00030d09-2b3a-4efd-87cc-c4ea39d27c34")

输入将是http : //data.europa.eu/esco/occupation/00030d09-2b3a-4efd-87cc-c4ea39d27c34 #yes 返回相同。

我需要的解压缩后的输出 URLhttps ://ec.europa.eu/esco/portal/occupation?uri=http%3A%2F%2Fdata.europa.eu%2Fesco%2Foccupation%2F00030d09-2b3a-4efd-87cc- c4ea39d27c34&conceptLanguage=en&full=true#&uri=http://data.europa.eu/esco/occupation/00030d09-2b3a-4efd-87cc-c4ea39d27c34 '

4

1 回答 1

0

如您所见,我有两个 URL,一个是我的输入短 URL,另一个是完整 URL,为了实现所需的输出 URL,我从一组相同类型的 URL 中识别出一个模式。我编写了这段代码并实现了所需的输出。

my_url = "http://data.europa.eu/esco/occupation/00030d09-2b3a-4efd-87cc-c4ea39d27c34"
a="https://ec.europa.eu/esco/portal/occupationuri=http%3A%2F%2Fdata.europa.eu%2Fesco%2Foccupation%2F"
b = my_url.split("/")[-1]
URL = a+ b+ "&conceptLanguage=en&full=true#&uri=" + my_url

输出即;所需的完整 URL 是 URL。URL = " https://ec.europa.eu/esco/portal/occupation?uri=http%3A%2F%2Fdata.europa.eu%2Fesco%2Foccupation%2F00030d09-2b3a-4efd-87cc-c4ea39d27c34&conceptLanguage=en&full=true #&uri=http://data.europa.eu/esco/occupation/00030d09-2b3a-4efd-87cc-c4ea39d27c34 '"

于 2019-11-28T07:32:51.997 回答