0

I am currently getting output like this:

http://www.site.com/prof.php?pID=478http://www.site.com/prof.php?pID=693

After applying the suggestion from the commenter below, I have:

urls = [el.url for el in domainLinkOutput]
return HttpResponse(urls)

How can I convert this output into a Python dictionary, for example:

urls = { '0': 'http://www.site.com/prof.php?pID=478', '1': 'http://www.site.com/prof.php?pID=693' }
4

3 Answers

1

I don't believe you need regex here - just use attribute access on the Link objects you have...

If you have a list of Link objects, then use something like:

urls = [el.url for el in list_of_objects]

You should just be able to get the url by Link_object.url...
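
To get the dictionary shape asked for in the question, the same attribute access can feed a dict comprehension over enumerate(). This is only a minimal sketch: the Link namedtuple below is a hypothetical stand-in for the real Link objects, and the json.dumps step is an assumption about how you might want to serialize the result before handing it to HttpResponse (passing a list straight in appears to be what produces the run-together output).

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the Link objects from the question;
# only the .url attribute matters here.
Link = namedtuple("Link", ["url"])

domainLinkOutput = [
    Link(url="http://www.site.com/prof.php?pID=478"),
    Link(url="http://www.site.com/prof.php?pID=693"),
]

# enumerate() yields (index, item) pairs; str(i) produces the '0', '1', ... keys.
urls = {str(i): el.url for i, el in enumerate(domainLinkOutput)}

# Serialize explicitly so the response body is a single well-formed string.
body = json.dumps(urls)
print(body)
```

If integer keys are acceptable, dict(enumerate(...)) over the list of URLs would do the same job without the str() call.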

answered 2013-08-09T16:26:41.853
1

Use this regular expression to match the URLs:

url='([^']+)'

Sample output:

    [0] => http://www.somesite.com/prof.php?pID=478
    [1] => http://www.somesite.com/prof.php?pID=527
    [2] => http://www.somesite.com/prof.php?pID=645

If you want to exclude the query parameters, use

url='([^'?]+)

Sample output:

    [0] => http://www.somesite.com/prof.php
    [1] => http://www.somesite.com/prof.php
    [2] => http://www.somesite.com/prof.php
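
In Python, both patterns can be tried with re.findall. This is a rough sketch that assumes the input is a plain string in which each link appears as url='...'; the sample text below is made up for illustration and is not the asker's actual data.

```python
import re

# Assumed input: a string where every link shows up as url='...'.
text = (
    "Link(url='http://www.somesite.com/prof.php?pID=478') "
    "Link(url='http://www.somesite.com/prof.php?pID=527')"
)

# Capture everything between the quotes, query string included.
full_urls = re.findall(r"url='([^']+)'", text)

# Stop at the first '?' (or quote), which drops the query string.
base_urls = re.findall(r"url='([^'?]+)", text)

print(full_urls)
print(base_urls)
```

The second pattern deliberately has no closing quote: once the capture stops at '?', the next character is not a quote, so requiring one would make the match fail.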
answered 2013-08-09T16:30:57.517
0

You can try re.finditer:

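import re

# text is the string that contains the url='...' entries
r = re.compile(r"url='(.*?)'")
for match in r.finditer(text):
    print(match.group(1))  # group is a method, so call it with parentheses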

See the Python documentation for the re module for details.

answered 2013-08-09T16:40:13.750