
I'm having a problem with the replace function. I can print the links, but I can't replace text in them. What code am I missing?

import urllib2
import re

htmlfile = urllib2.urlopen('http://sample.html')
htmltext = htmlfile.read()
regex = "'nav_a'>(.+?)</a></li>"
pattern = re.compile(regex)
link = re.findall(pattern,htmltext)

downloadlink = link.replace("*text to replace*", "*replace with*")

print (downloadlink)

1 Answer


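According to the documentation for re.findall(), it returns a list of all non-overlapping matches in the string. So link.replace() fails because link is a list, not a string, and lists have no replace() method.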
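A quick way to confirm this is to inspect the return value directly. The sketch below uses a made-up HTML string (`sample_html`, an assumption for illustration) in place of the downloaded page:

```python
import re

# Hypothetical snippet standing in for the real page content.
sample_html = "<li><a class='nav_a'>Home</a></li><li><a class='nav_a'>Files</a></li>"

matches = re.findall(r"'nav_a'>(.+?)</a></li>", sample_html)

print(type(matches).__name__)       # the result is a list, not a str
print(hasattr(matches, "replace"))  # lists have no replace() method
print(matches)
```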

You have to loop over each element of link and do the replacement on each one. For example:

links = re.findall(pattern,htmltext)
downloadlinks = []

for link in links:
    downloadlinks.append(link.replace("*text to replace*", "*replace with*"))

print(downloadlinks)
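The same loop can also be written as a list comprehension. This is only a sketch: the `sample_html` string and the "/old/" and "/new/" values are placeholders invented for the example, not taken from the question.

```python
import re

# Hypothetical page content; the real code would read it from the URL.
sample_html = (
    "<li><a class='nav_a'>/old/a.zip</a></li>"
    "<li><a class='nav_a'>/old/b.zip</a></li>"
)

pattern = re.compile(r"'nav_a'>(.+?)</a></li>")
links = pattern.findall(sample_html)

# Apply str.replace() to each matched string and collect the results.
downloadlinks = [link.replace("/old/", "/new/") for link in links]

print(downloadlinks)
```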

Edit (converting the list to a str):

links = re.findall(pattern,htmltext)
downloadlinks = ''

for i, link in enumerate(links):
    if i == 0:
        downloadlinks += link
    else:
        downloadlinks += ' - ' + link

print(downloadlinks)
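If the goal is just a single string with ' - ' between the items, str.join() should give the same result as the enumerate loop in one call. A minimal sketch, with a made-up `links` list standing in for the re.findall() result:

```python
# Hypothetical list standing in for the output of re.findall().
links = ["a.zip", "b.zip", "c.zip"]

# join() puts the separator between items only, so no special case
# is needed for the first element.
downloadlinks = " - ".join(links)

print(downloadlinks)
```

An empty list yields an empty string, which matches what the loop produces.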
Answered 2013-06-29T23:32:35.710