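I followed the tutorial exactly. I want my scraper to collect every link that points to the page with the details of an individual police station, but instead it returns almost the entire site.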
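from urllib import urlopen
import re

# Download the page; read() returns the HTML as a string, so close the response, not the string.
response = urlopen("http://www.emergencyassistanceuk.co.uk/list-of-uk-police-stations.html")
f = response.read()
response.close()

# Intended to capture the href of each police-station link.
b = re.compile('<span class="listlink-police"><a href="(.*)">')
a = re.findall(b, f)

# Print the list of matches 16 times, as in the tutorial.
for i in range(0, 16):
    print a
    print "\n"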
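
A likely cause is the greedy `(.*)` in the pattern: if several links sit on the same line of the HTML, `.*` runs to the last `">` on that line and swallows everything in between. The sketch below compares the greedy pattern with a non-greedy `(.*?)` on a made-up two-link snippet. The `html` string is a hypothetical stand-in, not the real page's markup, and the example is written for Python 3 (the regex behaves the same in Python 2).

```python
import re

# Hypothetical sample markup (an assumption): two links on a single line.
html = (
    '<span class="listlink-police"><a href="/a.html">A</a></span>'
    '<span class="listlink-police"><a href="/b.html">B</a></span>'
)

# Greedy: (.*) extends to the LAST '">' on the line.
greedy = re.compile(r'<span class="listlink-police"><a href="(.*)">')
# Non-greedy: (.*?) stops at the FIRST '">' after each href.
lazy = re.compile(r'<span class="listlink-police"><a href="(.*?)">')

greedy_links = greedy.findall(html)
lazy_links = lazy.findall(html)

print(greedy_links)  # one long match spanning both links
print(lazy_links)    # one short match per link
```

If this is what is happening on the real page, replacing `(.*)` with `(.*?)` (or with `([^"]*)`, which cannot cross a closing quote) in the original pattern should be enough to return only the link targets.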