I have this link:
http://www.brothersoft.com/windows/categories.html
I'm trying to get the links of the items inside the div. For example:
http://www.brothersoft.com/windows/mp3_audio/midi_tools/
I tried this code:
import urllib
from bs4 import BeautifulSoup
url = 'http://www.brothersoft.com/windows/categories.html'
pageHtml = urllib.urlopen(url).read()
soup = BeautifulSoup(pageHtml)
sAll = [div.find('a') for div in soup.findAll('div', attrs={'class':'brLeft'})]
for i in sAll:
    print "http://www.brothersoft.com"+i['href']
But the only output I get is:
http://www.brothersoft.com/windows/mp3_audio/
How can I get the output I need?
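One likely cause is that div.find('a') returns only the first anchor inside each matching div, while find_all('a') returns all of them. Below is a minimal sketch of that idea, written for Python 3 rather than the Python 2 used above. SAMPLE_HTML is a made-up stand-in for the real page, and collect_links is a hypothetical helper name; the actual markup on brothersoft.com may differ, so the selector might need adjusting.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = 'http://www.brothersoft.com'

# Hypothetical stand-in for the real page; the actual markup may differ.
SAMPLE_HTML = '''
<div class="brLeft">
  <a href="/windows/mp3_audio/">MP3 &amp; Audio</a>
  <a href="/windows/mp3_audio/midi_tools/">MIDI Tools</a>
</div>
'''


def collect_links(page_html, base_url=BASE_URL):
    """Return absolute URLs for every anchor inside each div.brLeft."""
    soup = BeautifulSoup(page_html, 'html.parser')
    links = []
    for div in soup.find_all('div', class_='brLeft'):
        # find_all('a') yields every anchor in the div,
        # whereas find('a') would stop at the first one.
        for anchor in div.find_all('a', href=True):
            links.append(urljoin(base_url, anchor['href']))
    return links


if __name__ == '__main__':
    for link in collect_links(SAMPLE_HTML):
        print(link)
```

To run this against the live page in Python 3, the HTML could be fetched with urllib.request.urlopen(url).read() and passed to collect_links in place of SAMPLE_HTML.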