1

使 Mechanize 跟随br.follow_link位于某个 div 内的链接()的最 Pythonic 方式是什么?我知道如何在 BeautifulSoup 的帮助下做到这一点,但是有没有办法用 Mechanize 做到这一点?

示例 div:

<div id="blah_links">
 <a href="LINK1" class="active">1</a> |
 <a href="LINK2">2</a> |
 <a href="LINK3">3</a> |
 <a href="LINK4">NEXT</a>
</div>
4

1 回答 1

1

我最近遇到了类似的问题,这就是我所做的

url = "www.somewhere.com"
br = mechanize.Browser()
br.open(url)

encoded_data = UnicodeDammit(br.response().read(),isHTML=True).unicode
parser = lxml_html.fromstring(encoded_data)

soup_xpath = "//div[@id='BODYCON']//a/@href"
valid_links = soup.xpath(soup_xpath)
links  = [ link for link if link.url in valid_links ] 
于 2012-07-27T14:29:42.997 回答