Can someone help me traverse an HTML tree with Beautiful Soup?
I am trying to parse some HTML output with Python/Django and, after collecting each value, insert it into a table named Tld. The HTML looks like this:
<div class="rc" data-hveid="53">
<h3 class="r">
<a href="https://billing.anapp.com/" onmousedown="return rwt(this,'','','','2','AFQjCNGqpb38ftdxRdYvKwOsUv5EOJAlpQ','m3fly0i1VLOK9NJkV55hAQ','0CDYQFjAB','','',event)">Billing: Portal Home</a>
</h3>
I want to parse only the value of the href attribute of the <a> element, so only this part:
https://billing.anapp.com/
from:
<a href="https://billing.anapp.com/" onmousedown="return rwt(this,'','','','2','AFQjCNGqpb38ftdxRdYvKwOsUv5EOJAlpQ','m3fly0i1VLOK9NJkV55hAQ','0CDYQFjAB','','',event)">Billing: Portal Home</a>
I currently have:
for url in urls:
    mb.open(url)
    beautifulSoupObj = BeautifulSoup(mb.response().read())
    beautifulSoupObj.find_all('h3',attrs={'class': 'r'})
The problem is with the find_all call above: it stops at the <h3> and does not go far enough to reach the <a> element.
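To make the goal concrete, here is a minimal sketch of the kind of result I am after, assuming the bs4 package and the built-in "html.parser" backend. The helper name extract_hrefs and the inline html string are hypothetical, used only for illustration; the sample markup is the snippet above with the onmousedown value shortened and a closing </div> added. It may well not be the best approach.

```python
from bs4 import BeautifulSoup

# Sample markup based on the snippet in the question. The onmousedown value
# is shortened and a closing </div> is added (both are assumptions made so
# the example is compact and parses cleanly).
html = """
<div class="rc" data-hveid="53">
<h3 class="r">
<a href="https://billing.anapp.com/" onmousedown="return rwt(this)">Billing: Portal Home</a>
</h3>
</div>
"""


def extract_hrefs(markup):
    """Return the href of the first <a> inside each <h3 class="r">.

    extract_hrefs is a hypothetical helper name, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(markup, "html.parser")
    hrefs = []
    for h3 in soup.find_all("h3", attrs={"class": "r"}):
        # Descend one step further than find_all did: from the <h3> to its <a>.
        link = h3.find("a", href=True)
        if link is not None:
            hrefs.append(link["href"])
    return hrefs


print(extract_hrefs(html))  # ['https://billing.anapp.com/']
```

In the loop over urls, the markup argument would presumably be mb.response().read(), and each returned value would then be handed to whatever code writes into the Tld table.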
Any help is greatly appreciated. Thank you.