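I'm writing a spider and need to find out which link is the "next page" link, so I want to select the element whose text is "next page" and then read its link. The input is not a single tag but the complete HTML source of a page, and I want to extract that specific link from it.
If I want to get an element like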
`<a href="http://*****">..</a>`
I can use
`'a[href^="http"]'`
To match on the link text, I tried
`'a[text="value"]'`
Try the `:contains` pseudo-class:
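    from pyquery import PyQuery as pq

    doc = pq("<html><body><a href='https://stackoverflow.com'>Next page</a><p>...Next time...</p></body></html>")

    # :contains matches on text content; the "a" prefix restricts it to links,
    # so the <p> that also contains "Next" is not selected
    el = doc('a:contains("Next")')

    el.text()        # 'Next page'
    el.attr['href']  # 'https://stackoverflow.com'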
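
Note that `:contains` is a substring match, so a second link reading "Next time" would be selected too. If an exact match on the link text is needed, one option is to collect each link's text and compare it yourself. The following is only a minimal sketch of that idea using Python's standard-library `html.parser` instead of pyquery. The names `LinkTextFinder` and `find_link_by_text`, and the `example.com` URL, are made up for illustration and are not part of any library.

```python
from html.parser import HTMLParser


class LinkTextFinder(HTMLParser):
    """Collect (href, text) pairs for every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []        # finished (href, text) pairs
        self._href = None      # href of the <a> currently open, if any
        self._in_link = False  # True while we are inside an <a>...</a>
        self._chunks = []      # text pieces seen inside the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._href = dict(attrs).get("href")
            self._chunks = []

    def handle_data(self, data):
        if self._in_link:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.links.append((self._href, "".join(self._chunks).strip()))
            self._in_link = False


def find_link_by_text(html, text):
    """Return the href of the first <a> whose text equals `text`, else None."""
    finder = LinkTextFinder()
    finder.feed(html)
    finder.close()
    for href, link_text in finder.links:
        if link_text == text and href is not None:
            return href
    return None


# Hypothetical page with two links whose text both start with "Next"
html = ("<html><body><a href='https://stackoverflow.com'>Next page</a>"
        "<a href='https://example.com/next-time'>Next time</a></body></html>")
print(find_link_by_text(html, "Next page"))  # https://stackoverflow.com
```

The comparison here is exact and case-sensitive; a real spider would probably want to normalize whitespace and case before comparing.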