python - Setting up a specific regular expression in Python

Question

I'm new to Python, and I need a regular expression to retrieve the title and the link from markup in this format:

<a href="anything" class="anything" title="Size: anything">anything</a>

score 4 · Accepted Answer

You would be much better off using a proper HTML parser. BeautifulSoup is a good choice and is extensively documented. For example:

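from bs4 import BeautifulSoup

# Sample input taken from the format in the question
html_doc = '<a href="anything" class="anything" title="Size: anything">anything</a>'
soup = BeautifulSoup(html_doc, 'html.parser')  # name the parser explicitly

# Match every <a> element whose class is 'anything'
for link in soup.find_all('a', class_='anything'):
    print(link['href'], link.text)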

This finds all <a> elements with the class anything, then prints their URL and link text.
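The question also asks for the title attribute, which the loop above does not print. As a rough sketch of the same parser-based idea, here is a version that uses only the standard library's html.parser (swapped in for BeautifulSoup so the example has no third-party dependency). The class name LinkCollector, the "Size:" prefix check, and the sample markup are illustrative assumptions, not part of the original answer.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, title) pairs from <a> tags whose title starts with 'Size:'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        attr_map = dict(attrs)  # attrs arrives as a list of (name, value) tuples
        title = attr_map.get('title') or ''
        if title.startswith('Size:'):
            self.links.append((attr_map.get('href'), title))


# Hypothetical sample input modelled on the format in the question
sample_html = (
    '<a href="/file/1" class="dl" title="Size: 10 MB">first</a>'
    '<a href="/about" class="nav">about</a>'
)

collector = LinkCollector()
collector.feed(sample_html)
for href, title in collector.links:
    print(href, title)
```

Because the parser hands over attributes as name/value pairs, the order in which they appear in the markup does not matter.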

Regular expressions are generally not the right tool for parsing HTML.
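One way to see why is the small sketch below: a pattern written for the exact attribute order shown in the question stops matching as soon as the same attributes are reordered. The pattern and both sample strings are illustrative assumptions, not taken from the original answer.

```python
import re

# A pattern that assumes the exact attribute order: href, class, title
pattern = re.compile(
    r'<a href="([^"]*)" class="[^"]*" title="(Size: [^"]*)">'
)

as_asked = '<a href="/file/1" class="dl" title="Size: 10 MB">first</a>'
reordered = '<a class="dl" title="Size: 10 MB" href="/file/1">first</a>'

match = pattern.search(as_asked)
print(match.groups())             # href and title are both captured
print(pattern.search(reordered))  # no match: same link, different attribute order
```

Extra whitespace, single quotes, or additional attributes would break the pattern in the same way, whereas an HTML parser treats all of these variants identically.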
