python - Regular expressions, Python, and document comments

Question

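I'm programming in Python 2.7 and using beautifulsoup4 to extract information from the tags of a series of documents. However, the documents contain strings like the following: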

<!-- PJG ITAG l=90 g=1 f=4 -->

I'd like to get rid of them, but I'm no expert in regular expressions. Can anyone help?

score 3 · Accepted Answer

First, load your HTML into BeautifulSoup:

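from bs4 import BeautifulSoup, Comment
# the_html is the document markup as a string; naming the parser
# explicitly keeps the result the same from one machine to the next
soup = BeautifulSoup(the_html, "html.parser")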

Then remove all the HTML comments:

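# Collect every text node that is an HTML comment
comments = soup.find_all(text=lambda text: isinstance(text, Comment))
for comment in comments:
    # extract() detaches the comment node from the tree
    comment.extract()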
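Since the question asked about regular expressions, here is a rough regex-only sketch as an alternative to the BeautifulSoup approach above. It is not part of the original answer. It assumes the comments are plain `<!-- ... -->` spans that do not occur inside scripts or CDATA sections; the names `COMMENT_RE`, `strip_comments`, and the `sample` string are made up for illustration, and only the comment text itself comes from the question.

```python
import re

# Matches "<!--", then as few characters as possible (newlines included
# because of re.DOTALL), then the closing "-->".
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(html):
    """Return html with every <!-- ... --> span removed."""
    return COMMENT_RE.sub("", html)


# Hypothetical input; only the comment string is taken from the question.
sample = "<p>Title<!-- PJG ITAG l=90 g=1 f=4 --> text</p>"
print(strip_comments(sample))
```

For markup that is already being parsed with BeautifulSoup, removing `Comment` nodes is probably the more robust choice, because a regex has no idea where in the document a match sits.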
