python - Difference between "findAll" and "find_all" in BeautifulSoup

Question

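I want to parse an HTML file with Python, and the module I am using is BeautifulSoup.

I have read that the function find_all is the same as findAll. I have tried both, but I believe they are different: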

import urllib, urllib2, cookielib
from BeautifulSoup import *
site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"

rqstr = urllib2.Request(site)
rq = urllib2.urlopen(rqstr)
fchData = rq.read()

soup = BeautifulSoup(fchData)

t = soup.findAll('tr')

Can anyone tell me the difference?

score 73 · Accepted Answer

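In BeautifulSoup version 4, the methods are exactly the same; the mixed-case versions (findAll, findAllNext, nextSibling, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See Method Names for a full list.

In new code, you should use the lowercase versions, so find_all, etc.

In your example, however, you are using BeautifulSoup version 3 (discontinued since March 2012; don't use it if you can help it), where only findAll() is available. Unknown attribute names (such as .find_all, which is only available in BeautifulSoup 4) are treated as if you are searching for a tag by that name. There is no <find_all> tag in your document, so None is returned for that.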
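To make the version 3 behaviour concrete, here is a rough sketch of the mechanism using only the standard library. ToyTag is a hypothetical stand-in written for illustration, not BeautifulSoup's actual code; it only assumes that unknown attribute names fall back to a tag search, as described above.

```python
class ToyTag:
    """Hypothetical stand-in for a BeautifulSoup 3 tag; not the library's real code."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def findAll(self, name):
        """Return every descendant tag called `name`, in document order."""
        found = []
        for child in self.children:
            if child.name == name:
                found.append(child)
            found.extend(child.findAll(name))
        return found

    def find(self, name):
        """Return the first descendant called `name`, or None if there is none."""
        matches = self.findAll(name)
        return matches[0] if matches else None

    def __getattr__(self, attr):
        # Python calls this only when normal attribute lookup fails.
        if attr.startswith("__"):
            raise AttributeError(attr)
        # Assumption: unknown names are treated as a search for that tag.
        return self.find(attr)


doc = ToyTag("html", [ToyTag("table", [ToyTag("tr"), ToyTag("tr")])])

rows = doc.findAll("tr")   # real method: a list of two tags
table = doc.table          # unknown attribute: looked up as a <table> tag
missing = doc.find_all     # no <find_all> tag exists, so this is None

try:
    doc.find_all("tr")     # this is None("tr")
    call_error = None
except TypeError as exc:
    call_error = exc       # 'NoneType' object is not callable
```

Under this model, findAll works as expected, while find_all silently evaluates to None and only fails once you try to call it, which would explain why the two names appear to behave differently.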

score 11 · Accepted Answer

From the source code of BeautifulSoup:

http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/element.py#L1260

def find_all(self, name=None, attrs={}, recursive=True, text=None,
                 limit=None, **kwargs):
# ...
# ...

findAll = find_all       # BS3
findChildren = find_all  # BS2
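The aliasing pattern in the excerpt above can be tried out in isolation. This is a minimal sketch; Finder and its simplified find_all are hypothetical and only mirror the class-level assignment, not the real search logic.

```python
class Finder:
    """Hypothetical class; only the aliasing pattern mirrors the excerpt above."""

    def __init__(self, items):
        self.items = list(items)

    def find_all(self, name=None, limit=None):
        # Keep items equal to `name` (or everything when name is None).
        matches = [item for item in self.items if name is None or item == name]
        return matches if limit is None else matches[:limit]

    # The old mixed-case name is bound to the very same function object.
    findAll = find_all


finder = Finder(["tr", "td", "tr"])

same_function = Finder.findAll is Finder.find_all
same_result = finder.findAll("tr") == finder.find_all("tr")
```

Because both names refer to one function object, there is no wrapper and no separate code path: calling either name runs identical code.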
