python - Parsing a class in BeautifulSoup 4

Question

Basically, I want to access an element in an HTML table.

Here is my code:

import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.google.com/finance?q=NYSE%3Aibm&ei=Hz4oVZq-PISjiQKYu4GoAQ')
soup = BeautifulSoup(r.content)
td = soup.find_all('td', class_='ctsymbol')

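I get nothing back, just an empty list: []

I tried the same approach on the same td, but this time against a local text file, and that seems to work fine. What am I doing wrong?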

score 0 · Accepted Answer

There is simply no such element in the page that the server returns:

>>> import requests
>>> from bs4 import BeautifulSoup
>>> r = requests.get('http://www.google.com/finance?q=NYSE%3Aibm&ei=Hz4oVZq-PISjiQKYu4GoAQ')
>>> soup = BeautifulSoup(r.content)
>>> {c for e in soup.find_all('td') if 'class' in e.attrs for c in e['class']}
set(['name', 'val', 'p', 'i', 'period', 'itxt', 'lft', 't', 'key', 'colHeader', 'linkbtn'])

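That is the set of all classes used on <td> elements in the HTML that was actually served. Keep in mind that you cannot rely on the element tree shown in your browser's developer tools, because it reflects the page after its JavaScript code has run. requests only fetches the raw HTML and never executes scripts, so any element that JavaScript adds later will be missing from what BeautifulSoup parses. That would also explain why your local file worked, if it was saved from the browser after rendering.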
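To make the same check repeatable, here is a minimal sketch that collects the classes used on a given tag in a piece of raw HTML. It deliberately uses only the standard library's html.parser instead of BeautifulSoup so that it runs without installing anything or touching the network. The names TagClassCollector and classes_on, as well as the sample markup, are made up for illustration; in practice you would pass in r.text from your own request.

```python
from html.parser import HTMLParser


class TagClassCollector(HTMLParser):
    """Collect every CSS class used on a given tag in raw HTML."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag != self.tag:
            return
        for name, value in attrs:
            # A class attribute may hold several space-separated names.
            if name == "class" and value:
                self.classes.update(value.split())


def classes_on(html, tag="td"):
    """Return the set of class names found on `tag` elements in `html`."""
    collector = TagClassCollector(tag)
    collector.feed(html)
    collector.close()
    return collector.classes


# Hypothetical markup standing in for the served page (r.text in the question).
SAMPLE_HTML = """
<table>
  <tr><td class="name lft">IBM</td><td class="val">160.40</td></tr>
  <tr><td>no class here</td></tr>
</table>
"""

found = classes_on(SAMPLE_HTML)
print(sorted(found))        # classes present in the raw HTML
print("ctsymbol" in found)  # False means find_all(..., class_='ctsymbol') would be empty
```

If the class you are looking for is missing from this set but visible in the browser, the element is most likely created by JavaScript, and you would need either the underlying data source or a tool that renders the page.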
