import urllib2
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/weather/usa/tucson').read())
for row in soup('table', {'class' : 'rpad'})[0].tbody('tr'):
  tds = row('td')
  print tds[0].string, tds[1].string

When I run this, I get the error "Nonetype object not callable".

2 Answers

import urllib2
from BeautifulSoup import BeautifulSoup

soup = BeautifulSoup(urllib2.urlopen('http://www.timeanddate.com/weather/usa/tucson').read())

>>> print soup('table', {'class' : 'rpad'})[0]
<table class="rpad"><tr><td>Location:</td><td>Davis-Monthan Air Force Base</td></tr><tr><td>Temperature:</td><td>25&nbsp;°C</td></tr><tr><td>Comfort Level:</td><td>26&nbsp;°C</td></tr><tr><td>Dew point:</td><td>21&nbsp;°C</td></tr><tr><td>Pressure:</td><td>1009 millibars</td></tr><tr><td>Humidity:</td><td>77%</td></tr><tr><td>Visibility:</td><td>16 km</td></tr><tr><td>Wind:</td><td>11 km/h from 280&deg; West<img src="http://c.tadst.com/gfx/comp/sa8.png" width="14" height="14" alt="Direction East" title="Wind blowing from West to East" /></td></tr><tr><td>Last update:</td><td>Tue 9:55 PM MST</td></tr></table>

>>> 'tbody' in soup('table', {'class' : 'rpad'})[0]
False
>>> print soup('table', {'class' : 'rpad'})[0].tbody
None

The table has no tbody element, so .tbody returns None. None is not callable, i.e. you cannot call None('tr').
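
The failure can be reproduced without any HTML at all. The following is a minimal sketch in Python 3, where the variable tbody is only a stand-in (an assumption for illustration) for whatever .tbody returns when the markup contains no tbody tag:

```python
# Stand-in for the value of `table.tbody` when the table has no <tbody> tag.
tbody = None

try:
    # Same shape as the failing call in the question: calling None like a function.
    tbody('tr')
except TypeError as exc:
    message = str(exc)

# Show the text of the TypeError that Python raised.
print(message)
```

Checking the value against None before calling it, or skipping tbody altogether, avoids the exception.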

answered 2012-08-22T06:11:49.753

If you look at the page source of your URL

http://www.timeanddate.com/weather/usa/tucson

you will see that the table element with class rpad has no tbody child element.

<table class=rpad><tr><td>Location:</td><td>Davis-Monthan Air Force Base</td></tr><tr><td>Temperature:....

You need to keep this structure in mind when extracting the data. Iterate over the tr and td elements directly.
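
As a rough illustration of walking tr and td directly, here is a sketch that swaps in Python 3's standard-library html.parser in place of BeautifulSoup, so that it runs without extra packages. The class name RowCollector and the shortened HTML string are assumptions made for the example, and the sketch does not handle nested tables.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the text of each <td>, grouped by <tr>, without relying on <tbody>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False   # True while the parser is inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self.rows.append([])
        elif tag == 'td' and self.rows:
            self.rows[-1].append('')
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


# Shortened copy of the page's markup: rows sit directly under <table>.
html = ('<table class=rpad>'
        '<tr><td>Location:</td><td>Davis-Monthan Air Force Base</td></tr>'
        '<tr><td>Humidity:</td><td>77%</td></tr>'
        '</table>')

parser = RowCollector()
parser.feed(html)
parser.close()

for label, value in parser.rows:
    print(label, value)
```

The same idea applies to the original BeautifulSoup code: look up the tr elements on the table itself instead of going through tbody.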

answered 2012-08-22T06:14:44.773