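I'm learning Beautiful Soup and trying to extract all the links from the page at http://www.popsci.com ... but I'm getting an error.
This code looks like it should work, but it fails on every page I've tried, and I'm trying to figure out why.
Here is my code: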
from BeautifulSoup import BeautifulSoup
import urllib2
url="http://www.popsci.com/"
page=urllib2.urlopen(url)
soup = BeautifulSoup(page.read())
sci=soup.findAll('a')
for eachsci in sci:
    print eachsci['href']+","+eachsci.string
...and this is the error I get:
Traceback (most recent call last):
  File "/root/Desktop/3.py", line 12, in <module>
    print eachsci['href']+","+eachsci.string
TypeError: coercing to Unicode: need string or buffer, NoneType found
[Finished in 1.3s with exit code 1]
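
Reading the traceback, this looks like a runtime TypeError rather than a syntax error: eachsci.string is None whenever an <a> tag contains more than one child node (or only an image), and concatenating a unicode string with None raises exactly this message. Below is a minimal sketch of one way around it. It assumes Beautiful Soup 4 (the bs4 package) instead of the BeautifulSoup 3 import used above, and the helper names format_link and extract_links are hypothetical, made up for this illustration.

```python
def format_link(href, text):
    """Return 'href,text', treating a missing (None) value as an empty string."""
    return u"{0},{1}".format(href or u"", (text or u"").strip())


def extract_links(html):
    """Return one 'href,text' string per <a> tag that has an href attribute."""
    # Assumption: Beautiful Soup 4 (the bs4 package) is installed; the
    # original post uses the older BeautifulSoup 3 import instead.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    lines = []
    for a in soup.find_all("a", href=True):  # skip anchors without an href
        # get_text() joins all nested text and never returns None, unlike
        # .string, which is None when the tag has several children.
        lines.append(format_link(a["href"], a.get_text()))
    return lines
```

Calling extract_links(page.read()) on the downloaded page and printing each returned line should then give the href,text pairs without the TypeError. Filtering with href=True also sidesteps the KeyError that eachsci['href'] would raise for anchors that have no href attribute.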