python - Cannot find the value of the "href" attribute in "a" tags, but the same approach works for the "class" attribute in "table" tags

Question

import requests
r=requests.get('https://www.crummy.com/software/BeautifulSoup/')
from bs4 import BeautifulSoup as bs
soup=bs(r.text,'html.parser')
links=[x['href'] for x in soup.find_all('a')]
links

The error is:

KeyError                                  Traceback (most recent call last)
<ipython-input-137-97ef77b6e69a> in <module>
----> 1 links=[x['href'] for x in soup.find_all('a')]
      2 links

<ipython-input-137-97ef77b6e69a> in <listcomp>(.0)
----> 1 links=[x['href'] for x in soup.find_all('a')]
      2 links

~\anaconda3\lib\site-packages\bs4\element.py in __getitem__(self, key)
   1319         """tag[key] returns the value of the 'key' attribute for the Tag,
   1320         and throws an exception if it's not there."""
-> 1321         return self.attrs[key]
   1322 
   1323     def __iter__(self):

KeyError: 'href'

However, the following code works fine:

import requests
r=requests.get('https://en.wikipedia.org/wiki/Harvard_University')
from bs4 import BeautifulSoup as bs
soup=bs(r.text,'html.parser')
classes=[table['class'] for table in soup.find_all('table')]
classes

score 0 · Accepted Answer

The first site contains the following element:

<a name="Download">

This anchor has no href attribute (it is not a link; it serves as the target of the #Download fragment), so x['href'] raises a KeyError for it.
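
To check which anchors are responsible, you can list the ones that lack an href. This is a minimal sketch that uses a short inline HTML string as a stand-in for the real page (the markup below is an assumption for illustration, not the site's actual source):

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: one named anchor and one real link.
html = '<a name="Download"></a><a href="bs4/doc/">docs</a>'
soup = BeautifulSoup(html, "html.parser")

# Tag.has_attr reports whether an attribute is present without raising.
missing = [a for a in soup.find_all("a") if not a.has_attr("href")]
missing_attrs = [a.attrs for a in missing]
print(missing_attrs)  # [{'name': 'Download'}]
```

The table example in the question presumably succeeds only because every table on that page happens to carry a class attribute; table['class'] would raise the same KeyError for a table without one.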

You can use a CSS selector to restrict the results to the tags that actually have an href attribute.

links=[x['href'] for x in soup.select('a[href]')]
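
Two other ways to avoid the KeyError, sketched on an inline HTML string (hypothetical markup, not the real page): pass href=True to find_all so only tags with that attribute are returned, or use Tag.get, which returns None instead of raising when the attribute is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: a named anchor followed by two links.
html = (
    '<a name="Download"></a>'
    '<a href="bs4/doc/">docs</a>'
    '<a href="#Download">get it</a>'
)
soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: only <a> tags that have an href.
via_select = [a["href"] for a in soup.select("a[href]")]

# Keyword filter: href=True matches tags where the attribute exists.
via_find_all = [a["href"] for a in soup.find_all("a", href=True)]

# Tag.get keeps every anchor and yields None where href is absent.
via_get = [a.get("href") for a in soup.find_all("a")]

print(via_select)
print(via_get)
```

The first two forms drop the named anchor entirely; the Tag.get form keeps it as None, which may be useful if you need the results to line up one-to-one with the anchors on the page.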
