python - How do I read the content of an HTML tag with BeautifulSoup?

Question

<td class="tag">
    <a href="/tag/android"  rel="tag">
         <img src="http://127.0.0.1/idf2.png" >
    android
    </a>          
</td>

Code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html)
print(soup.td.a.string)   # prints None

Which method in BeautifulSoup4 retrieves the content of <a>, i.e. android?

score 0 · Accepted Answer

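Use .text, not .string. .string returns None here because the <a> tag has more than one child (the <img> and the text), so there is no single string to return: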

>>> soup.td.a.text.strip()
u'android'

Note that I've stripped the result; otherwise it would also include the newlines.
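The difference between the two attributes can be checked with a minimal sketch like the one below. It assumes the bs4 package is installed and passes "html.parser" explicitly, which the original snippet did not do; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# The markup from the question, reproduced as a string.
html = """
<td class="tag">
    <a href="/tag/android" rel="tag">
         <img src="http://127.0.0.1/idf2.png">
    android
    </a>
</td>
"""

# Assumption: the built-in "html.parser" backend is used.
soup = BeautifulSoup(html, "html.parser")
link = soup.td.a

# .string is None because <a> contains several children
# (whitespace, the <img> tag, and the text node).
string_value = link.string

# .text concatenates every text node under the tag, whitespace included,
# so it still needs stripping.
text_value = link.text.strip()

# get_text(strip=True) strips each text node before joining them.
stripped_value = link.get_text(strip=True)

print(string_value)
print(text_value)
print(stripped_value)
```

get_text(strip=True) is an alternative to calling .strip() on .text; for this markup both give the same result.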

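Also, you should probably consider some other way of locating the exact a tag you want to extract text from, since a page may contain many a tags and soup.td.a only gives you the first one. That, of course, depends on the criteria you use to identify the right tag.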
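As a rough illustration of that point, the sketch below selects a link by its attributes instead of relying on document order. The second table cell and the /tag/python link are hypothetical additions that are not in the question, and matching on href is only one possible criterion.

```python
from bs4 import BeautifulSoup

# Hypothetical page with two tag links; only the android one is from the question.
html = """
<table><tr>
<td class="tag"><a href="/tag/python" rel="tag">python</a></td>
<td class="tag"><a href="/tag/android" rel="tag">
    <img src="http://127.0.0.1/idf2.png"> android </a></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# Attribute-style access returns only the first <a> in the document.
first_link_text = soup.a.get_text(strip=True)

# find() with an attribute filter picks out one specific link.
android_link = soup.find("a", href="/tag/android")
android_text = android_link.get_text(strip=True)

# find_all() collects every link inside the cells with class "tag".
tag_texts = [
    a.get_text(strip=True)
    for cell in soup.find_all("td", class_="tag")
    for a in cell.find_all("a")
]

print(first_link_text)
print(android_text)
print(tag_texts)
```

Filtering on href or on the enclosing cell's class keeps the lookup stable if more links are added earlier in the page.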
