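I'm working with HTML elements that have child tags, and I want to "ignore" or strip those tags so that only the text remains. Right now, if I use .string on any element that contains a tag, all I get is None.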
import bs4

soup = bs4.BeautifulSoup("""
<div id="main">
<p>This is a paragraph.</p>
<p>This is a paragraph <span class="test">with a tag</span>.</p>
<p>This is another paragraph.</p>
</div>
""", "html.parser")

main = soup.find(id='main')
for child in main.children:
    # prints None for the paragraph that contains the <span>
    print(child.string)
Output:
This is a paragraph.
None
This is another paragraph.
I'd like the second line to be "This is a paragraph with a tag." How can I do this?
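For reference, here is a minimal sketch of one possible approach, assuming Beautiful Soup 4 and its get_text() method, which concatenates all text found beneath a tag, nested tags included. The names html and paragraphs are only illustrative, and iterating with find_all("p") instead of main.children is my own choice so that the whitespace between the paragraphs is skipped.

```python
import bs4

html = """
<div id="main">
<p>This is a paragraph.</p>
<p>This is a paragraph <span class="test">with a tag</span>.</p>
<p>This is another paragraph.</p>
</div>
"""

soup = bs4.BeautifulSoup(html, "html.parser")
main = soup.find(id="main")

# find_all("p") returns only the <p> tags, not the newline text nodes between them.
# get_text() joins every text fragment under each tag, including text inside <span>.
paragraphs = [p.get_text() for p in main.find_all("p")]

for text in paragraphs:
    print(text)
```

With the sample markup above, the second printed line should read "This is a paragraph with a tag." rather than None.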