
I want to remove a tag (one with a specific id) from an HTML page. For example:

<div id="id1" >
      "Contents here"
</div>

<div id="id2"> ...</div>

If I want to remove the first tag but keep the second, how should I do it?


1 Answer


Use BeautifulSoup: find the element by its id, then call extract() to detach it from the tree.

In [32]: from BeautifulSoup import BeautifulSoup

In [33]: doc = '''<div id="id1" >
      "Contents here"
</div>
<div id="id2"> ...</div>'''

In [34]: soup = BeautifulSoup(doc)

In [35]: id1 = soup.find('div', id='id1')

In [36]: print soup
<div id="id1">
      "Contents here"
</div>
<div id="id2"> ...</div>

In [37]: id1.extract()
Out[37]: 
<div id="id1">
      "Contents here"
</div>

In [38]: print soup

<div id="id2"> ...</div>
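
The session above uses Python 2 and the old BeautifulSoup 3 import. On Python 3, the same idea might look roughly like the sketch below. It assumes the `beautifulsoup4` package is installed (imported as `bs4`) and uses the built-in `html.parser` backend; `remove_by_id` is a hypothetical helper name chosen for illustration, not part of the library.

```python
from bs4 import BeautifulSoup  # assumption: the beautifulsoup4 package is installed


def remove_by_id(html, element_id):
    """Return the HTML with the element carrying the given id removed.

    remove_by_id is a hypothetical helper, not a BeautifulSoup function.
    """
    soup = BeautifulSoup(html, "html.parser")
    target = soup.find(id=element_id)
    if target is not None:
        # decompose() removes the tag and its contents from the tree.
        # extract() would also remove it, but returns the detached tag.
        target.decompose()
    return str(soup)


doc = '''<div id="id1" >
      "Contents here"
</div>
<div id="id2"> ...</div>'''

print(remove_by_id(doc, "id1"))
```

If the removed element is still needed afterwards, `extract()` is the better fit because it returns the detached tag; `decompose()` simply discards it.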
Answered 2012-12-16T09:57:45.270