python - Finding HTML tags with BeautifulSoup in Python

Question

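I want to find a specific tag in HTML code. For example, if there are 2 matching tags, how can I get the content of the second one rather than the first, which is what soup.find(id='contact1') returns? Here is the sample HTML code: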

<table align="center"><th id="contact">STUDENT ID</th><th id="contact">NAME</th><th id="contact">   Phone </th><th id="contact"> NO.</th>
<p align="center" style="display:compact; font-size:18px; font-family:Arial, Helvetica, sans-serif; color:#CC3300">
</p><tr>
<td id="contact1">
2011XXA4438F </td> <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td> <td id="contact1"> 9894398690 </td> <td id="contact1"> </td>
</tr>
</table>

What I want to do is extract '2011XXA4438F' as a string. How do I do that?

score 4 · Accepted Answer

<td id="contact1"> is the first tag with the id "contact1". To get it, soup.find is all you need:

>>> print(soup.find(id='contact1').text.strip())
2011XXA4438F

If you are looking for the other tags, you need to use find_all:

>>> print(soup.find_all(id='contact1'))
[<td id="contact1">
2011XXA4438F </td>, <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td>, <td id="contact1"> 9894398690 </td>, <td id="contact1"> </td>]
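Because find_all returns a list-like result, the second match can be picked by index. A minimal sketch, assuming the bs4 package is installed, using a trimmed copy of the question's markup and the built-in "html.parser" backend (the variable names are only illustrative):

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table row from the question.
html = """
<table><tr>
<td id="contact1">
2011XXA4438F </td> <td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td> <td id="contact1"> 9894398690 </td> <td id="contact1"> </td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all(id="contact1")

# Index 0 is the same element soup.find() returns; index 1 is the second match.
first = cells[0].get_text(strip=True)
second = cells[1].get_text(strip=True)

print(first)
print(second)
```

Indexing raises IndexError when there are fewer matches than expected, so checking len(cells) first is probably worthwhile on real pages.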

score 1 · Accepted Answer

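I'm pretty sure .find only gives you the first element that matches your query. Try using .findAll instead.

See the documentation here - http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html

Edit: I misread your post. Just to be thorough: do you always want to find the second occurrence of "id='contact1'"?

There may be something more elegant, but you could do something like this: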

v = soup.find_all(id='contact1')
length = 0
for x in v:
    length += 1
    if length == 2:  # set this number to the occurrence you want
        # x is now the second occurrence of id='contact1'
        print(x.text.strip())
        break

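The above is completely untested and was written directly here. I'm also just getting started with Python, so there may well be a more efficient way :-)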
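One possibly tidier variant of the counting loop is to let find_all stop early with its limit argument and then index the result. This is only a sketch: nth_match is a hypothetical helper name, not part of BeautifulSoup, and the markup is a shortened stand-in for the question's table.

```python
from bs4 import BeautifulSoup


def nth_match(soup, n, **filters):
    """Return the n-th (1-based) element matching the filters, or None.

    Hypothetical helper: it simply wraps soup.find_all(limit=n, ...).
    """
    matches = soup.find_all(limit=n, **filters)
    return matches[n - 1] if len(matches) >= n else None


# Shortened stand-in for the row in the question.
html = (
    '<table><tr>'
    '<td id="contact1"> 2011XXA4438F </td>'
    '<td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td>'
    '<td id="contact1"> 9894398690 </td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

second = nth_match(soup, 2, id="contact1")
missing = nth_match(soup, 5, id="contact1")  # fewer than 5 matches exist

print(second.get_text(strip=True))
print(missing)
```

Returning None for a missing occurrence mirrors what soup.find does when nothing matches, so callers can handle both cases the same way.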

score 0 · Accepted Answer

You can also do it like this:
target = soup.find("table", {"id": "contact1"})
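When a tag name is passed together with an attribute dictionary, both must match the same element. In the question's markup the id "contact1" sits on the td cells rather than on the table, so the tag name probably needs to be "td" there. A small sketch, assuming bs4 is installed and using a shortened version of the question's HTML:

```python
from bs4 import BeautifulSoup

# Shortened version of the question's markup: the id is on <td>, not <table>.
html = (
    '<table align="center"><tr>'
    '<td id="contact1"> 2011XXA4438F </td>'
    '<td id="contact1"> SAM SRINIVAS KRISHNAGOPAL</td>'
    '</tr></table>'
)
soup = BeautifulSoup(html, "html.parser")

# No <table> carries id="contact1", so this lookup finds nothing.
no_match = soup.find("table", {"id": "contact1"})

# Filtering on <td> returns the first cell with that id.
cell = soup.find("td", {"id": "contact1"})

print(no_match)
print(cell.get_text(strip=True))
```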

answered 2020-07-05T07:01:47.577
