Hi, this is my soup object:

<td class="kategorie">
 <div data-navi-cat="c5ff5b1d0dc93c">
  Herren
 </div>
 <div data-navi-cat="c5ff5b1d0dc95f">
  Frauen
 </div>
 <div data-navi-cat="c5ff5b1d0dc978">
  A-Jugend (U19)
 </div>
 <div data-navi-cat="c5ff5b1d0dc98c">
  B-Jugend (U17)
 </div>
 <div data-navi-cat="c5ff5b1d0dc9a2">
  C-Jugend (U15)
 </div>
 <div data-navi-cat="c5ff5b1d0dc9b1">
  U17-Juniorinnen
 </div>
 <div data-navi-cat="c5ff5b1d0dc9b6">
  Futsal
 </div>
 <div data-navi-cat="c5ff5b1d0dc9bd">
  eSport
 </div>
</td>

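How can I get all the c-codes and their corresponding text from this object? For example, the c-code "c5ff5b1d0dc93c" and its corresponding text "Herren" from the first row.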

My code looks like this (categories is the soup object):

for category in categories.find_all('div'):
    category = categories.find('div')
    print(category)

I only get the first row, repeated once per div:

<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
<div data-navi-cat="c5ff5b1d0dc93c">Herren</div>

1 Answer

What happens?

  • categories holds your HTML.
  • Inside your loop, category = categories.find('div') overwrites the loop variable. find('div') always returns the first occurrence, so category is always <div data-navi-cat="c5ff5b1d0dc93c">Herren</div>.
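
To see the difference in isolation, here is a minimal sketch using only the first two divs from the question. It assumes the built-in "html.parser" backend so that no extra parser needs to be installed; the variable names first, all_divs and repeated are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (first two categories only).
html = '''<td class="kategorie">
 <div data-navi-cat="c5ff5b1d0dc93c">Herren</div>
 <div data-navi-cat="c5ff5b1d0dc95f">Frauen</div>
</td>'''

# Assumption: "html.parser" is used here instead of "lxml".
categories = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag, no matter how often it is called.
first = categories.find('div')

# find_all() returns every matching tag.
all_divs = categories.find_all('div')

# Calling find() inside a loop over all_divs yields the first div each time.
repeated = [categories.find('div') for _ in all_divs]

print(first['data-navi-cat'])
print(len(all_divs))
print([tag['data-navi-cat'] for tag in repeated])
```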

Instead, work with the loop variable itself: use category = element.get_text() to get the text and code = element.get('data-navi-cat') to get the code.

Example

from bs4 import BeautifulSoup
html = '''<td class="kategorie">
 <div data-navi-cat="c5ff5b1d0dc93c">
  Herren
 </div>
 <div data-navi-cat="c5ff5b1d0dc95f">
  Frauen
 </div>
 <div data-navi-cat="c5ff5b1d0dc978">
  A-Jugend (U19)
 </div>
 <div data-navi-cat="c5ff5b1d0dc98c">
  B-Jugend (U17)
 </div>
 <div data-navi-cat="c5ff5b1d0dc9a2">
  C-Jugend (U15)
 </div>
 <div data-navi-cat="c5ff5b1d0dc9b1">
  U17-Juniorinnen
 </div>
 <div data-navi-cat="c5ff5b1d0dc9b6">
  Futsal
 </div>
 <div data-navi-cat="c5ff5b1d0dc9bd">
  eSport
 </div>
</td>'''

soup = BeautifulSoup(html, "lxml")
for element in soup.find_all('div'):
    category = element.get_text()
    code = element.get('data-navi-cat')
    print(category, code)

Output

  Herren
  c5ff5b1d0dc93c

  Frauen
  c5ff5b1d0dc95f

  A-Jugend (U19)
  c5ff5b1d0dc978

  B-Jugend (U17)
  c5ff5b1d0dc98c
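
The output above still contains the whitespace and line breaks around each label. If a lookup from code to text is more convenient, one possible variation is sketched below. It assumes the built-in "html.parser" backend, passes strip=True to get_text() to trim the whitespace, and restricts the search to divs that actually carry a data-navi-cat attribute. The name codes and the shortened HTML are only illustrative.

```python
from bs4 import BeautifulSoup

# Shortened copy of the question's markup (first two categories only).
html = '''<td class="kategorie">
 <div data-navi-cat="c5ff5b1d0dc93c">
  Herren
 </div>
 <div data-navi-cat="c5ff5b1d0dc95f">
  Frauen
 </div>
</td>'''

# Assumption: "html.parser" is used here instead of "lxml".
soup = BeautifulSoup(html, "html.parser")

# Map each data-navi-cat code to its stripped text.
# attrs={'data-navi-cat': True} matches only divs that have the attribute.
codes = {
    element['data-navi-cat']: element.get_text(strip=True)
    for element in soup.find_all('div', attrs={'data-navi-cat': True})
}

for code, category in codes.items():
    print(code, category)
```

With the full HTML from the question, the same comprehension would produce one entry per category.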
answered 2021-01-06T15:39:33.900