I am trying to add the text of each table row to my list, but only if the row contains text. I want to do this with a list comprehension.

This is what I tried:

 listt2 = [s.span.text for s in soup.find_all('tr') if s.span.text]

This is the error:

    listt2 = [s.span.text for s in soup.find_all('tr') if s.span.text]
AttributeError: 'NoneType' object has no attribute 'text'

Here is one "tr" that contains a span tag:

<tr>
    <td colspan="2" class="cell--section-end cell--link cell--link__icon">
        <a data-analytics="[Competitions] - German Bundesliga" href="/football/german-bundesliga/event/26301018" class="cell--link__link  cell-text">
            <i class="i accordion__title-icon--green accordion__title-icon--right" data-char="&quot;></i>            <b class="cell-text__line cell-text__line--icon">
                <span class="competitions-team-name  js-ev-desc">1. FC Köln v 1899 Hoffenheim</span>
            </b>

                    </a>
    </td>

<tr>                      

Here is another one that does not:

<tr>
        <td colspan="5" class="group-header">
            Sat 14:30        </td>
    </tr>

Note that there are many more tr tags on this page.

2 Answers

If you only want the <tr> tags that contain a <span> tag, you can use this list comprehension:

listt2 = [s.span.text for s in soup.select('tr:has(span)') if s.span.text]

Edit:

from bs4 import BeautifulSoup

html_doc = '''<tr>
    <td colspan="2" class="cell--section-end cell--link cell--link__icon">
        <a data-analytics="[Competitions] - German Bundesliga" href="/football/german-bundesliga/event/26301018" class="cell--link__link  cell-text">
            <i class="i accordion__title-icon--green accordion__title-icon--right" data-char="&quot;></i>            <b class="cell-text__line cell-text__line--icon">
                <span class="competitions-team-name  js-ev-desc">1. FC Köln v 1899 Hoffenheim</span>
            </b>

                    </a>
    </td>

<tr>'''

soup = BeautifulSoup(html_doc, 'html.parser')

listt2 = [s.span.text for s in soup.select('tr:has(span)') if s.span.text]

print(listt2)

Prints:

['1. FC Köln v 1899 Hoffenheim']
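
The snippet above only contains a row that has a span. As a minimal sketch of how the selector treats a row without one, the example below uses a made-up two-row table modelled on the rows in the question (the markup and the names `matched` and `names` are illustrative assumptions, not taken from the real page). It assumes a Beautiful Soup 4 install whose `select()` supports the `:has()` pseudo-class.

```python
from bs4 import BeautifulSoup

# Hypothetical table: one row with a <span>, one header row without.
html_doc = """
<table>
  <tr><td><a href="#"><b><span>1. FC Köln v 1899 Hoffenheim</span></b></a></td></tr>
  <tr><td colspan="5" class="group-header">Sat 14:30</td></tr>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")

all_rows = soup.find_all("tr")          # both rows
matched = soup.select("tr:has(span)")   # only rows with a <span> descendant

# Every matched row has a span, so .span is never None here.
names = [tr.span.text for tr in matched if tr.span.text]

print(len(all_rows), len(matched))
print(names)
```

Because the filtering happens in the selector, the comprehension never sees the header row at all.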
Answered 2020-09-17T18:53:10.987

You just need to check that s.span is not None before accessing s.span.text.

listt2 = [s.span.text for s in soup.find_all('tr') if s.span is not None and s.span.text]

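Because of short-circuit evaluation, s.span.text is never evaluated when s.span is None: False and <anything> is always False, so Python stops at the first operand.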
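
To see the short-circuit guard in action, here is a small self-contained sketch. The two-row table is a simplified stand-in for the rows in the question (the markup and the names `rows` and `guarded` are assumptions made for illustration).

```python
from bs4 import BeautifulSoup

# Hypothetical table modelled on the question: the second row has no <span>.
html_doc = """
<table>
  <tr><td><a href="#"><b><span>1. FC Köln v 1899 Hoffenheim</span></b></a></td></tr>
  <tr><td colspan="5" class="group-header">Sat 14:30</td></tr>
</table>
"""

soup = BeautifulSoup(html_doc, "html.parser")
rows = soup.find_all("tr")

# For the second row s.span is None, so the left operand is False and
# s.span.text on the right of `and` is never evaluated.
guarded = [s.span.text for s in rows if s.span is not None and s.span.text]

print(guarded)
```

Without the `s.span is not None` check, the second row would raise the same AttributeError shown in the question.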

Answered 2020-09-17T19:05:21.420