python - bs4 list comprehension with an 'if' statement

Question

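I am trying to add the text of each table row to my list, but only if the row contains text. I would like to do this with a list comprehension.

This is what I tried: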

 listt2 = [s.span.text for s in soup.find_all('tr') if s.span.text]

This is the error:

    listt2 = [s.span.text for s in soup.find_all('tr') if s.span.text]
AttributeError: 'NoneType' object has no attribute 'text'

Here is one 'tr' that contains a span tag:

<tr>
    <td colspan="2" class="cell--section-end cell--link cell--link__icon">
        <a data-analytics="[Competitions] - German Bundesliga" href="/football/german-bundesliga/event/26301018" class="cell--link__link  cell-text">
            <i class="i accordion__title-icon--green accordion__title-icon--right" data-char="&quot;></i>            <b class="cell-text__line cell-text__line--icon">
                <span class="competitions-team-name  js-ev-desc">1. FC Köln v 1899 Hoffenheim</span>
            </b>

                    </a>
    </td>

<tr>

Here is another one that does not:

<tr>
        <td colspan="5" class="group-header">
            Sat 14:30        </td>
    </tr>

Note that there are many more tr tags on this page.

score 1 · Accepted Answer

If you only want the <tr> tags that contain a <span> tag, you can use this list comprehension:

listt2 = [s.span.text for s in soup.select('tr:has(span)') if s.span.text]

Edit:

from bs4 import BeautifulSoup

html_doc = '''<tr>
    <td colspan="2" class="cell--section-end cell--link cell--link__icon">
        <a data-analytics="[Competitions] - German Bundesliga" href="/football/german-bundesliga/event/26301018" class="cell--link__link  cell-text">
            <i class="i accordion__title-icon--green accordion__title-icon--right" data-char="&quot;></i>            <b class="cell-text__line cell-text__line--icon">
                <span class="competitions-team-name  js-ev-desc">1. FC Köln v 1899 Hoffenheim</span>
            </b>

                    </a>
    </td>

<tr>'''

soup = BeautifulSoup(html_doc, 'html.parser')

listt2 = [s.span.text for s in soup.select('tr:has(span)') if s.span.text]

print(listt2)

Prints:

['1. FC Köln v 1899 Hoffenheim']
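
To see that the selector really skips rows without a <span>, here is a minimal sketch that runs both kinds of row from the question through the same comprehension. The trimmed-down markup and the names `rows_html` and `matches` are assumptions made for illustration. The `:has()` selector relies on the soupsieve package, which Beautiful Soup uses for `select()` from version 4.7 onward.

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the page (assumption): one row with a <span>,
# one header row without.
rows_html = '''
<table>
  <tr><td><b><span class="competitions-team-name">1. FC Köln v 1899 Hoffenheim</span></b></td></tr>
  <tr><td colspan="5" class="group-header">Sat 14:30</td></tr>
</table>
'''

soup = BeautifulSoup(rows_html, 'html.parser')

# 'tr:has(span)' matches only rows that have a <span> descendant,
# so s.span is never None inside the comprehension.
matches = [s.span.text for s in soup.select('tr:has(span)') if s.span.text]

print(matches)
```

The header row never reaches the `if` clause, so the AttributeError from the question cannot occur.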

score 1 · Accepted Answer

You just need to check that span is not None before looking up span.text.

listt2 = [s.span.text for s in soup.find_all('tr') if s.span is not None and s.span.text]

Because of short-circuit evaluation, s.span.text is never evaluated when s.span is None: the first operand is False, and False and <anything> is always False.
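
The guard can be checked on a small sample. This is only a sketch: the simplified markup, the third row with an empty <span>, and the names `rows_html`, `keep` and `guarded` are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

# Simplified stand-in rows (assumption): a row with text in a <span>,
# a row with no <span> at all, and a row whose <span> is empty.
rows_html = '''
<table>
  <tr><td><span>1. FC Köln v 1899 Hoffenheim</span></td></tr>
  <tr><td class="group-header">Sat 14:30</td></tr>
  <tr><td><span></span></td></tr>
</table>
'''

soup = BeautifulSoup(rows_html, 'html.parser')

for row in soup.find_all('tr'):
    # When row.span is None the left operand is False, so Python skips
    # the right operand and row.span.text is never looked up.
    keep = row.span is not None and row.span.text
    print(repr(keep))

# Same condition inside the list comprehension from the answer.
guarded = [s.span.text for s in soup.find_all('tr')
           if s.span is not None and s.span.text]

print(guarded)
```

The second half of the condition still matters: it drops rows whose <span> exists but holds an empty string.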
