python - BeautifulSoup raises a NoneType error

Question

I am using BeautifulSoup to extract the text inside <p> tags from HTML data, with the following code:

for i in data:    
    soup = BeautifulSoup(i, 'html')
    print(' '.join(map(lambda e: e.string, soup.find_all('p'))))

Here data is a list in which each element is a string containing HTML. My problem is that this works for some samples, but for others it raises

TypeError: sequence item 1: expected string or Unicode, NoneType found

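on the second line of the code above. Can anyone explain why this happens? Or is there another way to check for and skip the samples that would trigger this error?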

score 1 · Accepted Answer

Try selecting only the p tags that contain some text:

' '.join(el.string for el in soup.find_all('p', text=True))
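As for why the error happens: a tag's .string is None when the tag has no children or more than one child (for example a <p> that mixes plain text with a <b>), and str.join refuses None items. The following is a minimal sketch of that behaviour and of two ways around it, not a definitive fix. The sample markup and the variable names (html, strings, skipped, full) are made up for illustration, and the built-in "html.parser" is assumed so that no extra parser is needed.

```python
from bs4 import BeautifulSoup

# Made-up sample: one simple <p>, one <p> with mixed children, one empty <p>.
html = "<p>plain</p><p>Hello <b>world</b></p><p></p>"
soup = BeautifulSoup(html, "html.parser")
paragraphs = soup.find_all("p")

# .string is None for the second and third <p>, which is what breaks join().
strings = [p.string for p in paragraphs]

# Option 1: skip paragraphs whose .string is None.
skipped = " ".join(p.string for p in paragraphs if p.string is not None)

# Option 2: get_text() concatenates all nested text and never returns None;
# empty results are dropped so join() does not add stray spaces.
texts = (p.get_text() for p in paragraphs)
full = " ".join(t for t in texts if t)

print(strings)
print(skipped)
print(full)
```

Option 1 silently drops the "Hello world" paragraph, while option 2 keeps its text, so which one fits depends on whether paragraphs with nested markup should be included.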
