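Dear Stack Overflow community,
This is a follow-up to a previous question I posted here.
I want to extract newspaper article URLs from multiple sources into a single list using the newspaper library. This works for one source, but as soon as I add a second source link, only the URLs from the second source are extracted.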
import feedparser as fp
import newspaper
from newspaper import Article

website = {
    "cnn": {"link": "edition.cnn.com", "rss": "rss.cnn.com/rss/cnn_topstories.rss"},
    "cnbc": {"link": "cnbc.com", "rss": "cnbc.com/id/10000664/device/rss/rss.html"},
}

for source, value in website.items():
    if 'rss' in value:
        d = fp.parse(value['rss'])
        # if there is an RSS value for a company, it will be extracted into d
        article_list = []
        for entry in d.entries:
            if hasattr(entry, 'published'):
                article = {}
                article['link'] = entry.link
                article_list.append(article['link'])
                print(article['link'])
The output is as follows; only the links from the second source are appended:
['https://www.cnbc.com/2019/10/23/why-china-isnt-cutting-lending-rates-like-the-rest-of-the-world.html', 'https://www.cnbc.com/2019/10/22/stocks-making-the-biggest-moves-after-hours-snap-texas-instruments-chipotle-and-more.html', ...]
I would like the URLs from both sources to end up in the list. Does anyone know a solution to this problem? Thanks a lot in advance!
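
One possible explanation, offered as a sketch rather than a confirmed diagnosis: if `article_list = []` runs inside the `for source, value` loop, the list is recreated for every source, so only the links of the last source survive. The snippet below creates the list once, before the loop. The names `collect_links`, `fake_parse` and `FAKE_FEEDS`, as well as the sample feeds, are hypothetical and only there so the example runs offline without feedparser.

```python
from types import SimpleNamespace


def collect_links(website, parse):
    """Gather the links of all published entries from every source's RSS feed.

    `parse` is any callable that behaves like feedparser.parse, i.e. it
    returns an object with an `entries` attribute.
    """
    article_list = []  # created once, before the loop, so it is never reset
    for source, value in website.items():
        if 'rss' not in value:
            continue  # nothing to parse for this source
        d = parse(value['rss'])
        for entry in d.entries:
            if hasattr(entry, 'published'):
                article_list.append(entry.link)
    return article_list


# Hypothetical stand-in for feedparser.parse so the sketch runs offline.
FAKE_FEEDS = {
    "feed-a": [SimpleNamespace(link="https://a.example/1", published="2019-10-23")],
    "feed-b": [
        SimpleNamespace(link="https://b.example/1", published="2019-10-22"),
        SimpleNamespace(link="https://b.example/2"),  # no 'published', skipped
    ],
}


def fake_parse(url):
    return SimpleNamespace(entries=FAKE_FEEDS.get(url, []))


sample = {
    "a": {"link": "a.example", "rss": "feed-a"},
    "b": {"link": "b.example", "rss": "feed-b"},
    "c": {"link": "c.example"},  # no 'rss' key, skipped
}
links = collect_links(sample, fake_parse)
print(links)
```

With the real library, the call would presumably be `collect_links(website, fp.parse)`, assuming the feed URLs in `website` include their scheme so that feedparser can fetch them.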