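I'm using Jupyter Notebook and running into a problem with newspaper: I can't scrape anything from Newsweek. I can get it to work with Goose, but I'd like to have a backup in case Goose fails.
I've tried other sites, such as Fox, Yahoo, and CNN, and they all work fine, so Newsweek appears to be an isolated problem.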
from newspaper import Article

url = 'https://www.newsweek.com/mike-huckabee-blasts-cnns-axelrod-calling-daughter-trump-press-secretary-sarah-sanders-1444184'
article = Article(url)
article.download()  # fetch the page
article.html        # raw HTML of the downloaded page
article.parse()     # extract title, text, etc. from the HTML
article.text        # extracted article text
Article `download()` failed with 403 Client Error: Forbidden for url: https://www.newsweek.com/mike-huckabee-blasts-cnns-axelrod-calling-daughter-trump-press-secretary-sarah-sanders-1444184 on URL https://www.newsweek.com/mike-huckabee-blasts-cnns-axelrod-calling-daughter-trump-press-secretary-sarah-sanders-1444184
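
For the "backup in case Goose fails" arrangement I'm after, here is a minimal sketch of what I have in mind. It does not fix the 403 itself; it only shows how one extractor could take over when another raises an error or returns nothing. The names `extract_with_fallback`, `failing_extractor`, and `working_extractor` are hypothetical, and the two stand-in functions only simulate a failed and a successful scrape so the example runs without network access.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: any function that takes a URL and returns article text.
Extractor = Callable[[str], str]


def extract_with_fallback(
    url: str, extractors: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, str]:
    """Try each (name, extractor) pair in order.

    Returns (name, text) for the first extractor that yields non-empty text.
    Raises RuntimeError if every extractor fails or returns empty text.
    """
    errors = []
    for name, extract in extractors:
        try:
            text = extract(url)
        except Exception as exc:  # e.g. a 403 surfaced by the scraping library
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty text")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))


# Stand-ins for real scrapers (assumptions for illustration only).
def failing_extractor(url: str) -> str:
    raise IOError("403 Client Error: Forbidden")


def working_extractor(url: str) -> str:
    return "article body for " + url
```

In the notebook, the stand-ins would be replaced by small wrappers: one that runs the Goose extraction and returns its text, and one that runs the `Article(url)`, `download()`, `parse()` sequence above and returns `article.text`.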