在 Firefox 中浏览时,我可以看到http://www.chicagotribune.com/ct-florida-school-shooter-nikolas-cruz-20180217-story.html 。但是,newspaper3k
给了我这个错误:
Article download() failed with HTTPSConnectionPool(host='www.chicagotribune.com', port=443): Read timed out. (read timeout=7) on URL http://www.chicagotribune.com/ct-florida-school-shooter-nikolas-cruz-20180217-story.html
我的代码是:
from newspaper import Article
from newspaper import Config
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Firefox/78.0'
config = Config()
config.browser_user_agent = user_agent
url = "https://www.chicagotribune.com/nation-world/ct-florida-school-shooter-nikolas-cruz-20180217-story.html"
page = Article(url, config=config)
page.download()
page.parse()
print(page.text)
我认为像“renewIPAddress()”这样的东西可能会有所帮助,但我不确定如何将它准确地放入这段代码中。https://stackoverflow.com/a/50496768/2414957