I downloaded the source code from here. I am trying to run the example from Chapter 4 of "Programming Collective Intelligence" by Toby Segaran. My Python version is 2.7.2. I typed this code into the interpreter:
import searchengine
pages=['http://en.wikipedia.org/wiki/Programming_language']
crawler = searchengine.crawler('searchindex.db')
crawler.crawl(pages)
and got this message:
Could not open http://en.wikipedia.org/wiki/Programming_language
or sometimes I get these messages instead:
Indexing http://en.wikipedia.org/wiki/Programming_language
Could not parse page http://en.wikipedia.org/wiki/Programming_language
Either way, the crawler never indexes the page. What exactly am I doing wrong?
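The two messages above do not say what actually failed, so a small diagnostic might help narrow it down. The following is only a sketch, not part of the book's searchengine module: try_open is a hypothetical helper name, and it assumes the crawler fetches pages with urllib2.urlopen (with a fallback import so the same snippet also runs on Python 3). It returns the full traceback instead of a generic message.

```python
# Diagnostic sketch only; try_open is a hypothetical helper,
# not a function from the book's searchengine module.
import traceback

try:
    from urllib2 import urlopen          # Python 2.7, as in the question
except ImportError:
    from urllib.request import urlopen   # fallback for Python 3


def try_open(url, opener=urlopen):
    """Fetch url and return (ok, detail).

    On failure, detail is the full traceback text, so the underlying
    exception is visible rather than a generic "Could not open" line.
    The opener parameter exists only so the fetch can be swapped out.
    """
    try:
        response = opener(url)
        data = response.read()
        return True, "read %d bytes" % len(data)
    except Exception:
        return False, traceback.format_exc()
```

Calling try_open(pages[0]) in the same interpreter session and printing the second element of the result should show which exception is raised when the page is opened, which is more to go on than the messages above.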