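I'm trying out http://robobrowser.readthedocs.org/en/latest/readme.html, a new Python library built on Beautiful Soup and Requests. I'm currently using it to open a series of pages and save the responses to a list for later parsing. In my debugger the list looks like this: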
pages = [<Response [200]>, <Response [200]> ....]
I build this list by having a RoboBrowser object loop through some pages and save each response:
while pageRight:
    browser.follow_link(pageRight[0])
    browser
    page = browser.response
    pages.append(page)
    pageRight = browser.select(".pageright")
The part above seems to work fine, but when I try:
ag = "myagent"
browser = RoboBrowser(user_agent=ag)
for page in pages:
    browser.open(page.content)
    for listing in browser.select('.listingInfo'):  # a list
        pl = getParsedListing(listing)
        listings.append(pl)
Running this in my Django index view raises an error:
InvalidSchema at /index/
No connection adapters were found for..
Here is the traceback:
Traceback:
File "C:\envs\r1\lib\site-packages\django\core\handlers\base.py" in get_response
114. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "C:\envs\r1\lib\site-packages\django\views\decorators\csrf.py" in wrapped_view
57. return view_func(*args, **kwargs)
File "C:\envs\r1\masslist\ml1\views.py" in index
29. Sites = getSitesInArea(Area)
File "C:\envs\r1\masslist\ml1\views.py" in getSitesInArea
91. browser.open(page.content)
File "C:\envs\r1\lib\site-packages\robobrowser\browser.py" in open
200. verify=verify if verify is not None else self.verify,
File "C:\envs\r1\lib\site-packages\requests\sessions.py" in get
468. return self.request('GET', url, **kwargs)
File "C:\envs\r1\lib\site-packages\requests\sessions.py" in request
456. resp = self.send(prep, **send_kwargs)
File "C:\envs\r1\lib\site-packages\requests\sessions.py" in send
553. adapter = self.get_adapter(url=request.url)
File "C:\envs\r1\lib\site-packages\requests\sessions.py" in get_adapter
608. raise InvalidSchema("No connection adapters were found for '%s'" % url)
Exception Type: InvalidSchema at /index/
Exception Value: No connection adapters were found for '
What am I doing wrong?
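One possible reading of the traceback, sketched below and not verified against the library source: `browser.open` hands its argument to `requests` as a URL, and `get_adapter` then looks for a mounted prefix that the string starts with. `page.content` holds the page body, not an address, so nothing would match. The function `find_adapter_prefix`, the `ADAPTER_PREFIXES` tuple, and the sample strings are hypothetical stand-ins for illustration; the sketch assumes the default mounts are `http://` and `https://`.

```python
# Hypothetical stand-in for the prefix lookup that the traceback shows
# failing in requests' Session.get_adapter; this is not the library's code.
ADAPTER_PREFIXES = ("http://", "https://")  # assumed default mounts


def find_adapter_prefix(url):
    """Return the mounted prefix that url starts with, or raise ValueError."""
    for prefix in ADAPTER_PREFIXES:
        if url.lower().startswith(prefix):
            return prefix
    raise ValueError("No connection adapters were found for %r" % url[:30])


# A page body (the kind of value page.content holds) matches no prefix ...
html_body = "<html><body><div class='listingInfo'>A</div></body></html>"
try:
    find_adapter_prefix(html_body)
    body_accepted = True
except ValueError:
    body_accepted = False

# ... while an address (made up here for illustration) does.
url_prefix = find_adapter_prefix("http://example.com/listings?page=2")
```

If that reading is right, two directions seem plausible: pass an address to `browser.open` (a `requests` response exposes one as `page.url`), or skip the second request entirely and parse the saved `page.content` directly with Beautiful Soup.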