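I need to fill in a login form on a web page that requires cookies and get some information about the resulting page. Since this has to be done at very odd hours of the night, I'd like to automate the process, so I'm using mechanize (any other suggestions are welcome, but note that I have to run my script on my school's servers, where I can't install new software. Mechanize is pure Python, so I was able to get around that).
The problem is that the page hosting the login form requires me to accept and send cookies. Ideally, I'd like to accept and send back whatever cookies the server sends me, rather than hard-coding my own.
So I started writing my script with mechanize, but I seem to be handling cookies wrong. Since I can't find helpful documentation anywhere (please point it out if I'm blind), I'm asking here. Here is my script, followed by its output: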
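import mechanize as mech

br = mech.Browser()
br.set_handle_robots(False)  # assumed from the "No Robots" message: ignore robots.txt
print "No Robots"
br.open("some internal uOttawa website")
br.select_form(nr=0)  # assumption: the login form is the first form on the page
br.form['j_username'] = 'my username'
print "Login: ************"
br.form['j_password'] = 'my password'
print "Password: ************"
response = br.submit()  # submit the login form
print response.read()   # dump the HTML of the resulting page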
No Robots
Login: ************
Password: ************
<img src="/idp/images/uottawa-logo-dark.png" />
An error occurred while processing your request. Please contact your helpdesk or
user ID office for assistance.
This service requires cookies. Please ensure that they are enabled and try your
going back to your desired resource and trying to login again.
Use of your browser's back button may cause specific errors that can be resolved by
going back to your desired resource and trying to login again.
If you think you were sent here in error,
please contact technical support
This is indeed the page I get if I disable cookies in Chrome and try the same thing.
I tried adding a cookie jar as follows, but with no luck.
import cookielib
br = mech.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)  # attach the jar to the browser
The mechanize documentation has this note on cookies:
A common mistake is to use mechanize.urlopen(), and the .extract_cookies() and
.add_cookie_header() methods on a cookie object themselves.
If you use mechanize.urlopen() (or OpenerDirector.open()),
the module handles extraction and adding of cookies by itself,
so you should not call .extract_cookies() or .add_cookie_header().
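For reference, here is a minimal sketch of the behaviour that note describes, using only the standard library rather than mechanize (cookielib/urllib2 on Python 2, http.cookiejar/urllib.request on Python 3). The local test server, its `session=abc123` cookie, and the `fetch_twice` helper are all made up for illustration and have nothing to do with the real login page. The point is only that an opener with a cookie jar attached should store and resend cookies without any manual .extract_cookies() or .add_cookie_header() calls.

```python
import threading

try:  # Python 3 module names
    import http.cookiejar as cookielib
    import urllib.request as urllib2
    from http.server import BaseHTTPRequestHandler, HTTPServer
except ImportError:  # Python 2 module names
    import cookielib
    import urllib2
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer


class CookieCheckHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: sets a cookie and reports whether it came back."""

    def do_GET(self):
        sent_back = self.headers.get("Cookie", "")
        body = b"cookie received" if "session=abc123" in sent_back else b"no cookie"
        self.send_response(200)
        self.send_header("Set-Cookie", "session=abc123; Path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def fetch_twice():
    """Request the same page twice through an opener that owns a cookie jar."""
    server = HTTPServer(("127.0.0.1", 0), CookieCheckHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        cj = cookielib.CookieJar()
        opener = urllib2.build_opener(
            urllib2.ProxyHandler({}),        # no proxy for the local demo server
            urllib2.HTTPCookieProcessor(cj), # extracts and adds cookies automatically
        )
        first = opener.open(url).read()   # server has not seen the cookie yet
        second = opener.open(url).read()  # the jar resends the cookie on its own
        names = [cookie.name for cookie in cj]
        return first, second, names
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_twice())
```

If the jar is wired up correctly, the first response should be "no cookie" and the second "cookie received", and the jar should contain a single cookie named `session`. As far as I understand it, a mechanize Browser is meant to do the same thing internally.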
I'd appreciate any help. This is confusing, and the documentation seems seriously lacking.