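I'm trying to write a program with Python 2.7.5 and the mechanize library that logs me in to my Microsoft account on bing.com. As a first step, I wrote this program to print the names of the forms on that page so that I can refer to them in later code. My current code is below (sorry about the long URL):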

import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [('User-agent','Firefox')]

br.open("https://login.live.com/ppsecure/post.srf?wa=wsignin1.0&rpsnv=11&ct=1375231095&rver=6.0.5286.0&wp=MBI&wreply=http:<%2F%2Fwww.bing.com%2FPassport.aspx%3Frequrl%3Dhttp%253a%252f%252fwww.bing.com%252f&lc=1033&id=264960&bk=1375231423")
print br.title()  # title is a method on mechanize.Browser, so it must be called

forms_printed = 0
for form in br.forms():
    print form
    forms_printed += 1
if forms_printed == 0:
    print "No forms to print!"

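Even though I can see the username and password fields when I visit the page in Firefox, running this code always prints "No forms to print!". Am I using mechanize incorrectly, or is the site deliberately preventing me from finding these forms? Any tips or suggestions are greatly appreciated.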

1 Answer

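If you read the HTML you actually receive, you will see that the page requires JavaScript. mechanize does not execute JavaScript, so a login form that the page builds with scripts never shows up in br.forms().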

Example:

import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)
br.addheaders = [('User-agent','Firefox')]

page = br.open("https://login.live.com/ppsecure/post.srf?wa=wsignin1.0&rpsnv=11&ct=1375231095&rver=6.0.5286.0&wp=MBI&wreply=http:<%2F%2Fwww.bing.com%2FPassport.aspx%3Frequrl%3Dhttp%253a%252f%252fwww.bing.com%252f&lc=1033&id=264960&bk=1375231423")
print page.read()
print br.title()  # title is a method on mechanize.Browser, so it must be called

forms_printed = 0
for form in br.forms():
    print form
    forms_printed += 1
if forms_printed == 0:
    print "No forms to print!"

Output:

Microsoft account

JavaScript required to sign in
Microsoft account requires JavaScript to sign in. This web browser either does not support JavaScript, or scripts are being blocked.

To find out whether your browser supports JavaScript, or to allow scripts, see the browser's online help.
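
To check the same thing programmatically rather than by eye, the returned HTML can be scanned for form tags, script tags, and the "JavaScript required" notice. The following is only a minimal sketch using the standard-library HTML parser. The names PageInspector and inspect_html and the sample markup are made up for illustration; they are not part of mechanize and not the real Microsoft page.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7


class PageInspector(HTMLParser):
    """Hypothetical helper: counts <form>/<script> tags and collects text."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.form_count = 0
        self.script_count = 0
        self.text_parts = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.form_count += 1
        elif tag == "script":
            self.script_count += 1
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Skip script source; keep everything else, including <noscript> text.
        if not self._in_script:
            self.text_parts.append(data)


def inspect_html(html):
    """Summarise static forms, scripts, and any JavaScript-required notice."""
    inspector = PageInspector()
    inspector.feed(html)
    text = " ".join(inspector.text_parts).lower()
    return {
        "forms": inspector.form_count,
        "scripts": inspector.script_count,
        "mentions_javascript_required": "javascript required" in text,
    }


# Illustrative stand-in for a page that builds its login form in JavaScript.
sample = """<html><head><title>Microsoft account</title>
<script>/* builds the login form at runtime */</script></head>
<body><noscript>JavaScript required to sign in</noscript></body></html>"""

report = inspect_html(sample)
print(report)
```

With mechanize, the string passed in would be the result of page.read() from the example above. Zero forms alongside script tags and a JavaScript notice suggests that the form is generated client-side, which mechanize cannot see.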

Answered 2013-08-04T09:52:44.870