97

我该怎么做?我试图输入一些指定的链接(使用 urllib),但要做到这一点,我需要登录。

我有这个网站的来源:

<form id="login-form" action="auth/login" method="post">
    <div>
    <!--label for="rememberme">Remember me</label><input type="checkbox" class="remember" checked="checked" name="remember me" /-->
    <label for="email" id="email-label" class="no-js">Email</label>
    <input id="email-email" type="text" name="handle" value="" autocomplete="off" />
    <label for="combination" id="combo-label" class="no-js">Combination</label>
    <input id="password-clear" type="text" value="Combination" autocomplete="off" />
    <input id="password-password" type="password" name="password" value="" autocomplete="off" />
    <input id="sumbitLogin" class="signin" type="submit" value="Sign In" />

这可能吗?

4

7 回答 7

76
于 2010-05-26T05:38:59.470 回答
56

Let me try to make it simple, suppose URL of the site is www.example.com and you need to sign up by filling username and password, so we go to the login page say http://www.example.com/login.php now and view it's source code and search for the action URL it will be in form tag something like

 <form name="loginform" method="post" action="userinfo.php">

now take userinfo.php to make absolute URL which will be 'http://example.com/userinfo.php', now run a simple python script

import requests
url = 'http://example.com/userinfo.php'
values = {'username': 'user',
          'password': 'pass'}

r = requests.post(url, data=values)
print r.content

I Hope that this helps someone somewhere someday.

于 2015-02-20T12:01:34.547 回答
30

Typically you'll need cookies to log into a site, which means cookielib, urllib and urllib2. Here's a class which I wrote back when I was playing Facebook web games:

import cookielib
import urllib
import urllib2

# set these to whatever your fb account is
fb_username = "your@facebook.login"
fb_password = "secretpassword"

class WebGamePlayer(object):

    def __init__(self, login, password):
        """ Start up... """
        self.login = login
        self.password = password

        self.cj = cookielib.CookieJar()
        self.opener = urllib2.build_opener(
            urllib2.HTTPRedirectHandler(),
            urllib2.HTTPHandler(debuglevel=0),
            urllib2.HTTPSHandler(debuglevel=0),
            urllib2.HTTPCookieProcessor(self.cj)
        )
        self.opener.addheaders = [
            ('User-agent', ('Mozilla/4.0 (compatible; MSIE 6.0; '
                           'Windows NT 5.2; .NET CLR 1.1.4322)'))
        ]

        # need this twice - once to set cookies, once to log in...
        self.loginToFacebook()
        self.loginToFacebook()

    def loginToFacebook(self):
        """
        Handle login. This should populate our cookie jar.
        """
        login_data = urllib.urlencode({
            'email' : self.login,
            'pass' : self.password,
        })
        response = self.opener.open("https://login.facebook.com/login.php", login_data)
        return ''.join(response.readlines())

You won't necessarily need the HTTPS or Redirect handlers, but they don't hurt, and it makes the opener much more robust. You also might not need cookies, but it's hard to tell just from the form that you've posted. I suspect that you might, purely from the 'Remember me' input that's been commented out.

于 2010-05-26T06:19:05.533 回答
21

Web page automation ? Definitely "webbot"

webbot even works web pages which have dynamically changing id and classnames and has more methods and features than selenium or mechanize.

Here's a snippet :)

from webbot import Browser 
web = Browser()
web.go_to('google.com') 
web.click('Sign in')
web.type('mymail@gmail.com' , into='Email')
web.click('NEXT' , tag='span')
web.type('mypassword' , into='Password' , id='passwordFieldId') # specific selection
web.click('NEXT' , tag='span') # you are logged in ^_^

The docs are also pretty straight forward and simple to use : https://webbot.readthedocs.io

于 2018-07-04T09:22:32.967 回答
19
import cookielib
import urllib
import urllib2

url = 'http://www.someserver.com/auth/login'
values = {'email-email' : 'john@example.com',
          'password-clear' : 'Combination',
          'password-password' : 'mypassword' }

data = urllib.urlencode(values)
cookies = cookielib.CookieJar()

opener = urllib2.build_opener(
    urllib2.HTTPRedirectHandler(),
    urllib2.HTTPHandler(debuglevel=0),
    urllib2.HTTPSHandler(debuglevel=0),
    urllib2.HTTPCookieProcessor(cookies))

response = opener.open(url, data)
the_page = response.read()
http_headers = response.info()
# The login cookies should be contained in the cookies variable

For more information visit: https://docs.python.org/2/library/urllib2.html

于 2010-05-26T06:18:24.723 回答
8

一般来说,网站可以通过许多不同的方式检查授权,但您所针对的网站似乎使您相当容易。

您所需要的只是向POSTURLauth/login提供一个表单编码的 blob,其中包含您在那里看到的各种字段(忘记标签for,它们是人类访问者的装饰)。 handle=whatever&password-clear=pwd依此类推,只要您知道句柄(AKA 电子邮件)和密码的值就可以了。

大概 POST 会将您重定向到一些“您已成功登录”页面,其中包含Set-Cookie验证您的会话的标题(请务必保存该 cookie 并将其发送回会话中的进一步交互!)。

于 2010-05-26T05:27:43.297 回答
5

For HTTP things, the current choice should be: Requests- HTTP for Humans

于 2013-12-15T02:53:09.873 回答