I am trying to log in to the Twitch.tv website with Python. Even though I supply all the parameters, it still won't let me log in. Here is the code:

import requests
from bs4 import BeautifulSoup
from time import sleep

# #user[login]:volatil3_
# user[password]:thisispassword
#https://secure.twitch.tv/user/login
# <a href="#" class="header_nick button drop" id="user_display_name"> volatil3_ </a>


def connect():
    user = {'Username':'volatil3_','Password':'thisispassword'}
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.94 Safari/537.36','Referer':'http://www.twitch.tv/user/login'}

    with requests.Session() as s:
        html = s.get("http://www.twitch.tv/user/login", headers=headers, verify=False, timeout=5)
        soup = BeautifulSoup(html.text)
        tokenTag = soup.find("input", {"name" : "authenticity_token"})
        token = tokenTag["value"].strip()
        #print(html.text)
        print("-----------------------------------------------")
        # The form's utf8 field holds the HTML entity &#x2713;, which a browser submits as the decoded character.
        credentials = {"user[login]":'volatil3_', "user[password]":'thisispassword',"authenticity_token":token,'redirect_on_login':'https://secure.twitch.tv/user/login','embed_form':'false','utf8':'✓','mp_source_action':'','follow':''}
        print(credentials)
        # Keep the response to the login POST so the check below inspects the post-login page.
        html = s.post("https://secure.twitch.tv/user/login", data = credentials, headers=headers, verify=False, timeout=10,allow_redirects=True)
        #html = s.get("http://www.twitch.tv", headers=headers, verify=False, timeout=5)
        soup = BeautifulSoup(html.text)
        logginTag = soup.find("a", {"id" : "user_display_name"})
        print(logginTag)
        if "Log In" in html.text:
            print("could not log in")
connect()

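Ideally, after logging in it should return the home page showing the name of the logged-in user. For me, the HTML it prints shows that I am not logged in. Please help.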

The username and password given here are real and can be used for testing.

2 Answers

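I took a quick look at the site you are targeting and found that it relies heavily on JavaScript. After the login POST request it follows a redirect, and on the new page most of the content is generated by JavaScript, so working with requests, urllib2, etc. gets really painful. It also looks like you are only at the first stage: once you are logged in, a lot of the remaining work is probably not feasible without a JavaScript-capable tool such as PhantomJS or Selenium. Here is a POC I wrote in Python using Selenium. Hope it helps.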

To install Selenium:

pip install -U selenium 

Here is a Python solution using Selenium.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time
from bs4 import BeautifulSoup

my_username = "volatil3_"
my_password = "thisispassword"

driver = webdriver.Firefox()
driver.get("http://www.twitch.tv/user/login")
# find_element(By.ID, ...) replaces find_element_by_id, which Selenium 4.3 removed.
elem_user = driver.find_element(By.ID, "login_user_login")
elem_passwd = driver.find_element(By.ID, "user[password]")
elem_user.send_keys(my_username)
elem_passwd.send_keys(my_password + Keys.RETURN)
# In case it needs some time to populate the content.
#time.sleep(5)

html = driver.page_source
soup = BeautifulSoup(html)
logginTag = soup.find("a", {"id" : "user_display_name"})
print(logginTag)
driver.close()

This is the output:

<a class="header_nick button drop" href="#" id="user_display_name">volatil3_</a>
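
The commented-out time.sleep(5) in the script suggests the page may need a moment before the user_display_name link appears. As a rough sketch, one alternative to a fixed sleep is to poll the page source until the link shows up or a timeout passes. Selenium ships WebDriverWait for this kind of waiting; the version below is stdlib-only so it can be tried without a browser. The names DisplayNameParser, extract_display_name and wait_for_display_name are hypothetical helpers made up for this illustration, and the timeout values are arbitrary.

```python
import time
from html.parser import HTMLParser


class DisplayNameParser(HTMLParser):
    """Collects the text of the first <a id="user_display_name"> element."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.name = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("id") == "user_display_name":
            self._inside = True

    def handle_data(self, data):
        # Record only the first non-empty text found inside the link.
        if self._inside and self.name is None and data.strip():
            self.name = data.strip()

    def handle_endtag(self, tag):
        if tag == "a":
            self._inside = False


def extract_display_name(html):
    """Return the logged-in user's name, or None if the link is absent."""
    parser = DisplayNameParser()
    parser.feed(html)
    parser.close()
    return parser.name


def wait_for_display_name(get_html, timeout=10.0, poll=0.5):
    """Call get_html() repeatedly until a name is found or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        name = extract_display_name(get_html())
        if name:
            return name
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

With the driver from the script, this could be called as wait_for_display_name(lambda: driver.page_source), treating a return value of None as "not logged in".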
Answered 2014-09-07T20:42:06.337

Using PhantomJS for the Twitch login (see my question here):

var page = require('webpage').create();

page.open('http://www.twitch.tv/login', function() {
    page.includeJs("http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js", function() {
        page.evaluate(function() {
            $("#login_user_login").val("username");
            $("[id='user[password]']").val("password");
            $(".button.primary:first").click(); // click login button
        });
        setTimeout(function(){
            page.render("e.png"); // see if anything happens
            phantom.exit();
        }, 5000); // 5 seconds
    });
});
Answered 2015-03-02T00:24:27.883