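I am trying to scrape the current loan status from the url column of data downloaded from Lending Club, for example https://lendingclub.com/browse/loanDetail.action?loan_id=104046830. The page requires a login before the information can be extracted.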

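I followed the steps to create a login session, but the login does not seem to succeed: the result does not contain the correct page content. Can someone help me identify the problem?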

import requests
from lxml import html

USERNAME = "username"
PASSWORD = "password"

LOGIN_URL = "https://www.lendingclub.com/auth/login?"

loan_id=96490539

URL = "https://lendingclub.com/browse/loanDetail.action?loan_id=96490539"

def main():
    session_requests = requests.session()

    # Get login csrf token
    result = session_requests.get(LOGIN_URL)
    tree = html.fromstring(result.text)
    authenticity_token = tree.xpath("//meta[@name='csrf-token']/@content")[0]

    # Create payload
    payload = {
        "login_email": USERNAME, 
        "login_password": PASSWORD, 
        "csrf-token": authenticity_token
    }

    # Perform login
    result = session_requests.post(LOGIN_URL, data = payload, headers = dict(referer = LOGIN_URL))

    # Scrape url
    result = session_requests.get(URL, headers = dict(referer = URL))
    return result

1 Answer

My suggestion may look odd, but you can give it a try. According to Chrome dev tools, it should be enough to get you a valid response.

import requests
from lxml import html

USERNAME = "username"
PASSWORD = "password"

LOGIN_URL = "https://www.lendingclub.com/account/login.action"

def main():
    # Form fields sent with the login POST (taken from Chrome dev tools)
    payload = {
        'login_url': '/browse/loanDetail.action?loan_id=96490539',
        'login_email': USERNAME,
        'login_password': PASSWORD,
        'offeredNotListedPromotionFlag': '',
    }
    with requests.session() as session:
        # Send a browser-like User-Agent while keeping the session's other default headers
        session.headers.update({'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'})
        result = session.post(
            LOGIN_URL,
            data=payload,
            headers={
                'Referer': 'https://www.lendingclub.com/browse/loanDetail.action?loan_id=96490539',
                'Content-Type': 'application/x-www-form-urlencoded',
            },
        )
        return result
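
Since the goal is to do this for every URL in the downloaded data, the payload above can be built per loan instead of hard-coding one loan_id. The following is only a rough sketch under that assumption: `loan_id_from_url` and `build_login_request` are hypothetical helper names, and the form fields are simply the ones used above, so they may need adjusting if the site's login form differs.

```python
from urllib.parse import urlparse, parse_qs

LOGIN_URL = "https://www.lendingclub.com/account/login.action"
DETAIL_PATH = "/browse/loanDetail.action"


def loan_id_from_url(url):
    """Return the loan_id query parameter of a loan detail URL as a string."""
    query = parse_qs(urlparse(url).query)
    return query["loan_id"][0]


def build_login_request(url, username, password):
    """Hypothetical helper: build the form payload and headers for one loan URL."""
    target = f"{DETAIL_PATH}?loan_id={loan_id_from_url(url)}"
    payload = {
        "login_url": target,
        "login_email": username,
        "login_password": password,
        "offeredNotListedPromotionFlag": "",
    }
    headers = {"Referer": "https://www.lendingclub.com" + target}
    return payload, headers
```

Inside the `with requests.session() as session:` block you would then call `payload, headers = build_login_request(url, USERNAME, PASSWORD)` followed by `session.post(LOGIN_URL, data=payload, headers=headers)` for each URL in the column.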
answered 2017-09-13T21:27:49.653