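I'm trying to log in to Campaign Monitor so that I can scrape some data from a page related to email campaign performance.
The login-protected URL of the page I'm trying to reach looks like this: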
https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC
Visiting that page in a web browser redirects to a login page, which itself has a URL like this:
https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76
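From what I've gathered while trying to solve this, I need to open a session, authenticate against the redirect URL, and then request the URL I actually want (using the authenticated session).
Here is the code I'm using to try to do that: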
import requests

payload = {
    'username': 'myUsername',
    'password': 'myPassword',
}
redURL = 'https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76'
with requests.Session() as s:
    # Log in by POSTing the credentials to the redirect (login) URL
    p = s.post(redURL, data=payload)
    # This prints the "success" message I've pasted below
    print(p.content)
    # Request the protected report page with the same session
    r = s.get('https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC')
    # This prints the HTML of the login page again, as if I'm not authenticated
    print(r.content)
This is the "success" response after the session's first POST:
{"MultipleAccounts":false,"LoginStatus":"Success","SiteAddress":"https://mycompany.createsend.com","ErrorMessage":"","SessionExpired":false,"Url":"https://mycompany.createsend.com/login?Origin=Marketing\u0026ReturnUrl=%2fcampaigns%2freports%2flists%2f92D2FBCV98B5XF54BVC%3fs%7DS6F87S6DF876SDF76\u0026s=2FBCV98B5XF54BVC","DomainSwitchAddress":"https://mycompany.createsend.com","DomainSwitchAddressQueryString":null,"NeedsDomainSwitch":false}
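One thing I notice is that the response carries a "Url" field pointing at https://mycompany.createsend.com/login. My guess (an assumption, not something I've confirmed) is that the browser visits that URL next and that this is where the cookies for the mycompany.createsend.com domain get set. A minimal Python 3 sketch of pulling that URL out of the response, where `next_login_url` is just a helper name I made up:

```python
import json


def next_login_url(response_text):
    """Return the "Url" field of the login JSON, or None if login did not succeed."""
    body = json.loads(response_text)
    if body.get("LoginStatus") != "Success":
        return None
    return body.get("Url")


# Possible use inside the session from my code (untested, assumes the
# follow-up GET is what actually establishes the authenticated cookies):
#
#     p = s.post(redURL, data=payload)
#     follow_up = next_login_url(p.text)
#     if follow_up:
#         s.get(follow_up)
#     r = s.get('https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC')
```

If that extra GET is the missing step, the final `s.get` should return the report page instead of the login HTML.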
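Can someone help explain why the second request in the session prints the HTML of the login page rather than the HTML of the authenticated version of the page (i.e. the page containing the data I'm looking for)?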
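In case it helps with diagnosing, here is a small Python 3 check I could run on the second response to confirm whether it was bounced back to the login page. The helper name `landed_on_login` and the default host are my own assumptions based on the URLs above; `r.url`, `r.history` and `s.cookies` are the standard requests attributes.

```python
from urllib.parse import urlparse


def landed_on_login(final_url, login_host="login.createsend.com"):
    """Return True if the final URL (after redirects) is on the login host
    or on a /login path, which would suggest the session is not authenticated."""
    parts = urlparse(final_url)
    return parts.hostname == login_host or parts.path.startswith("/login")


# Possible use after the second request in my code:
#
#     print(r.url)                              # final URL after any redirects
#     print([h.status_code for h in r.history]) # redirect chain, if any
#     print([(c.name, c.domain) for c in s.cookies])  # cookies held by the session
#     print(landed_on_login(r.url))
```

If the cookie list shows nothing for the mycompany.createsend.com domain, that would at least be consistent with the session never being authenticated there.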