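I have spent the whole day looking for a solution to this problem. There is a page, http://www.some.site/index.php, that asks for a username and password and sends a cookie. This is how I get in: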
import urllib, urllib2, cookielib, os
import re # not required here but tried it out though
import requests # not required here but tried it out though
username = 'somebody'
password = 'somepass'
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
login_data = urllib.urlencode({'username' : username, 'j_password' : password})
resp = opener.open('http://www.some.site/index.php', login_data)
print resp.read()
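The problem is that in the middle of the page there is a link that downloads an .xls file: http://www.some.site/excel_file.php?t=1303457489. I can download the file in any browser (Mozilla, Chrome, IE), but not with Python. The post data after .php (i.e. ?t=1370919996) changes every time I log in or refresh the page.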
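As a side note, both values of `t` quoted above have the magnitude of Unix timestamps (seconds since 1970-01-01 UTC). That is only a guess based on how they look, not something the site confirms. A minimal Python 3 sketch to see which dates they would correspond to under that assumption:

```python
from datetime import datetime, timezone

def as_utc(seconds):
    """Interpret an integer as seconds since the Unix epoch (an assumption here)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# The two values of `t` quoted in the question.
for t in (1303457489, 1370919996):
    print(t, "->", as_utc(t).isoformat())
```

If that reading is right, the value would change on every page load simply because the clock moves, and it might not be derived from the cookie at all.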
Maybe I'm wrong, but I believe the post data is generated from the cookie (or session cookie). However, the cookie contains only the following: ('set-cookie', 'PHPSESSID=9cde55534fcc8e136fcf6588c0d0f1df; path=/')
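For reference, PHPSESSID is PHP's default session cookie name: the value is an opaque session ID, and the session data itself is kept on the server. A small Python 3 sketch that parses the header value quoted above with the standard library, just to show that it carries nothing besides the ID and a path:

```python
from http.cookies import SimpleCookie

# Header value copied from the question.
raw = "PHPSESSID=9cde55534fcc8e136fcf6588c0d0f1df; path=/"

cookie = SimpleCookie()
cookie.load(raw)

# Each entry is a Morsel holding the value plus attributes such as path.
for name, morsel in cookie.items():
    print(name, "=", morsel.value, "| path:", morsel["path"])
```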
Here is one of the ways I tried to save the file:
print "downloading with urllib2"
f = urllib2.urlopen('http://www.some.site/excel_file.php')
data = f.read()
with open("exceldoc.xls", "wb") as code:
    code.write(data)
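One difference between the two snippets is that the login goes through `opener`, which carries the cookie jar, while the download calls `urllib2.urlopen`, which uses a separate default opener with no access to that jar. Below is a minimal Python 3 sketch (using `urllib.request` and `http.cookiejar`, the Python 3 counterparts of `urllib2` and `cookielib`) that sends both requests through the same opener. The helper names `make_opener`, `login` and `download` are made up for illustration, and whether this alone is enough for this particular site is an assumption.

```python
import http.cookiejar
import urllib.parse
import urllib.request

def make_opener():
    """Return an opener plus the cookie jar it stores session cookies in."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

def login(opener, url, fields):
    """POST the form fields; any Set-Cookie in the reply lands in the jar."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    with opener.open(url, body) as resp:
        return resp.read()

def download(opener, url, path):
    """GET url through the same opener so the session cookie is sent along."""
    with opener.open(url) as resp, open(path, "wb") as out:
        out.write(resp.read())

# Hypothetical usage (not executed here). The keys of the dict must match the
# name attributes of the login form's inputs, and the t value is a placeholder.
# opener, jar = make_opener()
# login(opener, "http://www.some.site/index.php",
#       {"uName": "somebody", "uPw": "somepass"})
# download(opener, "http://www.some.site/excel_file.php?t=1303457489",
#          "exceldoc.xls")
```

The field names `uName` and `uPw` in the commented usage are taken from the login form printed in the EDIT at the end of this post.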
Whether I save the response or print it, I get the same error:
<b>Fatal error</b>: Call to a member function FetchRow() on a non-object in <b>http://www.some.site/excel_file.php</b> on line <b>112</b><br
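How can I download this file with Python? Thanks a lot for any help!
There are many similar posts. I have gone through them and based my example on them, but nothing has worked for me. I'm not very familiar with cookies, PHP, or JS.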
EDIT: This is what I get when I print the contents of index.php:
<html>
<head>
<title>SOMETITLE</title>
<meta http-equiv="Page-Enter" content="blendTrans(Duration=0.5)">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel='stylesheet' type='text/css' href='somesite.css'>
<SCRIPT LANGUAGE="JavaScript">
<!-- JavaScript hiding
function clearDefault(obj) {
if (!obj._cleared) {
obj.value='';
obj._cleared=true;
}
}
// -->
</SCRIPT>
</head>
<body bgcolor="#FFFFFF" text="#000000">
<table width="100%" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td>
<table width="1000" height="150" border="0" align="center" cellpadding="16" cellspacing="0" class="header" style="background: #989896 url('images/header.png') no-repeat;">
<tr>
<td valign="middle">
<table width="100%" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td width="380"> </td>
<td>
<div id="login">
<form name="flogin" method="post" action="/index.php">
<h1>Login</h1>
<input name="uName" type="text" value="Username:" class="name" onfocus="clearDefault(this)">
<br>
<input type="password" name="uPw" value="Password:" class="pass" onfocus="clearDefault(this)">
<input type="submit" name="Submit" value="OK" class="submit">
</form>
</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>
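Getting the login form back after the POST may mean that the login itself did not go through. One thing that can be checked from the page above is which field names the form expects: its inputs are named `uName` and `uPw`, whereas the login code at the top posts `username` and `j_password`. A minimal Python 3 sketch that pulls the input names out of the form with the standard library (the `FormFields` class is a made-up helper, and the HTML is a trimmed copy of the form above):

```python
from html.parser import HTMLParser

class FormFields(HTMLParser):
    """Collect the name attribute of every <input> inside a <form>."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.action = None
        self.names = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self.in_form and "name" in attrs:
            self.names.append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

# Login form copied from the page above (event handlers omitted).
page = """
<form name="flogin" method="post" action="/index.php">
<input name="uName" type="text" value="Username:" class="name">
<input type="password" name="uPw" value="Password:" class="pass">
<input type="submit" name="Submit" value="OK" class="submit">
</form>
"""

parser = FormFields()
parser.feed(page)
print(parser.action, parser.names)
```

If the server only reads `uName` and `uPw`, a POST with other field names would leave the session unauthenticated, which could also explain why excel_file.php fails afterwards. That is a hypothesis to test, not a confirmed diagnosis.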