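In the script below I am trying to get all the hyperlinks on the website http://www.searspartsdirect.com, but the output I get is the following. What am I doing wrong here?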
<html>
<body onload="document.acsForm.submit();">
<form name="acsForm" action="https://www.searspartsdirect.com/partsdirect/j_acegi_cas_security_check?ssonofail=true" method="post">
<div style="display: none">
<textarea rows=10 cols=80 name="logonPassword"></textarea>
<textarea rows=10 cols=80 name="loginId"></textarea>
<textarea rows=10 cols=80 name="screenName"></textarea>
<textarea rows=10 cols=80 name="errorCode"></textarea>
</div>
</form>
</body>
</html>
Here is my script:
# Python 2 script: fetch a page and print every <a> element it contains.
import sys
import cookielib
import urllib2

from bs4 import BeautifulSoup

sitename = sys.argv[1]  # URL passed on the command line

# Build an opener that stores cookies between requests and redirects.
cookiejar = cookielib.CookieJar()
url_opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookiejar))
urllib2.install_opener(url_opener)

# Download the page body.
response = url_opener.open(urllib2.Request(sitename))
contents = response.read()

# Parse the HTML and print each anchor tag.
soup = BeautifulSoup(contents, "html.parser")
for a in soup.find_all('a'):
    print a
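
One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the HTML is an interstitial page whose `onload` handler submits the form `acsForm` with JavaScript. `urllib2` does not execute JavaScript, so a script like this one would stop at that page and never see the real links. The sketch below works offline on the HTML quoted in the question (with the hidden `<div>` wrapper dropped) and lists what a browser would POST. `FormCollector` and `INTERSTITIAL_HTML` are hypothetical names introduced here, and only `<textarea>` fields are handled because those are the only fields in the sample.

```python
# Sketch only: inspect the auto-submitting form from the question, offline.
try:  # Python 3
    from html.parser import HTMLParser
    from urllib.parse import urlencode
except ImportError:  # Python 2, as used in the question
    from HTMLParser import HTMLParser
    from urllib import urlencode

# Response body copied from the question (hidden <div> wrapper omitted).
INTERSTITIAL_HTML = """
<html>
<body onload="document.acsForm.submit();">
<form name="acsForm" action="https://www.searspartsdirect.com/partsdirect/j_acegi_cas_security_check?ssonofail=true" method="post">
<textarea rows=10 cols=80 name="logonPassword"></textarea>
<textarea rows=10 cols=80 name="loginId"></textarea>
<textarea rows=10 cols=80 name="screenName"></textarea>
<textarea rows=10 cols=80 name="errorCode"></textarea>
</form>
</body>
</html>
"""


class FormCollector(HTMLParser):
    """Hypothetical helper: record the body onload handler, the first
    form's action/method, and every named <textarea> with its text."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.onload = None
        self.action = None
        self.method = None
        self.fields = []       # (name, value) pairs in document order
        self._textarea = None  # name of the <textarea> currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "body":
            self.onload = attrs.get("onload")
        elif tag == "form" and self.action is None:
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "textarea" and "name" in attrs:
            self._textarea = attrs["name"]
            self.fields.append((self._textarea, ""))

    def handle_data(self, data):
        # A textarea's submitted value is its text content.
        if self._textarea is not None:
            name, value = self.fields[-1]
            self.fields[-1] = (name, value + data)

    def handle_endtag(self, tag):
        if tag == "textarea":
            self._textarea = None


collector = FormCollector()
collector.feed(INTERSTITIAL_HTML)
collector.close()

# Body a browser would send when the onload handler submits the form.
post_body = urlencode(collector.fields)

print("onload: %s" % collector.onload)
print("%s %s" % (collector.method.upper(), collector.action))
print("body:   %s" % post_body)
```

If this reading is right, the natural next step would be to send `post_body` to `collector.action` through the same cookie-aware opener and parse that response for links. Whether the site then returns the real page is untested here, and a client that runs JavaScript may be needed instead.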