So I have been successfully using this Python script:
import httplib2
from BeautifulSoup import BeautifulSoup, SoupStrainer
http = httplib2.Http()
status, response = http.request('https://conceled:conceled@traveler.pha.phila.gov:8443/servlet/traveler')
for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
    if link.has_key('href'):
        print link['href']
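to pull the links out of a website. It works on almost any other site, but when I try it on the one above (the one I actually need it to work on), I get a lot of errors: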
Traceback (most recent call last):
  File "C:\Users\joe\Desktop\PHA\AndroidPhones\androidphonescript2.py", line 5, in <module>
    status, response = http.request('https://conceled@traveler.pha.phila.gov:8443/servlet/traveler')
  File "C:\Python27\lib\httplib2.py", line 608, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cacheFullPath)
  File "C:\Python27\lib\httplib2.py", line 449, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "C:\Python27\lib\httplib2.py", line 427, in _conn_request
    conn.connect()
  File "C:\Python27\lib\httplib.py", line 1157, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\socket.py", line 553, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno 11003] getaddrinfo failed
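The traceback ends inside getaddrinfo, so the failure happens while the host name is being looked up, before any HTTP request is sent. One way to narrow this down might be to separate the host name from the credentials in the URL and try resolving it directly. The sketch below is only a diagnostic aid, not a fix: the helper names split_target and can_resolve are made up for illustration, and it assumes the standard library URL parser sees the same host that httplib2 ends up using, which may not be true for this httplib2 version.

```python
import socket

try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2, as in the traceback above


def split_target(url):
    """Return (hostname, port, username) parsed from url, without any network access.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    return parts.hostname, parts.port, parts.username


def can_resolve(host, port):
    """Return True if getaddrinfo can resolve host, False on gaierror.

    Hypothetical helper; it makes the same kind of lookup that fails in the traceback.
    """
    try:
        socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    url = 'https://conceled:conceled@traveler.pha.phila.gov:8443/servlet/traveler'
    host, port, user = split_target(url)
    print("host=%s port=%s" % (host, port))
    print("resolves: %s" % can_resolve(host, port))
```

If the bare host name resolves here but the script still fails, the user:password@ part of the URL is a likely suspect, and passing the credentials separately (httplib2 has Http.add_credentials for this) would be worth trying. If it does not resolve, the problem is probably DNS or network access from this machine rather than the script.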