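The problem I'm facing, and trying to solve with Python, is making consecutive POST requests (completing an online form) to a website, specifically the free online demo of an API at http://demo.travelportuniversalapi.com. So far I have been unable to get the results page, and it has been two days now.
The code I'm using is: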
import sys
import urllib, urllib2, cookielib
from BeautifulSoup import BeautifulSoup
import re

class website:
    def __init__(self):
        self.host = 'demo.travelportuniversalapi.com'
        self.ua = 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:23.0) Gecko/20100101 Firefox/23.0'
        self.session = cookielib.CookieJar()  # session becomes an instance of cookielib.CookieJar

    def get(self):
        try:
            # First request: a plain GET to obtain the form page and any cookies.
            url = 'http://demo.travelportuniversalapi.com/(S(cexfuhghvlzyzx5n0ysesra1))/Search'  # this varies every 20 minutes
            data = None
            headers = {'User-Agent': self.ua}
            request = urllib2.Request(url, data, headers)
            self.session.add_cookie_header(request)
            response = urllib2.urlopen(request)
            self.session.extract_cookies(response, request)
            url = response.geturl()
            # Second request: POST the form fields to the URL returned above.
            data = {'From': 'lhr', 'To': 'ams', 'Departure': '9/4/2013', 'Return': '9/6/2013'}
            headers = {'User-Agent': self.ua,
                       'Content-type': 'application/x-www-form-urlencoded; charset=UTF-8'}
            # The timeout belongs to urlopen(); Request() takes no timeout argument.
            request = urllib2.Request(url, urllib.urlencode(data), headers)
            self.session.add_cookie_header(request)
            response = urllib2.urlopen(request, timeout=30)  # HTTP Error 404: Not Found - this is where I get the error
            self.session.extract_cookies(response, request)
        except urllib2.URLError as e:
            print >> sys.stderr, e
            return None

rt = website()
rt.get()
The error I get on the last urllib2 request is HTTP Error 404: Not Found. I'm not sure whether my cookies are working.
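One way to check the cookie side is to let an opener manage the jar instead of calling add_cookie_header and extract_cookies by hand, and then look at what the jar holds after each request. The following is only a minimal sketch: make_session and cookie_names are hypothetical helper names, not part of the code above, and the imports fall back to the Python 2 module names used in the question. The (S(...)) segment in the URL looks like a session ID carried in the path, so it is possible that the jar stays empty even when the session itself is fine.

```python
try:  # Python 3 module names
    from urllib.request import build_opener, HTTPCookieProcessor
    from http.cookiejar import CookieJar
except ImportError:  # Python 2 names, as used in the question
    from urllib2 import build_opener, HTTPCookieProcessor
    from cookielib import CookieJar


def make_session(user_agent):
    """Return (opener, jar); the opener stores and resends cookies itself.

    Hypothetical helper for illustration only.
    """
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    # Default headers sent with every request made through this opener.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener, jar


def cookie_names(jar):
    """List the names of the cookies currently held, for debugging."""
    return [cookie.name for cookie in jar]


opener, jar = make_session('Mozilla/5.0')
# No request has been made yet, so the jar is still empty at this point.
print(cookie_names(jar))
```

Calling opener.open(url) in place of urllib2.urlopen(request) and printing cookie_names(jar) after the first GET would show whether the server sets any cookie at all.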
Monitoring the HTTP packets with a browser plug-in, I noticed the following header when the POST is sent from the browser: 'X-Requested-With: XMLHttpRequest'. Is this relevant?
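To test whether that header matters, the POST can be built with the same headers the browser sends. This is a sketch under assumptions: build_ajax_post is a hypothetical helper name, the URL is the session URL from the code above (which expires), and I cannot confirm that this server treats requests differently when the header is present. The request is only constructed here, not sent.

```python
try:  # Python 3 module names
    from urllib.request import Request
    from urllib.parse import urlencode
except ImportError:  # Python 2 names, as used in the question
    from urllib2 import Request
    from urllib import urlencode


def build_ajax_post(url, fields, user_agent):
    """Build a form POST that also carries the X-Requested-With header.

    Hypothetical helper for illustration only.
    """
    headers = {
        'User-Agent': user_agent,
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        # The header the browser was seen sending with its POST.
        'X-Requested-With': 'XMLHttpRequest',
    }
    # Supplying a body is what turns the request into a POST.
    body = urlencode(fields).encode('ascii')
    return Request(url, body, headers)


request = build_ajax_post(
    'http://demo.travelportuniversalapi.com/(S(cexfuhghvlzyzx5n0ysesra1))/Search',
    {'From': 'lhr', 'To': 'ams', 'Departure': '9/4/2013', 'Return': '9/6/2013'},
    'Mozilla/5.0',
)
print(request.get_method())    # POST
print(request.header_items())  # headers that would be sent
```

If the 404 disappears once the header is included, the server is probably routing on it; if not, comparing the URL the browser posts to with the one returned by response.geturl() would be the next thing to check.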