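I'm trying to submit a form to a website from Java (over HTTP), but the response I read back is not what I expect. Here is exactly what I do: first, I open the website in a browser, fill in the form by hand, and submit it. In Chrome, I can see the data sent over the network, namely: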

Request URL:http://wizzair.com/en-GB/Select
Request Method:POST
Status Code:200 OK
Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:1061
Content-Type:application/x-www-form-urlencoded
Cookie:WRUID=0; ASP.NET_SessionId=3e3ahach1d34oyhtoqfshxhe; Culture=en-GB; __utma=17431487.361991764.1292186668.1354138010.1354651562.81; __utmb=17431487.9.9.1354652614319; __utmc=17431487; __utmz=17431487.1319145359.34.18.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=wizz
Host:wizzair.com
Origin:http://wizzair.com
Referer:http://wizzair.com/en-GB/Select
User-Agent:Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.91 Safari/537.11
Form Data
__EVENTTARGET:HeaderControlGroupRibbonSelectView_AvailabilitySearchInputRibbonSelectView_ButtonSubmit
__VIEWSTATE:/wEPDwUBMGRkNSMYF94e4mXCiiJGEJbRixyidoa2QXSambTT2mm6cLs=
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$OriginStation:EIN
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$DestinationStation:OTP
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$DepartureDate:02/02/2013
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$ReturnDate:05/02/2013
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountADT:1
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountCHD:0
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountINFANT:0
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$BaggageCount:0
HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$ButtonSubmit:Search

So I tried to simulate the same request from a Java program, namely:

public void doSubmit(String url, Map<String, String> data) throws Exception {
    URL siteUrl = new URL(url);
    HttpURLConnection conn = (HttpURLConnection) siteUrl.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setDoInput(true);

    conn.setRequestProperty("Cookie", "WRUID=0; ASP.NET_SessionId=3e3ahach1d34oyhtoqfshxhe; Culture=en-GB; __utma=17431487.361991764.1292186668.1354138010.1354651562.81; __utmb=17431487.9.9.1354652614319; __utmc=17431487; __utmz=17431487.1319145359.34.18.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=wizz");

    DataOutputStream out = new DataOutputStream(conn.getOutputStream());

    Set keys = data.keySet();
    Iterator keyIter = keys.iterator();
    String content = "";
    for(int i=0; keyIter.hasNext(); i++) {
        Object key = keyIter.next();
        if(i!=0) {
            content += "&";
        }
        content += key + "=" + URLEncoder.encode(data.get(key), "UTF-8");
    }
    System.out.println(content);
    out.writeBytes(content);
    out.flush();
    out.close();
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line = "";
    while((line=in.readLine())!=null) {
        System.out.println(line);
    }
    in.close();
}

... and I call it with the following parameters, taken from the HTTP form data shown above:

    String url = "http://wizzair.com/en-GB/Select";
    Map<String, String> data = new TreeMap<String, String>();
    data.put("__EVENTTARGET", "HeaderControlGroupRibbonSelectView_AvailabilitySearchInputRibbonSelectView_ButtonSubmit");
    data.put("__VIEWSTATE", "/wEPDwUBMGRkNSMYF94e4mXCiiJGEJbRixyidoa2QXSambTT2mm6cLs=\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$OriginStation:EIN\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$DestinationStation:OTP\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$DepartureDate:02/02/2013\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$ReturnDate:05/02/2013\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountADT:1\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountCHD:0\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$PaxCountINFANT:0\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$BaggageCount:0\n"+
            "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$ButtonSubmit:Search");

However, the response I get is just a generic page from the site, not the result I expected. What am I doing wrong?

Many thanks, regards, Sorin

1 Answer

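I think the way you populate the POST data is incorrect. You should have ten or so key/value pairs, not just two. The second item you see in Chrome is not one big string. Something like 'HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$OriginStation' is a key in its own right. The same goes for the other, very ugly, key names that all start with 'HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$'.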

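In addition, you need to URL-encode the key of each POST data item, not just its value (because of the $ characters in the keys). Encode the two sides with separate calls so that the '=' between them is not encoded.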
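As a minimal sketch of that encoding step: the class and method names below (`FormEncoder`, `encode`) and the shortened field names are hypothetical and only for illustration. The idea is to pass the key and the value through `URLEncoder.encode` separately and join them with a literal `=`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original code.
class FormEncoder {

    // Builds an application/x-www-form-urlencoded body.
    // Key and value are encoded in separate calls, so the '=' and '&'
    // separators themselves are never encoded.
    static String encode(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> fields = new LinkedHashMap<>();
        // Shortened, made-up key names; the real ones are much longer.
        fields.put("Group$Search$OriginStation", "EIN");
        fields.put("Group$Search$DepartureDate", "02/02/2013");
        // The '$' in the keys and the '/' in the date come out percent-encoded.
        System.out.println(encode(fields));
    }
}
```

When posting such a body, it is probably also worth setting the `Content-Type` request header to `application/x-www-form-urlencoded`, as the browser does in the capture.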

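You would also need to remove the newline at the end of the original second item, but once the keys and values are split up it no longer exists. The same applies to the third, fourth, and later items.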
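Put together, the form data from the Chrome capture could be populated roughly as follows. This is only a sketch: the class name `SearchFormData` and the `PREFIX` constant are made up for illustration, and the `__VIEWSTATE` value is copied verbatim from the capture, whereas a real client would presumably have to read the current value from the page it fetched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the form fields seen in the Chrome capture.
class SearchFormData {

    // Common prefix shared by the long ASP.NET control names.
    static final String PREFIX =
        "HeaderControlGroupRibbonSelectView$AvailabilitySearchInputRibbonSelectView$";

    // One map entry per form field; no newlines and no "key:value" strings.
    static Map<String, String> build() {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("__EVENTTARGET",
            "HeaderControlGroupRibbonSelectView_AvailabilitySearchInputRibbonSelectView_ButtonSubmit");
        // Value taken from the capture; assumed to be stale in a new session.
        data.put("__VIEWSTATE", "/wEPDwUBMGRkNSMYF94e4mXCiiJGEJbRixyidoa2QXSambTT2mm6cLs=");
        data.put(PREFIX + "OriginStation", "EIN");
        data.put(PREFIX + "DestinationStation", "OTP");
        data.put(PREFIX + "DepartureDate", "02/02/2013");
        data.put(PREFIX + "ReturnDate", "05/02/2013");
        data.put(PREFIX + "PaxCountADT", "1");
        data.put(PREFIX + "PaxCountCHD", "0");
        data.put(PREFIX + "PaxCountINFANT", "0");
        data.put(PREFIX + "BaggageCount", "0");
        data.put(PREFIX + "ButtonSubmit", "Search");
        return data;
    }

    public static void main(String[] args) {
        // Print each key/value pair on its own line for inspection.
        for (Map.Entry<String, String> field : build().entrySet()) {
            System.out.println(field.getKey() + " -> " + field.getValue());
        }
    }
}
```

A `LinkedHashMap` is used here so that the fields are sent in the same order as in the browser; whether the server cares about the order is not known.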

Be careful how you interpret what you see in Chrome :-P

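One more thing: the site you are referring to is session based: it tracks the state of your current interaction with the cookie value 'ASP.NET_SessionId'. This value is only short-lived. In general, you should first call the site without this cookie value, and the site will then give you one (while redirecting you to a localized catch-all page for a specific country). You can then use that value in your (second) request to collect the data. If you supply an invalid session ID, you will be redirected to the same default page over and over again.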
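The two-request flow described above could be sketched like this. The class `SessionCookie` and its methods are hypothetical, and the sketch assumes the site hands out the session ID in a `Set-Cookie` response header on the first request, which has not been checked against the real site.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper for picking up a fresh session cookie.
class SessionCookie {

    // Returns the value of the named cookie from a list of Set-Cookie
    // header values, or null if it is not present.
    static String extract(List<String> setCookieHeaders, String name) {
        if (setCookieHeaders == null) {
            return null;
        }
        for (String header : setCookieHeaders) {
            // Only the first "name=value" part matters; the rest are attributes.
            String pair = header.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(name)) {
                return pair.substring(eq + 1);
            }
        }
        return null;
    }

    // First request: no Cookie header is sent, so the server can issue a new session.
    static String fetchSessionId(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        // Do not follow the redirect, so the headers of the first response stay visible.
        conn.setInstanceFollowRedirects(false);
        conn.getResponseCode(); // forces the request to be sent
        for (Map.Entry<String, List<String>> h : conn.getHeaderFields().entrySet()) {
            // The status line is stored under a null key; equalsIgnoreCase handles that.
            if ("Set-Cookie".equalsIgnoreCase(h.getKey())) {
                String id = extract(h.getValue(), "ASP.NET_SessionId");
                if (id != null) {
                    return id;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Offline demonstration with made-up header values.
        List<String> sample = Arrays.asList(
            "Culture=en-GB; path=/",
            "ASP.NET_SessionId=abc123; path=/; HttpOnly");
        System.out.println(extract(sample, "ASP.NET_SessionId"));
    }
}
```

The value returned by `fetchSessionId` would then go into the `Cookie` request header of the second (POST) request, in place of the hard-coded session ID copied from the browser.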

answered 2013-09-16T08:08:19.223