
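These are the steps I want to carry out:

  1. Fetch the HTML of http://www.skyscanner.es/ for a flight search.
  2. Extract only part of that HTML: the specific "span" that holds the price.
  3. Work with that value.

This is the PHP code I wrote: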

    <?php
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt ($ch, CURLOPT_POST, 1);
        curl_setopt ($ch, CURLOPT_POSTFIELDS, "from=Bilbao (BIO)&to=Barcelona (BCN)&depdatetext=25/03/2013&sc_returnOrOneWay=2");
        $output = curl_exec($ch);
        curl_close($ch);
        echo $output;
    ?>

But I get a strange string like this:

     ‹¥TkoÚ0ý^‰ÿpTi“ê&lt; t%<¤RuR»U+{}4ñ…X5qf›×Pÿûì$ZõÛ‚Äu¬sî=çú:ýÓ믣Éï‡1¤f!àáû§»Ï#ðHül‚àzr ¿n'÷wù!<Åã/x©1yëõÚ_·}©æÁä[°qY"G«–DŸæ 'ý¢Êf!2=x#CÔívKb FÊ\\ ¡àÐÿ,ùjàdf03d²Íу¤|x7&pì$)UÍ€kI®®:]yKe¸8¼;@àv2y€ª),520h’ Ö`R®!§s3i€ !×Èü~Pòm"m¶ÁXUÝDëBô)!“©dÛÝ‚ª9Ïâ°7³‰æ1ö?à¢|ÑÛø*F3z§ânQ¬ÐðÄîhši¢QñYoJ“§¹’ËŒÅÍqñôž'3Ž‚Y“»œ2ƳyBÔÉ7…îÏ®zÏÐ8I£Ý¡~Ë¿°ja‰RÅÍ››—/m!£BêkähÚ§ÌÛ~nÐEýÐýö´0¬iMw¨¨vkÎLw/ÏêeoæÒ&iA^ôÌ3 §Ë$E÷Þ9Ô=<êØ‘3{uûHµß)gºYMÏî…[1—š.³X¡ †¯Ð¡ý M\¤<³FŽÏÆ•{mŒ™ÇWö0öÆ\{ÞÎNˆ  ­bµ¿nœ\d|œÙ›SôÐöÓhøˆÊÎ0Œ•’Ê2¢a?°°ct¥ÙM'›‰ Z×û/6á~¦úië?®Š%—IÚÃIŠ%h+—@‚òÉöfRAB3Gœ"0®sA·¶Àj+Í€g+*8ûH%ƒwµ”÷°¦ú Ç\ä¦ÒåÊ·¿Aí¨îK÷m-¾vñà-ú¡ 

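So I haven't even got past the first step!

I have tried to fix it in several ways, but I still don't know what I'm doing wrong. I think it could be:

Could anyone help me, please?

Thanks in advance!

Edit: I have changed the title so that it is closer to the problem I have now.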


2 Answers


What is encoded in the body doesn't matter, because the response you are getting is:

HTTP/1.1 405 Method Not Allowed

This means you cannot use POST.

If you read all of the response headers, you will see that one of them says:

Allow: GET, HEAD, OPTIONS, TRACE
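To check this programmatically, the header block can be parsed for the status code and the Allow header. The following is only a minimal sketch: `parseHeaderBlock` is a hypothetical helper name, and the sample string is copied from the two header lines quoted in this answer rather than fetched live.

```php
<?php
// Hypothetical helper (not part of cURL): turn a raw HTTP header block
// into a status code plus an array of lower-cased header names.
function parseHeaderBlock(string $rawHeaders): array
{
    $lines = preg_split('/\r\n|\n/', trim($rawHeaders));
    $status = 0;
    $headers = [];
    foreach ($lines as $i => $line) {
        // The first line is the status line, e.g. "HTTP/1.1 405 Method Not Allowed".
        if ($i === 0 && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
            continue;
        }
        // Every other line is "Name: value"; split on the first colon only.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return ['status' => $status, 'headers' => $headers];
}

// Sample header block built from the lines quoted above (assumption: no other headers).
$sample = "HTTP/1.1 405 Method Not Allowed\r\nAllow: GET, HEAD, OPTIONS, TRACE\r\n\r\n";
$info = parseHeaderBlock($sample);

echo $info['status'], "\n";
echo $info['headers']['allow'], "\n";
```

With a live request, the header block is what precedes the body when `CURLOPT_HEADER` is enabled.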

If you remove these two lines:

curl_setopt ($ch, CURLOPT_POST, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, "from=Bilbao (BIO)&to=Barcelona (BCN)");

and change:

curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/");

to:

curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/vuelos/bio/bcn/130325/tarifas-de-bilbao-a-barcelona-en-marzo-2013.html");

it will work.

Check out the following code:

<?php

    $accept = array(
        'type' => array('application/rss+xml', 'application/xml', 'application/rdf+xml', 'text/xml'),
        'charset' => array_diff(mb_list_encodings(), array('pass', 'auto', 'wchar', 'byte2be', 'byte2le', 'byte4be', 'byte4le', 'BASE64', 'UUENCODE', 'HTML-ENTITIES', 'Quoted-Printable', '7bit', '8bit'))
    );
    $header = array(
        'Accept: '.implode(', ', $accept['type']),
        'Accept-Charset: '.implode(', ', $accept['charset']),
    );
    $encoding = null;
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/vuelos/bio/bcn/130325/tarifas-de-bilbao-a-barcelona-en-marzo-2013.html?flt=1");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
//    curl_setopt ($ch, CURLOPT_POST, 1);
//    curl_setopt ($ch, CURLOPT_POSTFIELDS, "from=Bilbao (BIO)&to=Barcelona (BCN)");
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    $response = curl_exec($ch);
    curl_close($ch);        
    if (!$response) {
        // error fetching the response
    } else {
        echo $response;
    }
?>
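Because `CURLOPT_HEADER` is enabled above, `$response` contains the headers followed by the body. As a rough sketch of the next step from the question (getting the span with the price), the code below splits the two parts and pulls out span text with DOMXPath. The helper names, the sample markup, and the `price` class are all hypothetical; the real page may use different markup, or may not include prices in the static HTML at all.

```php
<?php
// Split a response fetched with CURLOPT_HEADER into headers and body.
// $headerSize is the value curl_getinfo($ch, CURLINFO_HEADER_SIZE) would return.
function splitResponse(string $response, int $headerSize): array
{
    return [
        'headers' => substr($response, 0, $headerSize),
        'body'    => substr($response, $headerSize),
    ];
}

// Return the trimmed text of every <span> whose class list contains $class.
// $class is assumed to be a plain token without quotes.
function spanTextsByClass(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf(
        '//span[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Stand-in for a real response; markup and the "price" class are made up.
$headers = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
$body = '<html><body><span class="price big">59 EUR</span>'
      . '<span class="other">x</span></body></html>';

$parts = splitResponse($headers . $body, strlen($headers));
$prices = spanTextsByClass($parts['body'], 'price');
print_r($prices);
```

In a live script the header size has to be read with `curl_getinfo($ch, CURLINFO_HEADER_SIZE)` before `curl_close($ch)` is called.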
answered 2013-02-13T21:47:45.733

I thought it was using the POST method because I was getting a page without prices.

Now I realize that the URLs were relative, so the scripts were not loaded. I've added a base tag.

[code before]
$result = str_replace("<head>", "<head><base href=\"$skyScannerURL\" />", $response);

Now the page has styles and tries to load something, but it gets stuck in a loop: the page keeps reloading and the URL carries a parameter that increases each time, for example ?crty=107

The full code:

$accept = array(
    'type' => array('application/rss+xml', 'application/xml', 'application/rdf+xml', 'text/xml'),
    'charset' => array_diff(mb_list_encodings(), array('pass', 'auto', 'wchar', 'byte2be', 'byte2le', 'byte4be', 'byte4le', 'BASE64', 'UUENCODE', 'HTML-ENTITIES', 'Quoted-Printable', '7bit', '8bit'))
);
$header = array(
    'Accept: '.implode(', ', $accept['type']),
    'Accept-Charset: '.implode(', ', $accept['charset']),
);
$encoding = null;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/vuelos/bio/bcn/130325/tarifas-de-bilbao-a-barcelona-en-marzo-2013.html?flt=1");
//curl_setopt($ch, CURLOPT_URL, "http://www.skyscanner.es/flights/bio/bcn/130325/airfares-from-bilbao-to-barcelona-in-march-2013.html?flt=1");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
$response = curl_exec($ch);
curl_close($ch);        
if (!$response) {
    // error fetching the response
} else {
    $skyScannerURL = 'http://www.skyscanner.es/';
    $result = str_replace("<head>", "<head><base href=\"$skyScannerURL\" />", $response);
    echo $result;
}
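One thing to note is that `str_replace("<head>", ...)` only matches a lower-case `<head>` with no attributes. A slightly more tolerant variant might look like the sketch below; `injectBaseHref` is a hypothetical name, and this does not address the reload loop.

```php
<?php
// Insert a <base> element right after the first opening <head> tag.
// Returns the HTML unchanged when no <head> tag is found.
function injectBaseHref(string $html, string $baseUrl): string
{
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '" />';
    // Case-insensitive and tolerant of attributes on <head>.
    // A callback is used so that "$" or "\" in the URL is never
    // interpreted as a backreference in the replacement.
    return preg_replace_callback(
        '/<head(\s[^>]*)?>/i',
        function (array $m) use ($base) {
            return $m[0] . $base;
        },
        $html,
        1 // replace only the first match
    );
}

echo injectBaseHref(
    '<HTML><HEAD profile="x"><title>t</title></HEAD></HTML>',
    'http://www.skyscanner.es/'
), "\n";
```

Since `CURLOPT_HEADER` is enabled in the code above, `$response` also contains the HTTP headers, so the echoed result will start with them unless they are removed first.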

You can see it online here: codepad.viper-7.com

Obviously something is not working properly. Thanks again, everyone.

answered 2013-02-14T17:57:13.100