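I am trying to scrape a shopping cart that uses a cookie to select between currencies. When I load the site in Chrome and inspect it with the Cookie Inspector extension for Chrome, it shows the following cookies.

[Screenshot: Cookie Inspector output, with two cookies highlighted]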

When I load the same link with cURL, the cookie jar contains:

.example.com    TRUE    /   FALSE   1462357306  SSNC    CCSUBMIT-N
.example.com    TRUE    /   FALSE   1462357306  SSOE    PSORT-Y::CWR-on
.example.com    TRUE    /   FALSE   1464947780  SSLB    1
.example.com    TRUE    /   FALSE   1493891506  SSID_C  CACeuh1GAAAAAAAYxilXjl6BJhjGKVcBAAAAAABEVFFXGMYpVwANyBJPAAP1PQoAGMYpVwEAF04AA6sdCgAYxilXAQAOUAAD7V4KABjGKVcBACNQAAFUYgoAGMYpVwEAbk8AAQBICgAYxilXAQA
.example.com    TRUE    /   FALSE   0   SSSC_C  333.G6280768962372394638.1|19991.662955:20242.671221:20334.673792:20494.679661:20515.680532
.example.com    TRUE    /   FALSE   1493891506  SSRT_C  MsYpVwIBAw
.example.com    TRUE    /   FALSE   0   JSESSIONID  CDZHXpGSHymLMz4v!-751026475
.example.com    TRUE    /   FALSE   3609839127  mapp    0
.example.com    TRUE    /   FALSE   3609839153  dpi 2097201|2|release20160420v10t155721155722
.example.com    TRUE    /   FALSE   3609839153  lpi 2114737|2|release20160420v10t155721155722
.example.com    TRUE    /   FALSE   0   TS0119d048  01efad4706976f70b8f767b422999889abdfa7e7a9a300a247ca3f6dec4997a3ea8a5c9dbe800783f83027f6f389b2fc4134a3806b1de11ca96bf39add105698b8c22f1d300d568ea4395ae6adf29723d2f482180be92caa38977c2da954baebe461814696e5ca8be3f2f7087360909df7e5694ec8f5965475bfd2591cc6c843a2b4aac4752758d5cb2659b390c7632b7047ffdfe2
www.example.com FALSE   /   FALSE   0   TS01472329  01efad4706512021fdee50b1b891941c232f4ef7f5bf2d184606446c9ebf492848a3eab610
.example.com    TRUE    /   FALSE   3609839153  uui 800.606.6969%20/%20212.444.6615|
.example.com    TRUE    /   FALSE   0   ci  NS=Y|CM_MMC=|
.example.com    TRUE    /   FALSE   0   TS01c1e793  01efad47067448a038c37bf93bcdabbce3f89810c9711adfcf2561c8b38484b01c4523479562e5435383034ba6b231a0e3428234fab56386e2af0810f02b7abcf5f2d79d6e
.example.com    TRUE    /   FALSE   3609839153  sessionKey  CDZHXpGSHymLMz4v!-751026475!1462355506133
.example.com    TRUE    /   FALSE   3609839127  cookieID    89789790961462355480485
.example.com    TRUE    /   FALSE   0   dlc NS=Y|CM_MMC=|EMLH=|

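This is clearly missing the cookies highlighted in the screenshot. I also tried deleting all cookies, disabling JavaScript, and reloading the page in the browser, but those two cookies were still present, so they are not created by JavaScript.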
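To make the comparison between the browser and the cURL jar less error-prone than reading the file by eye, the jar can be parsed and checked against a list of expected names. The following is a minimal sketch, assuming the jar is in the Netscape format that cURL writes (domain, subdomain flag, path, secure flag, expiry, name, value). The function names `parseCookieJar` and `missingCookies` are hypothetical helpers, not part of PHP or cURL.

```php
<?php
/**
 * Parse the text of a Netscape-format cookie jar into name => value pairs.
 * Assumes the first six fields contain no whitespace; the value may.
 */
function parseCookieJar(string $jarText): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $jarText) as $line) {
        $line = trim($line);
        // cURL marks HttpOnly cookies with a "#HttpOnly_" prefix; keep those.
        if (strpos($line, '#HttpOnly_') === 0) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        // Split on runs of whitespace into at most 7 fields.
        $fields = preg_split('/\s+/', $line, 7);
        if (count($fields) < 6) {
            continue; // not a cookie line
        }
        $cookies[$fields[5]] = $fields[6] ?? '';
    }
    return $cookies;
}

/** Return the expected cookie names that are absent from the parsed jar. */
function missingCookies(array $expectedNames, array $jarCookies): array
{
    return array_values(array_diff($expectedNames, array_keys($jarCookies)));
}
```

Called as `missingCookies(['JSESSIONID', 'someBrowserOnlyCookie'], parseCookieJar(file_get_contents('cookies.txt')))`, where `someBrowserOnlyCookie` stands in for whichever names the browser shows, this returns the names cURL never received. Note that cURL only writes the jar file when the handle is closed.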

The code I am using:

$URL = "http://www.example.com/";
//ini_set('user_agent', 'Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.0 FirePHP/0.5 ');
//$context = stream_context_create (array ('http' => array ('timeout' => 60)));
$this->ch = curl_init();
$curlHeaders = array(
        'Host: www.example.com',
        'Connection: keep-alive',
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Upgrade-Insecure-Requests: 1',
        'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36',
        'Accept-Encoding: gzip, deflate, sdch',
        'Accept-Language: en-US,en;q=0.8',
        'Cookie: _gat=1'
);


$cookie = 'cookies.txt';

// visit the homepage to set the cookie properly
//$ch = curl_init();

$agent= 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13';
curl_setopt($this->ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($this->ch, CURLOPT_VERBOSE, true);
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($this->ch, CURLOPT_HEADER, false);
curl_setopt($this->ch, CURLOPT_HTTPGET, true);
curl_setopt($this->ch, CURLOPT_USERAGENT, $agent);
curl_setopt($this->ch, CURLOPT_HTTPHEADER, $curlHeaders);
curl_setopt($this->ch, CURLOPT_URL, $URL);
curl_setopt($this->ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($this->ch, CURLOPT_COOKIESESSION, true);
curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, true);

ob_start();      // prevent any output
curl_exec ($this->ch); // execute the curl command
ob_end_clean();  // stop preventing output

// URL that loads when I change the currency from USD to AUD
$ausURL = "http://www.example.com/bnh/controller/home?O=RootPage.jsp&A=SetCurrency&Q=&saveCUR=Y&code=AUD";
curl_setopt($this->ch, CURLOPT_URL, $ausURL);
curl_exec($this->ch);

// Request the product page, sending the currency URL as the referer
$url = "www.example.com/productPage/";
curl_setopt($this->ch, CURLOPT_ENCODING, "gzip");
curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($this->ch, CURLOPT_REFERER, "http://www.example.com/bnh/controller/home?O=RootPage.jsp&A=SetCurrency&Q=&saveCUR=Y&code=AUD");
curl_setopt($this->ch, CURLOPT_URL, $url);
curl_setopt($this->ch, CURLOPT_COOKIEFILE, $cookie);
$buffer = curl_exec($this->ch);
$fh = fopen($this->myFile,'w') or die("can't open file");
fwrite($fh, $buffer." -----------------buffer--------------------");
//fclose($fh);
return $buffer;

The page fetched through cURL still shows prices in USD.

1 Answer

The site you are trying to scrape is protected by Distil (http://www.distilnetworks.com/). It uses various methods to detect scrapers and to prevent price scraping.

Distil injects a hidden script into every page to validate the browser, so the site also needs JavaScript enabled in order to work normally.
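One way to see where a given cookie comes from is to list the names the server sets through `Set-Cookie` response headers; a cookie the browser holds that never appears there must originate elsewhere, for example from a script. The following is a minimal diagnostic sketch, assuming the raw response headers have already been captured as a string (for instance by setting `CURLOPT_HEADER` to `true`, which prepends the headers to the returned body). The function name `setCookieNames` is a hypothetical helper.

```php
<?php
/**
 * Extract the cookie names from every Set-Cookie header in a raw
 * HTTP header block. Header names are matched case-insensitively.
 */
function setCookieNames(string $rawHeaders): array
{
    $names = [];
    foreach (preg_split('/\R/', $rawHeaders) as $line) {
        // Capture the token before the first "=" after "Set-Cookie:".
        if (preg_match('/^Set-Cookie:\s*([^=;\s]+)\s*=/i', $line, $m)) {
            $names[] = $m[1];
        }
    }
    // Drop duplicates, e.g. the same cookie set on several redirect hops.
    return array_values(array_unique($names));
}
```

Comparing this list with the names shown in the browser indicates which cookies arrive over plain HTTP and which do not.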

Answered 2016-05-07T18:44:49.360