I have the following function to get the date of Googlebot's last access:

//get googlebot last access
function googlebot_lastaccess($domain_name)
{
    $request = 'http://webcache.googleusercontent.com/search?hl=en&q=cache:'.$domain_name.'&btnG=Google+Search&meta=';
    $data = getPageData($request);
    $spl=explode("as it appeared on",$data);
   //echo "<pre>".$spl[0]."</pre>";
    $spl2=explode(".<br>",$spl[1]);
    $value=trim($spl2[0]);
   //echo "<pre>".$spl2[0]."</pre>";
    if(strlen($value)==0)
    {
        return(0);
    }
    else
    {
        return($value);
    }      
} 

echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />"; 

function getPageData($url) {
    if (function_exists('curl_init')) {
        $ch = curl_init($url); // initialize curl with the given URL
        curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']); // forward the visitor's user agent
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
        if ((ini_get('open_basedir') == '') && (ini_get('safe_mode') == 'Off')) {
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any
        }
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); // max. seconds to wait while connecting
        curl_setopt($ch, CURLOPT_FAILONERROR, 1); // treat HTTP status codes >= 400 as failures
        return @curl_exec($ch);
    }
    else {
        return @file_get_contents($url);
    }
}

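But this script prints a snapshot of the entire page to the screen, i.e. the whole page as cached by Google. I only want to capture the date and time that follows the words "as it appeared on" and print it, e.g.: 8 Oct 2011 14:03:12 GMT

How can I do that?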

2 Answers

Change this line:

echo "Googlebot last access = ".googlebot_lastaccess($domain_name)."<br />";

to this:

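$content = googlebot_lastaccess($domain_name);
// Keep everything up to and including "GMT"; fall back to the raw value if "GMT" is absent.
$pos = strpos($content, 'GMT');
$date = ($pos === false) ? $content : substr($content, 0, $pos + strlen('GMT'));
echo "Googlebot last access = ".$date."<br />";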
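
As an alternative to cutting the string at "GMT", the date could be matched directly with a regular expression. The following is only a sketch: `extract_cache_date` is a hypothetical helper that is not part of the original code, and the date pattern is an assumption based on the single example given in the question (8 Oct 2011 14:03:12 GMT). The real cache page may format the date differently.

```php
<?php
// Hypothetical helper (not from the original post): pull the timestamp that
// follows "as it appeared on" out of a cached-page HTML string.
function extract_cache_date($html)
{
    // Assumed date format, taken from the question's example:
    // "8 Oct 2011 14:03:12 GMT"
    $pattern = '/as it appeared on\s+(\d{1,2} [A-Za-z]{3} \d{4} \d{2}:\d{2}:\d{2} GMT)/';
    if (is_string($html) && preg_match($pattern, $html, $matches)) {
        return $matches[1];
    }
    return null; // phrase or date not found
}

// Made-up sample text for illustration only.
$sample = 'It is a snapshot of the page as it appeared on 8 Oct 2011 14:03:12 GMT. The current page may differ.';
echo extract_cache_date($sample), "\n";
```

Inside `googlebot_lastaccess`, such a helper would be called on the value returned by `getPageData($request)`. Returning `null` when nothing matches makes a failed fetch easy to tell apart from a real date.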
answered 2011-10-14T09:52:39.090

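Why query Google for the last time Googlebot visited your site when you can detect Googlebot on your site yourself, along with the pages it requests? That also lets you track where Googlebot goes with a simple database write.

See the Stack Overflow question "How to detect search engine bots with php?"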
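
A minimal sketch of that idea, under stated assumptions: `is_googlebot` and `format_visit` are hypothetical helper names, the check is a plain substring match on the User-Agent header (which can be spoofed, so it does not verify the visitor), and a log file named `googlebot.log` stands in for the database write.

```php
<?php
// Hypothetical helper: true when the User-Agent string mentions Googlebot.
// The header can be spoofed, so this is a hint, not proof.
function is_googlebot($userAgent)
{
    return is_string($userAgent) && stripos($userAgent, 'Googlebot') !== false;
}

// Hypothetical helper: build one log line (GMT timestamp, tab, page) for a visit.
function format_visit($page, $timestamp)
{
    return gmdate('j M Y H:i:s', $timestamp) . " GMT\t" . $page;
}

$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (is_googlebot($userAgent)) {
    $page = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
    // Assumed log location; a database insert could replace this line.
    file_put_contents(
        __DIR__ . '/googlebot.log',
        format_visit($page, time()) . "\n",
        FILE_APPEND | LOCK_EX
    );
}
```

Included at the top of every page, this would record each Googlebot request as it happens, and the last line of the log would be the most recent visit.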

answered 2012-11-15T14:27:52.500