We have an e-commerce website. As part of a marketing campaign, first-time visitors see only an app-download promotion (a large banner image) on our home page, and nothing else. Whether a visit is the first one is determined by a cookie.

However, I don't want bots or crawlers to see this promotional image. They should see the real content, which is served once the cookie has been set. The URL is the same for both versions.

I can clarify further if needed. How can I prevent bots from seeing the promotional content?

2 Answers

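You need a robots.txt file.

From Wikipedia:

The robots exclusion standard, also known as the robots exclusion protocol or robots.txt protocol, is a convention for advising cooperating web crawlers and other web robots about accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive websites, or by webmasters to proofread source code. The standard is different from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.

Keep in mind that "evil" bots can simply ignore these directives. As long as you set the file up correctly, though, Google and other search engines should honor it.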
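
As a rough illustration of the format only, a robots.txt file placed at the site root might look like the fragment below. The /app-promo/ path is hypothetical: it assumes the promotion is served from its own URL, whereas in the question both versions share one URL, so treat this as a sketch of the syntax and not a drop-in fix.

```text
# Applies to every crawler that honors robots.txt
User-agent: *
# Hypothetical path for the promotional page
Disallow: /app-promo/
```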

Answered 2014-04-11T22:17:56.790

For now I am using this function in my PHP controller code to detect bots/crawlers and redirect as needed.

function bot_detected()
{
  // A missing or empty User-Agent header is treated as a bot.
  if (empty($_SERVER['HTTP_USER_AGENT'])) {
    return TRUE;
  }
  $user_agent = $_SERVER['HTTP_USER_AGENT'];
  if (
    // Generic crawler keywords.
    preg_match('/bot|crawl|slurp|spider/i', $user_agent)
    ||
    // Specific bots and HTTP clients; alternatives are separated by "|".
    preg_match('/scrapp?y|python|httpclient|Googlebot|DoCoMo|YandexBot|bingbot|ia_archiver|AhrefsBot|Ezooms|GSLFbot|WBSearchBot|Twitterbot|TweetmemeBot|Twikle|PaperLiBot|Wotbox|UnwindFetchor|facebookexternalhit/i', $user_agent)
  ) {
    return TRUE;
  }
  return FALSE;
}
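
To show how the detection result might feed into the home-page logic, here is a minimal sketch. The helper name choose_home_view, the view names 'content' and 'promo', and the cookie name 'visited' are all assumptions made for illustration; they do not come from the original code, and this version picks a view instead of redirecting.

```php
<?php
// Hypothetical helper: decides which home-page view to render.
// Bots and returning visitors get the real content; only first-time
// human visitors (no cookie yet) get the app-download promotion.
function choose_home_view(bool $is_bot, bool $has_visited_cookie): string
{
    if ($is_bot || $has_visited_cookie) {
        return 'content';
    }
    return 'promo';
}

// Possible use in a controller (cookie name 'visited' is an assumption):
// $view = choose_home_view(bot_detected(), isset($_COOKIE['visited']));
```

Keeping the decision in a small pure function makes it easy to check each combination of inputs without faking request headers or cookies.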
Answered 2014-05-11T07:36:47.153