
I've implemented a Single Sign-On feature that redirects the user to another domain and back again. Naturally, I don't want search engines (the ones we care about, at least) to be redirected, so what's an acceptable solution?

Here's one approach I found in PHP:

$agent = strtolower($_SERVER['HTTP_USER_AGENT'] ?? '');

// Keywords commonly found in crawler user-agent strings.
$botKeywords = array('bot', 'slurp', 'crawl', 'google', 'teoma', 'spider', 'feed', 'index');

foreach ($botKeywords as $keyword) {
    // strpos() returns 0 (falsy) when the keyword sits at the very start
    // of the string, so compare strictly against false.
    if (strpos($agent, $keyword) !== false) {
        return null; // looks like a crawler: skip the SSO redirect
    }
}

Maybe the best solution would actually be to detect real users and only redirect them?


2 Answers


It's better to check whether the user-agent string contains a rendering engine token such as Gecko, AppleWebKit, Opera, or Trident, since most crawlers don't include one. That way you will only redirect browsers.
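A minimal sketch of that check in PHP; the engine token list and the SSO URL here are assumptions you would adapt to your setup:

$agent = $_SERVER['HTTP_USER_AGENT'] ?? '';

// Tokens emitted by the major rendering engines; most crawlers omit them.
$engines = array('Gecko', 'AppleWebKit', 'Opera', 'Trident');

$isBrowser = false;
foreach ($engines as $engine) {
    if (stripos($agent, $engine) !== false) {
        $isBrowser = true;
        break;
    }
}

if ($isBrowser) {
    // Only clients that look like real browsers go through the SSO redirect.
    header('Location: https://sso.example.com/login'); // placeholder URL
    exit;
}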

Answered 2013-09-05T06:09:10.563

I've come to the conclusion that this is only an acceptable way of identifying the most trusted, mainstream spiders/crawlers. If a user has any of the above keywords in their user-agent string, they are either a spider or someone pretending to be one.

Of course there will be spiders/crawlers that don't include any of the above in their user-agent string, and this approach won't detect them. If that matters to you, don't use this method; look instead for an alternative, regularly updated solution, perhaps one based on IP address lookups.
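For example, Google documents verifying Googlebot by doing a reverse DNS lookup on the requesting IP and then forward-confirming the result. A rough PHP sketch of that idea, assuming the published googlebot.com / google.com host-name suffixes:

// Returns true only if the IP reverse-resolves to a Google host name
// and that host name resolves back to the same IP.
function isVerifiedGooglebot($ip) {
    $host = gethostbyaddr($ip);
    if ($host === false || !preg_match('/\.(googlebot|google)\.com$/i', $host)) {
        return false;
    }
    return gethostbyname($host) === $ip;
}

// Usage: skip the SSO redirect for verified crawlers.
if (isVerifiedGooglebot($_SERVER['REMOTE_ADDR'])) {
    return null;
}

Note that DNS lookups on every request are slow, so in practice you would cache the result per IP.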

Answered 2012-09-18T08:58:05.183