seo - Google tag

Question

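I want to tell Google not to index certain parts of a page. Yandex (a Russian search engine) has a very useful tag for this called <noindex>. How can I do the same for Google?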

score 26 · Accepted Answer

According to Wikipedia ¹, some spiders honor certain markup:

<!--googleoff: all-->
This should not be indexed by Google. Though its main spider, Googlebot,
might ignore that hint.
<!--googleon: all-->

<div class="robots-nocontent">Yahoo bots won't index this.</div>

<noindex>Yandex bots ignore this text.</noindex>
<!--noindex-->They will ignore this, too.<!--/noindex-->

Unfortunately, they don't seem to agree on a single standard. As far as I know, nothing keeps all spiders out...

The googleoff: comment appears to support several options, though I'm not sure where to find a complete list. At least these exist:

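all: the block is ignored entirely
index: the content does not go into Google's index
anchor: the anchor text of links is not associated with the target page
snippet: the text is not used to create snippets for search results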

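Also note that (at least for Google) this only affects the search index, not page ranking and so on. In addition, as Stephen Ostermiller correctly points out in the comments below, googleon and googleoff only work for the Google Search Appliance and, unfortunately, not for the normal Googlebot.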
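To make the semantics of the googleoff/googleon pair concrete, here is a minimal sketch of how a crawler that honors these comments might drop the marked blocks before indexing. It is purely illustrative and not Google's implementation; the function name stripGoogleOff and the sample markup are assumptions made up for this example.

```javascript
// Illustrative only: mimics a crawler that honors googleoff/googleon comments.
// `option` is one of the values listed above (all, index, anchor, snippet).
function stripGoogleOff(html, option = "all") {
  // Match <!--googleoff: OPTION--> ... <!--googleon: OPTION-->, non-greedy,
  // tolerating extra whitespace inside the comments.
  const pattern = new RegExp(
    "<!--\\s*googleoff:\\s*" + option + "\\s*-->" +
      "[\\s\\S]*?" +
      "<!--\\s*googleon:\\s*" + option + "\\s*-->",
    "g"
  );
  return html.replace(pattern, "");
}

const page =
  "<p>Indexed text.</p>" +
  "<!--googleoff: all--><p>Navigation, ads, etc.</p><!--googleon: all-->" +
  "<p>More indexed text.</p>";

console.log(stripGoogleOff(page));
// <p>Indexed text.</p><p>More indexed text.</p>
```

A block is only removed when both comments carry the same option, so calling stripGoogleOff(page, "index") on this sample leaves it unchanged.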

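There is also an article about the Yahoo part ² (and an article describing that Yandex also honors <noindex> ⁶). On the googleoff: part, see also this answer, as well as the article I took most of the related information from. ³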

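In addition, Google Webmaster Tools recommends using the rel=nofollow attribute for specific links ⁴ (for example, ads or links to pages that are inaccessible or useless to bots, such as login/sign-up). This means Google's bots should honor the HTML a rel attribute, although that mostly relates to page ranking rather than to the search index itself. Unfortunately, there doesn't seem to be a rel=noindex ⁵ ⁷. I'm also not sure whether this attribute could be used on other elements (e.g. <DIV REL="noindex">); but unless crawlers honor "noindex", that would be pointless anyway.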
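As a small illustration of the rel=nofollow markup mentioned above, the following sketch builds such a link as a string. The helper names escapeHtml and nofollowLink and the /login URL are hypothetical and chosen only for this example.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an anchor that asks crawlers not to follow the link.
function nofollowLink(href, text) {
  return '<a href="' + escapeHtml(href) + '" rel="nofollow">' + escapeHtml(text) + "</a>";
}

console.log(nofollowLink("/login", "Log in"));
// <a href="/login" rel="nofollow">Log in</a>
```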

Further references:

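How do you noindex certain sections of a web page?
Excluding the crawler from parts of a page (the Spiderline crawler; as you can see, other crawlers may use other proprietary markup, see also the AddSearch crawler. I wish they would simply make REL="noindex" a standard that can be used with any HTML tag such as DIV/SPAN/P/A!)
Preventing Google from indexing the contents of a div by reversing the string
Ways to prevent search engines from indexing irrelevant content on a page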

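¹ Wikipedia: Noindex
² Which sections of your web pages might be ignored by search engines?
³ Tell Google to not index certain parts of your page
⁴ Use rel="nofollow" for specific links
⁵ Is it a good idea to use <a href="http://name.com" rel="noindex, nofollow">name</a>?
⁶ Using HTML tags - Yandex.Help. Webmaster
⁷ Existing REL values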

score 8 · Accepted Answer

You can prevent Google from seeing parts of a page by putting those parts into iframes that are blocked by robots.txt.

robots.txt

User-agent: *
Disallow: /iframes/

index.html

This text is crawlable, but now you'll see 
text that search engines can't see:
<iframe src="/iframes/hidden.html" width="100%" height=300 scrolling=no></iframe>

/iframes/hidden.html

Search engines cannot see this text.
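To sanity-check which URLs such a rule covers, a simplified matcher like the one below can help. It is only a sketch: it assumes the Disallow rule sits in a User-agent: * group, handles plain path prefixes only (no wildcards, no Allow lines, no "most specific group wins" logic), and the function name isDisallowed is made up for this example. Real crawlers implement the full robots.txt rules.

```javascript
// Simplified robots.txt check: only User-agent and Disallow lines,
// matched as plain path prefixes.
function isDisallowed(robotsTxt, userAgent, path) {
  const agent = userAgent.toLowerCase();
  let groupApplies = false; // does the current group match our crawler?
  let inAgentLines = false; // still reading the group's User-agent lines?
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      const matches =
        value === "*" || (value !== "" && agent.includes(value.toLowerCase()));
      // Consecutive User-agent lines belong to the same group.
      groupApplies = inAgentLines ? groupApplies || matches : matches;
      inAgentLines = true;
    } else {
      inAgentLines = false;
      if (field === "disallow" && groupApplies && value !== "" && path.startsWith(value)) {
        return true;
      }
    }
  }
  return false;
}

const robotsTxt = "User-agent: *\nDisallow: /iframes/\n";
console.log(isDisallowed(robotsTxt, "Googlebot", "/iframes/hidden.html")); // true
console.log(isDisallowed(robotsTxt, "Googlebot", "/index.html"));          // false
```

The page itself (/index.html) stays crawlable, while everything under /iframes/ is off limits to crawlers that respect robots.txt.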

Instead of using iframes, you could load the contents of the hidden file using AJAX. Here is an example that does so using jQuery ajax:

This text is crawlable, but now you'll see 
text that search engines can't see:
<div id="hidden"></div>
<script>
    // Fetch the blocked fragment and inject it into the placeholder div.
    $.get(
        "/iframes/hidden.html",
        function(data){ $('#hidden').html(data); }
    );
</script>
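The same idea can be sketched without jQuery, using the browser's fetch API. The helper name loadHidden and its fetchImpl parameter (which only exists so the function can be exercised outside a browser) are assumptions for this example; the approach still depends on /iframes/ being disallowed in robots.txt.

```javascript
// Load an HTML fragment and inject it into `target` (any object with an
// innerHTML property, normally a DOM element).
function loadHidden(url, target, fetchImpl = fetch) {
  return fetchImpl(url)
    .then((response) => {
      if (!response.ok) throw new Error("HTTP " + response.status);
      return response.text();
    })
    .then((html) => {
      target.innerHTML = html;
    });
}

// In the browser:
// loadHidden("/iframes/hidden.html", document.getElementById("hidden"));
```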

score 3 · Accepted Answer

No, Google does not support the <noindex> tag. Hardly anyone does.

answered 2013-03-28T15:03:31.993

score -5 · Accepted Answer

Create a robots.txt file at the root level and put something like the following in it:

Block Google:

User-agent: Googlebot
Disallow: /myDisallowedDir1/
Disallow: /myDisallowedPage.html
Disallow: /myDisallowedDir2/

Block all bots:

User-agent: *
Disallow: /myDisallowedDir1/
Disallow: /myDisallowedPage.html
Disallow: /myDisallowedDir2/
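Groups like the two above follow a simple pattern: one User-agent line followed by one Disallow line per path. A minimal sketch that generates such a group from a list of paths is shown below; the function name buildRobotsGroup is hypothetical. Keep in mind that these rules block whole pages or directories, not parts of a page.

```javascript
// Build one robots.txt group: a User-agent line followed by Disallow rules.
function buildRobotsGroup(userAgent, disallowedPaths) {
  const lines = ["User-agent: " + userAgent];
  for (const path of disallowedPaths) {
    lines.push("Disallow: " + path);
  }
  return lines.join("\n") + "\n";
}

const paths = ["/myDisallowedDir1/", "/myDisallowedPage.html", "/myDisallowedDir2/"];
console.log(buildRobotsGroup("Googlebot", paths));
```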

A handy robots.txt generator:

http://www.mcanerin.com/EN/search-engine/robots-txt.asp
