amazon-s3 - Google won't read my robots.txt on S3

Question

Since Google is crawling our static content (stored on S3), we created a robots.txt in the root directory of the bucket, as follows:

User-agent: *
Disallow: /

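Now, because we want to remove the existing content from Google search, we added the bucket to our Google Webmaster Tools account and requested removal of the directory.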

The problem is that Google won't recognize or read the robots.txt, which we need in order to keep Google blocked in the future. We get this message in Webmaster Tools:

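This site is not located at the top level of the domain. A robots.txt file is only valid when it is located in the highest-level directory, and it applies to all directories within the domain. The robots.txt file that applies to your site (if one exists) is located at http://s3.amazonaws.com/robots.txt. This page provides information about that file.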

score 11 · Accepted Answer

Which URL did you give Google for your bucket? You need to use the DNS-style {bucket}.s3.amazonaws.com, instead of the path-style s3.amazonaws.com/{bucket}.
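To illustrate why the URL style matters, here is a minimal Python sketch. It assumes a crawler looks for robots.txt at the root of the host it is crawling, which matches the Webmaster Tools message quoted in the question. The bucket name `example-bucket`, the object path, and the helper `robots_txt_url` are hypothetical and only there for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the host-root robots.txt URL that would govern page_url."""
    parts = urlsplit(page_url)
    # robots.txt is looked up at the root of the host, never inside a sub-path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


BUCKET = "example-bucket"  # hypothetical bucket name

# Path-style: the bucket is just a directory on the shared host s3.amazonaws.com.
path_style = f"http://s3.amazonaws.com/{BUCKET}/images/logo.png"
# DNS-style: the bucket is its own host, so its root is the bucket root.
dns_style = f"http://{BUCKET}.s3.amazonaws.com/images/logo.png"

print(robots_txt_url(path_style))  # http://s3.amazonaws.com/robots.txt
print(robots_txt_url(dns_style))   # http://example-bucket.s3.amazonaws.com/robots.txt
```

With the path-style URL, the robots.txt stored in the bucket is never the one consulted, because it sits below the host root. With the DNS-style URL, the bucket root and the host root coincide.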

score -1 · Answer

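I get an access denied error when I try to view your robots.txt. Are you sure Google can see your robots file?

Also, you can check your robots.txt live on Google and confirm exactly what Google sees when it looks at your robots.txt (if it can see it at all).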
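As a rough local check of what the rules in the question mean, the following sketch parses them with Python's standard `urllib.robotparser`. This only approximates how a crawler reads the file and is not Google's own parser; the bucket URL and the helper `is_allowed` are hypothetical. It also does not test whether the file is publicly readable, which is the access denied issue raised above.

```python
from urllib.robotparser import RobotFileParser

# The rules from the question: block every crawler from every path.
RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical object URL in DNS-style form.
URL = "http://example-bucket.s3.amazonaws.com/images/logo.png"

print(is_allowed(RULES, "Googlebot", URL))  # False: everything is disallowed
```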
