I am trying to serve a robots.txt using the Perl Dancer web framework. I thought having a route that just returned the text would work, but it seems to be wrapping it in html and body tags. I'm assuming this won't be interpreted properly as a robots.txt file by crawlers.

Any idea how to do this properly?

Here is how I have the route written:

get '/robots.txt' => sub { return "User-agent: *\nDisallow: /"; };

Thanks in advance!

3 Answers

What makes you think it is being wrapped in HTML and BODY elements?

use Dancer;

get '/robots.txt' => sub {
   return "User-agent: *\nDisallow: /\n";
};

dance;

Output:

>lwp-request -e http://127.0.0.1:3000/robots.txt
200 OK
Server: Perl Dancer 1.3112
Content-Length: 26
Content-Type: text/html
Client-Date: Mon, 29 Apr 2013 05:05:32 GMT
Client-Peer: 127.0.0.1:3000
Client-Response-Num: 1
X-Powered-By: Perl Dancer 1.3112

User-agent: *
Disallow: /

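I bet you are viewing it with a client whose renderer adds those elements when it sees text/html. Setting the content type to text/plain would be more appropriate, and the file will also look better in the renderer you are using to view it.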

get '/robots.txt' => sub {
   content_type 'text/plain';
   return "User-agent: *\nDisallow: /\n";
};

Ultimately, though, it should not make any difference.

Answered 2013-04-29T05:06:31.747

The other option for sending robots.txt would be to not define a route for it and instead put an actual robots.txt file into the public/ subdirectory under your main Dancer app directory. Dancer will then serve it automatically as a regular file without passing it through the route handlers, templates, etc.
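As a minimal sketch of that approach, the commands below create the static file. They assume the current directory is the top-level directory of the Dancer app and reuse the rules from the question; adjust the path if your app is configured with a different public directory.

```shell
# Assumption: run from the top-level directory of the Dancer app.
# Create the public/ directory if it does not exist yet.
mkdir -p public

# Write the same rules the route in the question returned,
# one directive per line, ending with a newline.
printf 'User-agent: *\nDisallow: /\n' > public/robots.txt

# Show the result to confirm the contents.
cat public/robots.txt
```

With the file in place, the '/robots.txt' route is no longer needed and can be removed from the app.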

Answered 2013-04-29T08:30:17.283

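You are sending the response as text/html (the default). As part of its normal HTML parsing, the browser inserts the missing elements: you are looking at a representation of the live DOM, not the source that was sent.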

Set the correct Content-Type header.

get '/robots.txt' => sub {
  content_type "text/plain";
  return "User-agent: *\nDisallow: /";
};
Answered 2013-04-29T05:11:23.503