
My frustration level is maxing out over crawling DokuWiki sites.

I have a content source using FAST Search for SharePoint that I have set up to crawl a dokuwiki/doku.php site. My crawl rule is set to http://servername/*, match case, include all items in this path, and crawl complex URLs. Testing the content source against the crawl rules shows that it will be crawled by the crawler.

However, the crawl always lasts under 2 minutes and completes having crawled only the page I pointed it at and no other link on that page. I have checked with the DokuWiki admin, and he has robots.txt set to allow. When I look at the source of the pages, I see meta name="robots" content="index,follow".
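
To sanity-check what the crawler actually sees, one quick diagnostic is to fetch the start page and list the links on it, flagging any robots meta or rel="nofollow" markers that would stop a crawler from following them. A minimal PHP sketch (the start URL is a placeholder for your own server):

    <?php
    // Fetch the start page the way a crawler would and inspect it.
    // Replace $startUrl with the page your content source points at.
    $startUrl = 'http://servername/dokuwiki/doku.php';
    $html = file_get_contents($startUrl);

    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate real-world HTML
    $doc->loadHTML($html);

    // Report the robots meta tag the crawler will obey.
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower($meta->getAttribute('name')) === 'robots') {
            echo 'robots meta: ' . $meta->getAttribute('content') . "\n";
        }
    }

    // List every link and flag the ones a crawler would skip.
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        $flag = (stripos($a->getAttribute('rel'), 'nofollow') !== false)
            ? '  <-- nofollow' : '';
        echo $href . $flag . "\n";
    }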

So, in order to test whether the other linked pages were the problem, I added those links to the content source manually and recrawled. For example, the source page has three links:

  • site A
  • site B
  • site C

I added the Site A, B, and C URLs to the content source. The results of this crawl are 4 successes: the primary source page plus the links A, B, and C I manually added.

So my question is: why won't the crawler crawl the links on the page? Is this something I need to fix with the crawler on my end, or does it have to do with how namespaces are defined and links are constructed in DokuWiki?

Any help would be appreciated

Eric


2 Answers


This problem turned out to be about authentication, even though nothing reported in the FAST crawl logs indicated it was authentication. The fix was to add a $freepass setting for the search index server's IP address, so that Apache does not run the authentication process for every page hit.
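
For reference, $freepass is not a stock DokuWiki or Apache directive; it belongs to this site's custom authentication layer, so the exact shape will vary. A minimal sketch of the idea, assuming $freepass is an array of exempt IP addresses checked before the normal auth flow (the function name below is a hypothetical stand-in):

    <?php
    // Assumed shape: a list of IPs that bypass authentication entirely.
    $freepass = array('10.0.0.25'); // IP of the FAST search indexing server

    if (!in_array($_SERVER['REMOTE_ADDR'], $freepass, true)) {
        // Only non-exempt clients go through the usual auth process,
        // so the crawler is no longer challenged on every page hit.
        require_authentication(); // hypothetical stand-in for the real auth call
    }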

Thanks for the replies

Eric

answered 2011-07-14T18:46:28.453

Have you disabled the delayed indexing option and the rel=nofollow option?
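
Both of those live in DokuWiki's conf/local.php. If memory serves, the keys are 'indexdelay' and 'relnofollow' (worth confirming in Admin → Configuration Settings before relying on them):

    <?php
    // conf/local.php -- overrides the defaults in conf/dokuwiki.php.
    // Key names assumed from memory; verify in the config manager.
    $conf['indexdelay']  = 0; // no delay before new pages may be indexed
    $conf['relnofollow'] = 0; // stop adding rel="nofollow" to external links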

answered 2011-07-12T12:27:50.963