下一句在 Wget 的手册中引起了我的注意
wget --spider --force-html -i bookmarks.html
This feature needs much more work for Wget to get close to the functionality of real web spiders.
我发现以下与 wget 中的蜘蛛选项相关的代码行。
src/ftp.c
780: /* If we're in spider mode, don't really retrieve anything. The
784: if (opt.spider)
889: if (!(cmd & (DO_LIST | DO_RETR)) || (opt.spider && !(cmd & DO_LIST)))
1227: if (!opt.spider)
1239: if (!opt.spider)
1268: else if (!opt.spider)
1827: if (opt.htmlify && !opt.spider)
src/http.c
64:#include "spider.h"
2405: /* Skip preliminary HEAD request if we're not in spider mode AND
2407: if (!opt.spider
2428: if (opt.spider && !got_head)
2456: /* Default document type is empty. However, if spider mode is
2570: * spider mode. */
2571: else if (opt.spider)
2661: if (opt.spider)
src/res.c
543: int saved_sp_val = opt.spider;
548: opt.spider = false;
551: opt.spider = saved_sp_val;
src/spider.c
1:/* Keep track of visited URLs in spider mode.
37:#include "spider.h"
49:spider_cleanup (void)
src/spider.h
1:/* Declarations for spider.c
src/recur.c
52:#include "spider.h"
279: if (opt.spider)
366: || opt.spider /* opt.recursive is implicitely true */
370: (otherwise unneeded because of --spider or rejected by -R)
375: (opt.spider ? "--spider" :
378: (opt.delete_after || opt.spider
440: if (opt.spider)
src/options.h
62: bool spider; /* Is Wget in spider mode? */
src/init.c
238: { "spider", &opt.spider, cmd_boolean },
src/main.c
56:#include "spider.h"
238: { "spider", 0, OPT_BOOLEAN, "spider", -1 },
435: --spider don't download anything.\n"),
1045: if (opt.recursive && opt.spider)
我想看看代码的差异,而不是抽象的。我喜欢代码示例。
网络蜘蛛在代码中与 Wget 的蜘蛛有何不同?