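How can we stop subdomains from being injected into the stream? Right now, if we inject www.ebay.com, the stream also ends up with pages from its subdomains: my.ebay.com, community.ebay.com, ...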
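You can configure the HostURLFilter to exclude URLs that are outside the seed's hostname by setting ignoreOutsideHost to true in urlfilters.json. The host check is the one that drops subdomains; ignoreOutsideDomain compares only the domain, so on its own it would still let my.ebay.com through.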
{
  "class": "com.digitalpebble.stormcrawler.filtering.host.HostURLFilter",
  "name": "HostURLFilter",
  "params": {
    "ignoreOutsideHost": true,
    "ignoreOutsideDomain": true
  }
}
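
To make the effect of ignoreOutsideHost more concrete, here is a minimal sketch of the kind of comparison a host-based filter performs. This is not StormCrawler's actual code: the class name HostFilterSketch and the method keep are hypothetical, and the sketch assumes the filter simply compares the hostname of the page a link was found on with the hostname of the link itself.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration only; not part of StormCrawler.
public class HostFilterSketch {

    // Returns true when the outlink has the same hostname as the source page.
    // Unparsable URLs, or URLs without a host, are rejected.
    static boolean keep(String sourceUrl, String targetUrl) {
        try {
            String sourceHost = new URI(sourceUrl).getHost();
            String targetHost = new URI(targetUrl).getHost();
            if (sourceHost == null || targetHost == null) {
                return false;
            }
            // Hostnames are case-insensitive.
            return sourceHost.equalsIgnoreCase(targetHost);
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String seed = "https://www.ebay.com/";
        System.out.println(keep(seed, "https://www.ebay.com/deals"));   // true
        System.out.println(keep(seed, "https://my.ebay.com/"));         // false
        System.out.println(keep(seed, "https://community.ebay.com/"));  // false
    }
}
```

With a check like this, links found on www.ebay.com that point to my.ebay.com or community.ebay.com are discarded, which is the behaviour the question asks for.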