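I want to run Nutch on my Windows 7 x64 machine. I have Nutch versions 1.5.1 and 2 from apache.spinellicreations.com/nutch/.
I followed the tutorial at wiki.apache.org/nutch/NutchTutorial, but I got stuck at step 2 and could not verify the installation. The other steps are hard to understand...
What are the steps to install and use Nutch?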
Follow these steps to install Nutch on Windows:
1) Download and install Cygwin from: https://www.cygwin.com/
2) Download Nutch from: http://nutch.apache.org/downloads.html
3) Extract the download and copy the extracted folder into C:\cygwin64\home\
4) Rename the folder to apache-nutch
5) Open a Cygwin terminal and run the following commands:
- $ cd C:
- $ cd cygwin64
- $ cd home
- $ cd apache-nutch
- $ cd src/bin
- $ ./nutch
You should see the following output:
Usage: nutch COMMAND
where COMMAND is one of:
 inject         inject new urls into the database
 hostinject     creates or updates an existing host table from a text file
 generate       generate new batches to fetch from crawl db
 fetch          fetch URLs marked during generate
 parse          parse URLs marked during fetch
 updatedb       update web table after parsing
 updatehostdb   update host table after parsing
 readdb         read/dump records from page database
 readhostdb     display entries from the hostDB
 index          run the plugin-based indexer on parsed batches
 elasticindex   run the elasticsearch indexer - DEPRECATED use the index command instead
 solrindex      run the solr indexer on parsed batches - DEPRECATED use the index command instead
 solrdedup      remove duplicates from solr
 solrclean      remove HTTP 301 and 404 documents from solr - DEPRECATED use the clean command instead
 clean          remove HTTP 301 and 404 documents and duplicates from indexing backends configured via plugins
 parsechecker   check the parser for a given url
 indexchecker   check the indexing filters for a given url
 plugin         load a plugin and run one of its classes main()
 nutchserver    run a (local) Nutch server on a user defined port
 webapp         run a local Nutch web application
 junit          runs the given JUnit test
 or
 CLASSNAME      run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
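The commands above assume the launcher script sits in src/bin. If your download is laid out differently, a small lookup like the one below may save some guessing. This is only a sketch: find_nutch_script is a hypothetical helper name, and the extra bin/ and runtime/local/bin/ locations are assumptions about other Nutch packages, not something this answer has verified.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path of the first nutch launcher script
# found under the given Nutch directory, or return non-zero if none exists.
# Only src/bin comes from the steps above; the other two paths are assumptions.
find_nutch_script() {
  local nutch_home="$1"
  local candidate
  for candidate in "$nutch_home/bin/nutch" \
                   "$nutch_home/runtime/local/bin/nutch" \
                   "$nutch_home/src/bin/nutch"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example usage inside a Cygwin terminal (the directory name comes from step 4):
#   script=$(find_nutch_script /home/apache-nutch) && bash "$script"
```

Running the script with no arguments should print the usage text shown above.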
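You did not mess up step 2. My guess is that you simply have not installed Cygwin, so you cannot run the bash scripts. Install Cygwin (the easiest option), or you could try porting the bash scripts to Windows cmd files. (If you do that, you will probably discover other dependencies.)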
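One quick way to test that guess is to check which tools are actually on the PATH. The sketch below is illustrative only: missing_commands is a hypothetical helper, and checking for bash and java is an assumption about the minimum Nutch needs (the nutch launcher is a bash script and Nutch itself runs on Java).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that cannot be found on PATH.
# Prints nothing when every command is available.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Example usage:
#   missing_commands bash java
```

If bash shows up in the output, the launcher cannot run from that terminal, which would match the symptom described in the question.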
Hope this helps.