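This is a silly question, but someone has to ask it.
I have tried running Mahout locally, and that works well. Now I want the work to be executed by a remote cluster rather than by my local machine.
So, should I deploy the Mahout code on the Hadoop machines, or can I still run Mahout from my local machine and have it use the remote Hadoop cluster?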
No, you don't install Hadoop programs on the Hadoop workers yourself. That would be a nightmare to maintain. Hadoop distributes the code for you when you give it a JAR file containing all of the code via the hadoop jar command.
What runs on your local machine, when you run Mahout or anything else Hadoop-based, is a client program that uses Hadoop code to submit the job to a cluster. That cluster might be local or remote; it makes no difference to how you run the client, only to what the client talks to.
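To illustrate what "what the client talks to" could look like in practice, here is a minimal sketch of client-side configuration, assuming a Hadoop 2 or later cluster running YARN. The host names and the port are hypothetical placeholders, and your cluster's administrator may already supply these files. On Hadoop 1.x the corresponding property names are fs.default.name and mapred.job.tracker instead.

```xml
<!-- core-site.xml on the client machine: which file system the client uses.
     The host name and port below are hypothetical; use your cluster's NameNode address. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

<!-- yarn-site.xml (a separate file) on the client machine: where jobs are submitted.
     The host name is again a hypothetical placeholder. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager.example.com</value>
  </property>
</configuration>
```

With HADOOP_CONF_DIR pointing at the directory that holds these files, the same hadoop jar invocation that ran locally should submit the job to the remote cluster instead. Note that Mahout's launcher script runs locally whenever the MAHOUT_LOCAL environment variable is set to a non-empty value, so leave it unset if you want the job to go to the cluster.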