
I want to run the Naive Bayes classifier in Mahout for a classification problem.
I have searched everywhere for how to format my input and how to pass that input to Mahout, but have not found any useful information.

The only page that was even remotely useful was
"What are the steps needed to use Mahout Native Bayes Classifier Algorithm?"

But even there, the author of the answer seems to have used a custom script called tt to parse the input.

If anyone knows how to supply input to Mahout algorithms, please help.

1 Answer

I found the following site: http://chimpler.wordpress.com/2013/03/13/using-the-mahout-naive-bayes-classifier-to-automatically-classify-twitter-messages/

Apparently Mahout itself also provides some help with formatting. You can pass the mahout binary a subcommand such as seq2encoded, seq2sparse, or seqdirectory. I don't know many details about how they are used. This site has more: https://cwiki.apache.org/confluence/display/MAHOUT/Quick+tour+of+text+analysis+using+the+Mahout+command+line
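For what it's worth, here is a minimal sketch of one way to prepare text for seqdirectory, assuming the layout used by Mahout's 20 Newsgroups example: one subdirectory per category, one plain-text file per document. As far as I know, seqdirectory then produces keys like /spam/0.txt, and the Naive Bayes trainer can take the label from the directory part of that key. The function name write_corpus, the toy records, and the directory names are all hypothetical, and the commands in the trailing comments show only the -i/-o options; check `mahout <command> --help` for your version.

```python
import tempfile
from pathlib import Path


def write_corpus(records, root):
    """Write (label, text) pairs as root/<label>/<n>.txt, one file per document.

    Returns a dict mapping each label to the number of files written for it.
    """
    root = Path(root)
    counts = {}
    for label, text in records:
        label_dir = root / label
        label_dir.mkdir(parents=True, exist_ok=True)
        n = counts.get(label, 0)
        (label_dir / f"{n}.txt").write_text(text, encoding="utf-8")
        counts[label] = n + 1
    return counts


# Hypothetical toy data: (category, document text)
records = [
    ("spam", "win money now"),
    ("ham", "meeting at noon"),
    ("spam", "cheap pills"),
]

corpus_dir = Path(tempfile.mkdtemp()) / "corpus"
counts = write_corpus(records, corpus_dir)

# Assumed next steps on the command line (paths are placeholders; when
# Mahout runs on Hadoop, the corpus has to be copied to HDFS first):
#   mahout seqdirectory -i corpus -o corpus-seq
#   mahout seq2sparse  -i corpus-seq -o corpus-vectors
```

If your data is not one-file-per-document (tweets in a TSV file, say), the first link above takes a different route and writes the sequence file directly instead of going through seqdirectory.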

I don't think this will get you all the way there, but hopefully it helps.

EDIT1: https://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors

EDIT2: http://www.datastax.com/dev/blog/apache-mahout-in-datastax-enterprise-building-a-classification-system

Answered 2013-09-24T19:16:24.537