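I'm new to text classification and would like to do it with WEKA. Do I have to build a supervised training set like the ARFF file below? And do I have to label it by hand? After that, what should I do next: train a Naive Bayes classifier and use it to predict the categories of a test set?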
@relation test
@attribute text String
@attribute politics {yes,no}
@attribute religion {yes,no}
@attribute another_category {yes,no}
@data
"this is a text about politics",yes,no,no
"this text is about religion",no,yes,no
"this text mixes everything",yes,yes,yes
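
In case it helps to show what I mean by building the file, here is a rough sketch of how I imagine generating it from already-labeled texts instead of typing every row by hand. It is plain Python and does not use WEKA at all; the function name `to_arff`, the `examples` structure, and the backslash escaping of quotes are my own assumptions, not anything taken from the WEKA documentation.

```python
def to_arff(relation, categories, examples):
    """Build ARFF text from (text, set_of_labels) pairs.

    Hypothetical helper: one yes/no attribute is emitted per category,
    mirroring the layout of the hand-written file above.
    """
    lines = [f"@relation {relation}", "@attribute text String"]
    for category in categories:
        lines.append(f"@attribute {category} {{yes,no}}")
    lines.append("@data")
    for text, labels in examples:
        # Assumption: backslashes and double quotes inside the text
        # are escaped with a backslash.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        flags = ",".join("yes" if c in labels else "no" for c in categories)
        lines.append(f'"{escaped}",{flags}')
    return "\n".join(lines) + "\n"


categories = ["politics", "religion", "another_category"]
examples = [
    ("this is a text about politics", {"politics"}),
    ("this text is about religion", {"religion"}),
    ("this text mixes everything", set(categories)),
]

arff_text = to_arff("test", categories, examples)
print(arff_text)
```

The labels themselves would still have to come from somewhere (in my case, manual annotation); the script only saves me from writing the ARFF syntax by hand.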