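I am developing a prototype for my research. I have a trained model that I want to test against my test set.
I am writing the program in C#. Here is the code: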
Classifier cls = null;
cls = (Classifier)weka.core.SerializationHelper.read("Multinomial.model");
//sample arff file
//@relation emotion_class
//@attribute Text string
//@attribute class_list {Happy,Fear,Anger,Sad,Neutral}
//@data
//" ang saya saya #PlanPhilippines team interviewing #Bopha survivors Mindanao. ",
StringBuilder buffer = new StringBuilder(sample);
BufferedReader reader = new BufferedReader(new java.io.StringReader(buffer.ToString()));
weka.core.converters.ArffLoader.ArffReader arff = new weka.core.converters.ArffLoader.ArffReader(reader);
Instances dataRaw = arff.getData();
//convert the tweet strings to word vectors
StringToWordVector filter = new StringToWordVector();
filter.setInputFormat(dataRaw);
Instances dataFiltered = Filter.useFilter(dataRaw, filter);
dataFiltered.setClassIndex(dataFiltered.numAttributes() - 1);
for (int i = 0; i < dataFiltered.numInstances(); i++)
{
    Instance inst = dataFiltered.instance(i);
    double classified = cls.classifyInstance(inst);
    MessageBox.Show(dataFiltered.classAttribute().value((int)classified));
}
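I get an error: "Can't normalize array. Sum is NaN." It is thrown at the line double classified = cls.classifyInstance(inst);
When I remove the StringToWordVector filter there is no error, but the output is wrong: it always gives 1, or Happy.
Thanks!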