I want to use Mulan to classify some data, but I get an exception:

mulan.data.DataLoadException: Error creating Instances data from supplied Reader data source
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:469)
at mulan.data.MultiLabelInstances.loadInstances(MultiLabelInstances.java:458)
at mulan.data.MultiLabelInstances.<init>(MultiLabelInstances.java:168)

The main function comes from mulan.examples.TrainTestExperiment:

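import mulan.classifier.transformation.BinaryRelevance;
import mulan.data.MultiLabelInstances;
import mulan.evaluation.Evaluation;
import mulan.evaluation.Evaluator;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.Utils;
import weka.filters.Filter;
import weka.filters.unsupervised.instance.RemovePercentage;

public class TrainTestExperiment {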

    public static void main(String[] args) {
        try {
            String path = Utils.getOption("path", args); // e.g. -path dataset/
            String filestem = Utils.getOption("filestem", args); // e.g. -filestem emotions
            String percentage = Utils.getOption("percentage", args); // e.g. -percentage 50 (for 50%)

            System.out.println("Loading the dataset");
            MultiLabelInstances mlDataSet = new MultiLabelInstances(path + filestem + ".arff", path + filestem + ".xml");

            // split the data set into train and test
            Instances dataSet = mlDataSet.getDataSet();
            RemovePercentage rmvp = new RemovePercentage();
            rmvp.setInvertSelection(true);
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances trainDataSet = Filter.useFilter(dataSet, rmvp);

            rmvp = new RemovePercentage();
            rmvp.setPercentage(Double.parseDouble(percentage));
            rmvp.setInputFormat(dataSet);
            Instances testDataSet = Filter.useFilter(dataSet, rmvp);

            MultiLabelInstances train = new MultiLabelInstances(trainDataSet, path + filestem + ".xml");
            MultiLabelInstances test = new MultiLabelInstances(testDataSet, path + filestem + ".xml");

            Evaluator eval = new Evaluator();
            Evaluation results;

            Classifier brClassifier = new NaiveBayes();
            BinaryRelevance br = new BinaryRelevance(brClassifier);
            br.setDebug(true);
            br.build(train);
            results = eval.evaluate(br, test);
            System.out.println(results);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
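
The trace above stops at the Mulan frames, so it does not show what actually failed while the ARFF file was being read. Assuming the DataLoadException wraps an underlying exception (not confirmed here), a small helper like the one below could list every nested cause. CauseChain is a hypothetical name for illustration; it is not part of Mulan or Weka and uses only the standard Throwable.getCause() method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical debugging helper; not part of Mulan or Weka.
final class CauseChain {

    // Collects the exception and each nested cause, outermost first.
    static List<Throwable> of(Throwable t) {
        List<Throwable> chain = new ArrayList<>();
        // Identity-based set guards against a cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        while (t != null && seen.add(t)) {
            chain.add(t);
            t = t.getCause();
        }
        return chain;
    }

    // Formats one line per throwable: class name, then message.
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder();
        for (Throwable c : of(t)) {
            sb.append(c.getClass().getName())
              .append(": ")
              .append(c.getMessage())
              .append('\n');
        }
        return sb.toString();
    }
}
```

In the catch block, calling System.err.print(CauseChain.describe(e)) next to e.printStackTrace() would print one line per nested exception, which may make a parse or I/O error easier to spot.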

As for the data format, I have one dimension called title, which has 160 categories.

The data file is in ARFF format.

Some of the text is in Chinese.
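
Because the file contains Chinese text, its encoding may be worth ruling out. Whether encoding has anything to do with the exception is not established here; this is only a sanity check. The sketch below assumes the file is meant to be UTF-8 and uses only the standard java.nio classes. Utf8Check is a hypothetical name, not a Mulan or Weka class.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes the ARFF file is supposed to be UTF-8.
final class Utf8Check {

    // Returns true if the bytes decode as UTF-8 without any error.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of substituting characters.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Convenience overload that reads the whole file into memory first.
    static boolean isValidUtf8(Path file) throws IOException {
        return isValidUtf8(Files.readAllBytes(file));
    }
}
```

For example, Utf8Check.isValidUtf8(Paths.get(path + filestem + ".arff")) returning false would suggest the file was saved in another encoding, such as GBK.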

Any help is appreciated.

Regards

1 Answer

This looks like a bug in Mulan.

See here for more details about the bug.

answered 2012-12-30T07:47:43.320