matlab - Poor results when testing libsvm in MATLAB

Question

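Can anyone help me with this problem? I want to check whether the classification works well, so I tried setting the test data equal to the training data. If the classification is good, it should give 100% accuracy (acc). This is the code I found on this site: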

data= [170           66           ;
160            50           ;
170            63           ;
173            61           ;
168            58           ;
184            88           ;
189            94           ;
185            88           ]

labels=[-1;-1;-1;-1;-1;1;1;1];

numInst = size(data,1);
numLabels = max(labels);

testVal = [1 2 3 4 5 6 7 8];
trainLabel = labels(testVal,:);
trainData = data(testVal,:);
testData = data(testVal,:);
testLabel = labels(testVal,:);
numTrain = 8; numTest = 8;

%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -t 2 -g 0.2 -b 1');
end

%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1);    %# probability of class==k
end


%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel)    %# accuracy
C = confusionmat(testLabel, pred)                   %# confusion matrix

Here is the result:

optimization finished, #iter = 16  
nu = 0.645259 obj = -2.799682, 
rho = -0.437644 nSV = 8, nBSV = 1 Total nSV = 8 
Accuracy = 100% (8/8) (classification)

acc =

    0.3750


C =

     0     5
     0     3

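I don't know why there are two accuracies, or why they differ. The first is 100% and the second is 0.375. Is my code wrong? It should be 100%, not 37.5%. Can you help me correct this code?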

score 2 · Accepted Answer

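If you are using libsvm, you should rename the MEX files, because MATLAB already ships an SVM toolbox with a function named svmtrain. However, the code runs, so it seems you did change the names, just not in the code you posted.

The second figure is wrong, though I don't know why. What I can tell you is that with test_Data = training_Data you will almost always get 100% accuracy. That result doesn't really mean anything, because the algorithm may be overfitting and this will not show up in your results. Test your algorithm against new data; that will give you the real accuracy.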
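As a minimal sketch of testing on data the model has not seen, the instances from the question can be split into disjoint training and test sets before calling svmtrain/svmpredict. The hold-out size of 2 is an assumption chosen only for illustration; with just eight instances a random split can easily leave one class out of the training set, so leave-one-out or stratified splitting may be a better fit in practice.

```matlab
% Data and labels copied from the question
data = [170 66; 160 50; 170 63; 173 61; 168 58; 184 88; 189 94; 185 88];
labels = [-1; -1; -1; -1; -1; 1; 1; 1];

numInst = size(data, 1);
numTest = 2;                        % assumed hold-out size (illustrative only)

perm = randperm(numInst);           % random ordering of the instance indices
testIdx  = perm(1:numTest);         % first numTest indices go to the test set
trainIdx = perm(numTest+1:end);     % the remaining indices form the training set

trainData = data(trainIdx, :);  trainLabel = labels(trainIdx);
testData  = data(testIdx, :);   testLabel  = labels(testIdx);
```

The resulting trainData/trainLabel and testData/testLabel can then replace the identical train and test sets used in the question.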

score 1 · Accepted Answer

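Is that the code you are actually running? I think your svmtrain call is invalid. You should have svmtrain(MAT, VECT, ...), where MAT is a data matrix and VECT is a vector holding the label for each row of MAT. The remaining arguments are string/value pairs, meaning a string identifier followed by its corresponding value.

When I ran your code (Linux, R2011a), I got an error in the svmtrain call. Running svmtrain(trainData, double(trainLabel==k)) gave valid output (for that line). Of course, you don't seem to be using pure MATLAB, since your svmpredict call is not native MATLAB but comes from LIBSVM's MATLAB bindings...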

score 1 · Accepted Answer

C = confusionmat(testLabel, pred)

Swap their positions:

C = confusionmat(pred, testLabel)

Or use this:

[ConMat, order] = confusionmat(pred, testLabel)

which displays the confusion matrix together with the class order.

score 0 · Accepted Answer

The problem is in

[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');

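p does not contain the predicted labels; it holds the probability estimates that the labels are correct. LIBSVM's svmpredict has already computed the accuracy correctly for you, which is why it shows 100% in the debug output. The fix is simple: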

[p,~,~] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');

According to the README of LIBSVM's MATLAB bindings:

The function 'svmpredict' has three outputs. The first one,
predictd_label, is a vector of predicted labels. The second output,
accuracy, is a vector including accuracy (for classification), mean
squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability
estimates (if '-b 1' is specified). If k is the number of classes
in training data, for decision values, each row includes results of 
predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance,
instead of an empty vector. For probabilities, each row contains k values
indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field
in the model structure.
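To illustrate the last sentence of the README excerpt, here is a small sketch of turning probability columns back into class labels. It does not call LIBSVM: modelLabel stands in for the Label field of a trained model, and p is a made-up probability matrix, both hypothetical values chosen only to show the indexing.

```matlab
% Hypothetical stand-ins for a trained model and for svmpredict's third output
modelLabel = [1; -1];               % plays the role of model.Label (class order)
p = [0.9 0.1;                       % made-up probability estimates,
     0.2 0.8;                       % one row per test instance,
     0.6 0.4];                      % one column per class in modelLabel order

[~, col] = max(p, [], 2);           % column index of the most probable class
predLabel = modelLabel(col);        % translate column indices into class labels
```

The point is that max returns a column position, not a label, so the position has to be looked up in the class order before it is compared with the true labels.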

score 0 · Accepted Answer

I'm sorry to tell you that all of the answers are completely wrong!! The main error in the code is:

numLabels = max(labels);

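because it returns 1, whereas it should return 2, as it would if the labels were positive numbers; svmtrain/svmpredict would then loop twice.

Anyway, change the line labels=[-1;-1;-1;-1;-1;1;1;1]; to labels=[2;2;2;2;2;1;1;1]; and it will run successfully ;)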
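The following sketch reproduces the 0.375 figure without LIBSVM and shows one possible way to relabel automatically instead of editing the labels by hand. The use of unique is an assumption on my part, not something the question's code does, and it maps -1 to 1 and 1 to 2, which is the opposite assignment from the manual change above (either works, as long as the labels are 1..K).

```matlab
labels = [-1; -1; -1; -1; -1; 1; 1; 1];

% What the original code effectively computes
numLabelsOld = max(labels);                 % 1, so both loops run for k = 1 only
predOld = ones(numel(labels), 1);           % max over a single column always returns index 1
accOld = sum(predOld == labels) / numel(labels);   % only the three +1 labels match

% One possible relabeling: map arbitrary labels onto 1..K
[classes, ~, idx] = unique(labels);         % classes is sorted, idx indexes into it
numLabels = numel(classes);                 % number of distinct classes
recovered = classes(idx);                   % maps the 1..K indices back to the original labels
```

Training with idx as the label vector makes the indices returned by max(prob,[],2) directly comparable with the test labels, and classes(pred) converts predictions back to the original -1/1 values.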
