
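I am using ksvm from the kernlab package in R to predict probabilities, using the type="probabilities" option of predict.ksvm. However, I find that predict(model,observation,type="r") sometimes does not return the class with the highest probability given by predict(model,observation,type="p").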

Example:

> predict(model,observation,type="r")
[1] A
Levels: A B
> predict(model,observation,type="p")
        A    B
[1,] 0.21 0.79

Is this correct behavior, or a bug? If it is correct behavior, how can I estimate the most likely class from the probabilities?


An attempt at a reproducible example:

library(kernlab)
set.seed(1000)
# Generate fake data
n <- 1000
x <- rnorm(n)
p <- 1 / (1 + exp(-10*x))
y <- factor(rbinom(n, 1, p))
dat <- data.frame(x, y)
tmp <- split(dat, dat$y)
# Create unequal sizes in the groups (helps illustrate the problem)
newdat <- rbind(tmp[[1]][1:100,], tmp[[2]][1:10,])
# Fit the model using the radial kernel (default)
out <- ksvm(y ~ x, data = newdat, prob.model = TRUE)
# Create some testing points near the boundary

testdat <- data.frame(x = seq(.09, .12, .01))
# Get predictions using both methods
responsepreds <- predict(out, newdata = testdat, type = "r")
probpreds <- predict(out, testdat, type = "p")

results <- data.frame(x = testdat, 
                      response = responsepreds, 
                      P.x.0 = probpreds[,1], 
                      P.x.1 = probpreds[,2])

Resulting output:

> results
     x response     P.x.0     P.x.1
1 0.09        0 0.7199018 0.2800982
2 0.10        0 0.6988079 0.3011921
3 0.11        1 0.6824685 0.3175315
4 0.12        1 0.6717304 0.3282696

1 Answer


If you look at the decision matrix and the votes, they seem to be more in line with the response:

> predict(out, newdata = testdat, type = "response")
[1] 0 0 1 1
Levels: 0 1
> predict(out, newdata = testdat, type = "decision")
            [,1]
[1,] -0.07077917
[2,] -0.01762016
[3,]  0.02210974
[4,]  0.04762563
> predict(out, newdata = testdat, type = "votes")
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    0    0    1    1
> predict(out, newdata = testdat, type = "prob")
             0         1
[1,] 0.7198132 0.2801868
[2,] 0.6987129 0.3012871
[3,] 0.6823679 0.3176321
[4,] 0.6716249 0.3283751

The kernlab help page (?predict.ksvm) links to the paper Probability Estimates for Multi-class Classification by Pairwise Coupling by T.F. Wu, C.J. Lin, and R.C. Weng.

Section 7.3 of that paper says that the decisions and the probabilities can differ:

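...We explain why the results by probability-based and decision-value-based methods can be so distinct. For some problems, the parameters selected by δDV are quite different from those selected by the other five rules. In waveform, at some parameters all probability-based methods give much higher cross-validation accuracy than δDV. We observe, for example, that the decision values of the validation sets are in [0.73, 0.97] and [0.93, 1.02] for data in the two classes; hence, all data in the validation sets are classified as one class and the error is high. On the contrary, the probability-based methods fit the decision values with a sigmoid function, which can better separate the two classes by cutting at a decision value around 0.95. This observation sheds some light on the difference between probability-based and decision-value-based methods.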
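To make the quoted point concrete, here is a minimal sketch in base R. It assumes the class probability is obtained by passing the decision value f through a fitted sigmoid of the form 1 / (1 + exp(A * f + B)), and that positive decision values map to class 1, as in the output above. The coefficients A and B are made-up values for illustration, not the ones kernlab fitted. Whenever B is nonzero, the probability crosses 0.5 at f = -B / A rather than at f = 0, so the sign of the decision value and the probability can disagree.

```r
# Hypothetical sigmoid coefficients (illustrative only, NOT taken from the fitted model)
A <- -8
B <- 1

# Map a decision value f to P(class = 1) with a Platt-style sigmoid
sigmoid_prob <- function(f, A, B) 1 / (1 + exp(A * f + B))

# Decision values copied from the type = "decision" output above
f <- c(-0.07077917, -0.01762016, 0.02210974, 0.04762563)

# Rule 1: classify by the sign of the decision value
class_by_sign <- ifelse(f > 0, 1, 0)

# Rule 2: classify by whether the sigmoid probability exceeds 0.5
p1 <- sigmoid_prob(f, A, B)
class_by_prob <- ifelse(p1 > 0.5, 1, 0)

# Side-by-side comparison of the two rules
comparison <- data.frame(f, class_by_sign, p1, class_by_prob)
print(comparison)

# Decision value at which the probability equals exactly 0.5
threshold <- -B / A
print(threshold)
```

With these assumed coefficients the 0.5 cut sits at f = 0.125, so the last two test points get class 1 from the sign rule but class 0 from the probability rule.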

I'm not familiar enough with these methods to fully understand the issue, but maybe you are. It looks like there are distinct methods for predicting the probabilities and for predicting the class, and type = "response" corresponds to a different method than the one used to predict the probabilities.
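If you decide to go by the probabilities, one way to get the most likely class is to take, for each row, the column with the largest value. A minimal sketch in base R, using the probability matrix printed above as a hard-coded stand-in for the result of predict(out, testdat, type = "prob"); the names probs and most_likely are only illustrative:

```r
# Probability matrix copied from the output above
# (stand-in for the result of predict(out, testdat, type = "prob"))
probs <- matrix(c(0.7198132, 0.2801868,
                  0.6987129, 0.3012871,
                  0.6823679, 0.3176321,
                  0.6716249, 0.3283751),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("0", "1")))

# For each row, pick the name of the column holding the highest probability
best_col <- max.col(probs, ties.method = "first")
most_likely <- factor(colnames(probs)[best_col], levels = colnames(probs))
print(most_likely)
```

For these four test points this picks class 0 every time, which differs from the type = "response" result for the last two points.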

answered 2013-03-28T16:53:10.443