python - Is it possible to make a ROC plot from an SVM with a precomputed kernel in scikit-learn?

Question

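I am using this example to create a ROC plot from SVM classification results: http://scikit-learn.org/0.13/auto_examples/plot_roc.html

However, each data point actually consists of 4 feature vectors of length d, combined using a custom kernel function that does not fit the usual K(X, X) paradigm. As a result, I have to supply scikit-learn with a precomputed kernel in order to do classification. It looks something like this: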

import numpy
import sklearn.metrics.pairwise

# The function name is illustrative; the original snippet only showed the body.
def combined_kernel(v1, v2, v3, v4, w1, w2, w3, w4):
    # w1 + w2 + w3 + w4 = 1.0
    n = v1.shape[0]
    K = numpy.zeros(shape=(n, n))

    # v1: array, shape (n, d)
    # w1: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v1, v1)
    mu = 1.0 / numpy.mean(chi)
    K += w1 * numpy.exp(-mu * chi)

    # v2: array, shape (n, d)
    # w2: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v2, v2)
    mu = 1.0 / numpy.mean(chi)
    K += w2 * numpy.exp(-mu * chi)

    # v3: array, shape (n, d)
    # w3: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v3, v3)
    mu = 1.0 / numpy.mean(chi)
    K += w3 * numpy.exp(-mu * chi)

    # v4: array, shape (n, d)
    # w4: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v4, v4)
    mu = 1.0 / numpy.mean(chi)
    K += w4 * numpy.exp(-mu * chi)

    return K

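The main obstacle to generating the ROC plot (from the link above) seems to be the process of splitting the data into two sets and then calling predict_proba() on the test set. Is it possible to do this in scikit-learn using a precomputed kernel?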

score 1 · Accepted Answer

The short answer is "maybe not". Have you tried something like the following?

Based on the example at http://scikit-learn.org/stable/modules/svm.html, you need something like this:

    import numpy
    import sklearn.metrics.pairwise
    from sklearn import svm

    # y: array, shape (n,), labels of the training examples
    # probability=True is needed so that predict_proba() is available
    clf = svm.SVC(kernel='precomputed', probability=True)

    # kernel computation on the training data
    K = numpy.zeros(shape=(n, n))

    # "At the moment, the kernel values between all training vectors
    #  and the test vectors must be provided."
    #  according to the scikit-learn web page.
    #  -- This is the problem!
    # v1: array, shape (n, d)
    # w1: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v1, v1)
    mu1 = 1.0 / numpy.mean(chi)
    K += w1 * numpy.exp(-mu1 * chi)

    # v2: array, shape (n, d)
    # w2: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v2, v2)
    mu2 = 1.0 / numpy.mean(chi)
    K += w2 * numpy.exp(-mu2 * chi)

    # v3: array, shape (n, d)
    # w3: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v3, v3)
    mu3 = 1.0 / numpy.mean(chi)
    K += w3 * numpy.exp(-mu3 * chi)

    # v4: array, shape (n, d)
    # w4: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(v4, v4)
    mu4 = 1.0 / numpy.mean(chi)
    K += w4 * numpy.exp(-mu4 * chi)

    # scikit-learn is a wrapper around LIBSVM, and looking at the LIBSVM README
    # it seems you need kernel values for the test data, something like this.
    # The mu values from training are reused so that the same kernel function
    # is applied to the training and test data.

    Kt = numpy.zeros(shape=(nt, n))
    # t1: array, shape (nt, d)
    # w1: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(t1, v1)
    Kt += w1 * numpy.exp(-mu1 * chi)

    # t2: array, shape (nt, d)
    # w2: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(t2, v2)
    Kt += w2 * numpy.exp(-mu2 * chi)

    # t3: array, shape (nt, d)
    # w3: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(t3, v3)
    Kt += w3 * numpy.exp(-mu3 * chi)

    # t4: array, shape (nt, d)
    # w4: float in [0, 1)
    chi = sklearn.metrics.pairwise.chi2_kernel(t4, v4)
    Kt += w4 * numpy.exp(-mu4 * chi)

    clf.fit(K, y)

    # predict on testing examples
    probas_ = clf.predict_proba(Kt)
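
If the training and test samples start out in a single array, one possible way to get matrices shaped like `K` and `Kt` is to compute the full kernel once and slice it by index. This is only a minimal sketch: the linear kernel is a stand-in for the combined kernel, and `K_full`, `train_idx`, and `test_idx` are hypothetical names that do not appear in the question.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in data and kernel (assumption): any (n_total, n_total) kernel matrix works here
n_total = 10
A = rng.rand(n_total, 3)
K_full = A @ A.T

# Hypothetical split: 7 training samples, 3 test samples
idx = rng.permutation(n_total)
train_idx, test_idx = idx[:7], idx[7:]

# Rows and columns both restricted to training samples: shape (n_train, n_train)
K_train = K_full[np.ix_(train_idx, train_idx)]

# Rows are test samples, columns are training samples: shape (n_test, n_train)
K_test = K_full[np.ix_(test_idx, train_idx)]

print(K_train.shape, K_test.shape)
```

`K_train` would go to `fit()` and `K_test` to `predict_proba()`. Note that any data-dependent scaling such as `mu` would, in this arrangement, be computed before slicing and would therefore also see the test samples.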

From here, just copy the bottom part of http://scikit-learn.org/0.13/auto_examples/plot_roc.html
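
For reference, the ROC step can be sketched end to end roughly as follows. This is an illustration under stated assumptions, not the linked example verbatim: the data is synthetic, a single `chi2_kernel` call stands in for the weighted four-part kernel, and the index split is hypothetical. `roc_curve` and `auc` come from `sklearn.metrics`.

```python
import numpy as np
from sklearn import svm
from sklearn.metrics import roc_curve, auc
from sklearn.metrics.pairwise import chi2_kernel

rng = np.random.RandomState(0)

# Synthetic non-negative features (assumption); chi2 kernels expect non-negative input
X = np.vstack([rng.rand(40, 5) + 0.5, rng.rand(40, 5)])
y = np.array([1] * 40 + [0] * 40)

# Hypothetical split that keeps both classes in the training and test sets
train_idx = np.r_[0:30, 40:70]
test_idx = np.r_[30:40, 70:80]

# Precomputed kernels: (n_train, n_train) for fit, (n_test, n_train) for prediction
K_train = chi2_kernel(X[train_idx], X[train_idx])
K_test = chi2_kernel(X[test_idx], X[train_idx])

# probability=True enables predict_proba()
clf = svm.SVC(kernel='precomputed', probability=True, random_state=0)
clf.fit(K_train, y[train_idx])
probas_ = clf.predict_proba(K_test)

# Column 1 holds the probability of the positive class (clf.classes_[1])
fpr, tpr, thresholds = roc_curve(y[test_idx], probas_[:, 1])
roc_auc = auc(fpr, tpr)
print("Area under the ROC curve: %f" % roc_auc)
```

The `fpr` and `tpr` arrays can then be handed to a plotting call, as the linked example does, to draw the curve.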
