2

当使用这样的东西时

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X,y)
predictions = clf.predict_proba(X_test)

如何将预测仅限于一类?出于性能原因,这是必需的,例如,当我有数千个类时,但只对一个特定类是否具有高概率感兴趣。

4

1 回答 1

1

Sklearn 没有实现它,你将不得不编写某种包装器,例如 - 你可以extendKNeighborsClassifier和重载predict_proba方法。

根据源代码

 def predict_proba(self, X):
        """Return probability estimates for the test data X.

        Parameters
        ----------
        X : array, shape = (n_samples, n_features)
            A 2-D array representing the test points.

        Returns
        -------
        p : array of shape = [n_samples, n_classes], or a list of n_outputs
            of such arrays if n_outputs > 1.
            The class probabilities of the input samples. Classes are ordered
            by lexicographic order.
        """
        X = atleast2d_or_csr(X)

        neigh_dist, neigh_ind = self.kneighbors(X)

        classes_ = self.classes_
        _y = self._y
        if not self.outputs_2d_:
            _y = self._y.reshape((-1, 1))
            classes_ = [self.classes_]

        n_samples = X.shape[0]

        weights = _get_weights(neigh_dist, self.weights)
        if weights is None:
            weights = np.ones_like(neigh_ind)

        all_rows = np.arange(X.shape[0])
        probabilities = []
        for k, classes_k in enumerate(classes_):
            pred_labels = _y[:, k][neigh_ind]
            proba_k = np.zeros((n_samples, classes_k.size))

            # a simple ':' index doesn't work right
            for i, idx in enumerate(pred_labels.T):  # loop is O(n_neighbors)
                proba_k[all_rows, idx] += weights[:, i]

            # normalize 'votes' into real [0,1] probabilities
            normalizer = proba_k.sum(axis=1)[:, np.newaxis]
            normalizer[normalizer == 0.0] = 1.0
            proba_k /= normalizer

            probabilities.append(proba_k)

        if not self.outputs_2d_:
            probabilities = probabilities[0]

        return probabilities

只需修改代码,将for k, classes_k in enumerate(classes_):循环更改为您需要的一个特定类的分类。

一种人为的方法是覆盖classes_变量,使其成为考虑类的单例,并在完成后恢复它。

于 2013-08-21T18:01:37.817 回答