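I'm trying the first exercise in the scikit-learn tutorial, but even when I run their solution code (shown below), I get the error in the block that immediately follows. Does anyone know why this happens, and how can I fix it?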

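The predict method also fails with this dataset, yet for some reason it seems to work fine with the iris dataset using the code at the very bottom of the question. Sorry if I'm missing something very obvious; I'm not really a programmer.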

Traceback (most recent call last):
  File "C:\Users\user2491873\Desktop\scikit_exercise.py", line 30, in <module>
    print(knn.fit(X_train, y_train).score(X_test, y_test))
  File "C:\Python33\lib\site-packages\sklearn\base.py", line 279, in score
    return accuracy_score(y, self.predict(X))
  File "C:\Python33\lib\site-packages\sklearn\neighbors\classification.py", line 131, in predict
    neigh_dist, neigh_ind = self.kneighbors(X)
  File "C:\Python33\lib\site-packages\sklearn\neighbors\base.py", line 254, in kneighbors
    warn_equidistant()
  File "C:\Python33\lib\site-packages\sklearn\neighbors\base.py", line 33, in warn_equidistant
    warnings.warn(msg, NeighborsWarning, stacklevel=3)
  File "C:\Python33\lib\idlelib\PyShell.py", line 59, in idle_showwarning
    file.write(warnings.formatwarning(message, category, filename,
AttributeError: 'NoneType' object has no attribute 'write'

Here is the code:

"""
================================
Digits Classification Exercise
================================

This exercise is used in the :ref:`clf_tut` part of the
:ref:`supervised_learning_tut` section of the
:ref:`stat_learn_tut_index`.
"""

from sklearn import datasets, neighbors, linear_model

digits = datasets.load_digits()
X_digits = digits.data
y_digits = digits.target

n_samples = len(X_digits)

# Slice bounds must be integers; current NumPy rejects a float such as .9 * n_samples
split = int(.9 * n_samples)

X_train = X_digits[:split]
y_train = y_digits[:split]
X_test = X_digits[split:]
y_test = y_digits[split:]

knn = neighbors.KNeighborsClassifier()
logistic = linear_model.LogisticRegression()

print('KNN score: %f' % knn.fit(X_train, y_train).score(X_test, y_test))
print('LogisticRegression score: %f'
      % logistic.fit(X_train, y_train).score(X_test, y_test))

Here is the code for the Iris dataset, which seems to work fine...

>>> import numpy as np
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> iris_X = iris.data
>>> iris_y = iris.target
>>> np.unique(iris_y)
array([0, 1, 2])

>>> # Split iris data in train and test data
>>> # A random permutation, to split the data randomly
>>> np.random.seed(0)
>>> indices = np.random.permutation(len(iris_X))
>>> iris_X_train = iris_X[indices[:-10]]
>>> iris_y_train = iris_y[indices[:-10]]
>>> iris_X_test  = iris_X[indices[-10:]]
>>> iris_y_test  = iris_y[indices[-10:]]
>>> # Create and fit a nearest-neighbor classifier
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn = KNeighborsClassifier()
>>> knn.fit(iris_X_train, iris_y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, n_neighbors=5, p=2,
           warn_on_equidistant=True, weights='uniform')
>>> knn.predict(iris_X_test)
array([1, 2, 1, 0, 0, 0, 2, 1, 2, 0])
>>> iris_y_test
array([1, 1, 1, 0, 0, 0, 2, 1, 2, 0])    
1 Answer

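If you read the traceback message, it means that the variable file in the expression file.write(warnings.formatwarning(message, category, filename, ...) is set to None instead of the expected channel (for example, the program's standard output or a buffer in the user interface).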
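To make that failure mode concrete, here is a minimal sketch of what happens when a warning hook writes to a stream that is None. The helper show_warning_to and the file name demo.py are hypothetical and only imitate the pattern in the traceback; this is not IDLE's actual code.

```python
import io
import warnings


def show_warning_to(stream, message, category, filename, lineno):
    """Hypothetical stand-in for a warning hook that writes to `stream`."""
    stream.write(warnings.formatwarning(message, category, filename, lineno))


# With a real stream, the formatted warning is written as expected.
buffer = io.StringIO()
show_warning_to(buffer, "equidistant neighbors", UserWarning, "demo.py", 1)

# With None in place of the stream, the same call raises AttributeError,
# which is the kind of error shown at the bottom of the traceback.
try:
    show_warning_to(None, "equidistant neighbors", UserWarning, "demo.py", 1)
    error_text = ""
except AttributeError as exc:
    error_text = str(exc)

print(buffer.getvalue())
print(error_text)
```

The point is that the warning itself is harmless; the exception comes from the code that tries to display it.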

This suggests it is probably a bug in IDLE. If you google the error message, you get:

http://bugs.python.org/issue18030

which in turn points to:

http://bugs.python.org/issue13582

So this bug is indeed unrelated to scikit-learn. I suggest you:

  • start IDLE from a cmd console by typing python -m idlelib.idle

  • or use a different Python IDE/environment.
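If neither option is convenient, a possible stopgap (an assumption on my part, not something the linked bug reports recommend) is to filter the warning out so the display hook is never called. The sketch below uses a hypothetical broken_hook to imitate the faulty hook and a hypothetical noisy_computation in place of the scikit-learn call; note that this also hides a warning that may be informative.

```python
import warnings


def broken_hook(message, category, filename, lineno, file=None, line=None):
    """Hypothetical hook imitating the bug: the stream it writes to is None."""
    stream = None
    stream.write(warnings.formatwarning(message, category, filename, lineno))


def noisy_computation():
    """Stand-in for a library call that emits a warning and then returns."""
    warnings.warn("equidistant neighbors found", UserWarning)
    return 42


with warnings.catch_warnings():
    warnings.showwarning = broken_hook  # restored when the block exits
    warnings.simplefilter("ignore")     # ignored warnings never reach the hook
    result = noisy_computation()

print(result)
```

Without the simplefilter("ignore") line, the call would fail inside broken_hook with the same AttributeError as in the question.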

Answered 2013-06-17T08:14:39.207