python - Arrays used as indices must be of integer (or boolean) type

Question

The error is:

Traceback (most recent call last):
  File "NearestCentroid.py", line 53, in <module>
    clf.fit(X_train.todense(),y_train)
  File "/usr/local/lib/python2.7/dist-packages/scikit_learn-0.13.1-py2.7-linux-i686.egg/sklearn/neighbors/nearest_centroid.py", line 115, in fit
    variance = np.array(np.power(X - self.centroids_[y], 2))
IndexError: arrays used as indices must be of integer (or boolean) type

The code is:

distancemetric = ['euclidean', 'l2']
for mtrc in distancemetric:
    for shrkthrshld in [None]:
        # shrkthrshld = 0
        # while (shrkthrshld <= 1.0):
        clf = NearestCentroid(metric=mtrc, shrink_threshold=shrkthrshld)
        clf.fit(X_train.todense(), y_train)
        y_predicted = clf.predict(X_test.todense())

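I'm using the scikit-learn package. X_train and y_train come from LIBSVM-format data: X holds the feature:value pairs and y_train holds the targets/labels. X_train is a CSR matrix, and shrink_threshold does not support CSR sparse matrices, so I added .todense() to X_train, and then I got this error. Could anyone help me fix this? Thanks a lot!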

score 41 · Accepted Answer

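I had a similar problem when using Pystruct's pystruct.learners.OneSlackSSVM.

It occurred because my training labels were floats instead of integers. In my case, that was because I had initialized the labels with np.ones without specifying dtype=np.int8. Hope it helps.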
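
A minimal sketch of that failure mode, using only NumPy. The names `centroids` and `labels` are made up for illustration and are not taken from scikit-learn or Pystruct. `np.ones` returns float64 unless told otherwise, and recent NumPy versions reject a float array used as an index:

```python
import numpy as np

# Hypothetical stand-in for a per-class lookup table (one row per class).
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])

labels = np.ones(3)  # dtype defaults to float64

try:
    centroids[labels]  # float array used as an index
    float_index_failed = False
except IndexError:
    float_index_failed = True

# Either cast afterwards or pass dtype when creating the labels.
int_labels = labels.astype(np.int64)  # or: np.ones(3, dtype=np.int64)
rows = centroids[int_labels]          # selects row 1 three times
```

In the question's setting, the analogous change would presumably be to cast y_train to an integer dtype (for example with y_train.astype(int)) before calling fit, assuming the labels really are whole numbers.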

score 7 · Accepted Answer

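It is often the case that an index array ends up with an integer type simply because of how it is created, but when an empty list is passed in, it falls back to the default float type. This is a case the programmer may not have considered. For example: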

>>> np.array(xrange(1))
array([0])                # integer type as expected
>>> np.array(xrange(0))
array([], dtype=float64)  # does not generalize to the empty list

Therefore, you should always specify dtype explicitly in the array constructor.
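
As a small sketch of the difference (the variable names are illustrative only), an empty array built with an explicit integer dtype is a valid index, while the default empty array is float64 and is rejected by recent NumPy versions:

```python
import numpy as np

data = np.arange(10, 20)

# With no elements to infer from, NumPy falls back to float64.
empty_default = np.array([])

# Explicit integer dtype keeps the array usable as an index.
empty_int = np.array([], dtype=np.intp)

try:
    data[empty_default]
    default_failed = False
except IndexError:
    default_failed = True

selected = data[empty_int]  # valid: selects no elements
```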

score 0 · Accepted Answer

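Sometimes your data is integer-typed and everything else is correct, but the error occurs because one of your data series is an empty array. In that case you can guard with the following condition: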

if len(X_train.todense()) > 0:
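
One possible way to write that guard without converting to a dense matrix first is to check the row count through `.shape`, which both NumPy arrays and SciPy sparse matrices expose. The helper name `has_rows` is hypothetical, not part of any library:

```python
import numpy as np

def has_rows(X):
    """Return True if X has at least one row.

    Relies only on X.shape, so it should work for dense arrays
    and for SciPy sparse matrices alike.
    """
    return X.shape[0] > 0

X_empty = np.empty((0, 4))  # no samples
X_small = np.zeros((2, 4))  # two samples
```

The fit call would then be wrapped in a check such as `if has_rows(X_train):`, skipping or reporting the empty case instead of passing it on.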
