
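I am trying to store individual feature vectors (i.e. X_train[i]) in a list X and their corresponding labels in another list y. When I try to fit on these two lists, I get the error ValueError: setting an array element with a sequence. How can I fix this error? Thanks in advance.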

from sklearn.datasets import load_svmlight_file
pathToTrainData = "/Users/rkasat/Documents/final year project/scripts/Drydata/leaf/train_backup.txt"

X_train, Y_train = load_svmlight_file(pathToTrainData)
X = []
y = []
for i in range(5):
    X.append(X_train[i])
    y.append(Y_train[i])

print(type(X[0]),type(y[0]))
from sklearn import svm
clf = svm.SVC(kernel='linear')
clf.fit(X,y)

output:
--------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-cd4b481af30a> in <module>()
      8 from sklearn import svm
      9 clf = svm.SVC(kernel='linear')
---> 10 clf.fit(X,y)

/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/svm/base.pyc in fit(self, X, y, sample_weight)
    137                              "by not using the ``sparse`` parameter")
    138 
--> 139         X = atleast2d_or_csr(X, dtype=np.float64, order='C')
    140         y = self._validate_targets(y)
    141 

/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in atleast2d_or_csr(X, dtype, order, copy, force_all_finite)
    132     """
    133     return _atleast2d_or_sparse(X, dtype, order, copy, sparse.csr_matrix,
--> 134                                 "tocsr", force_all_finite)
    135 
    136 

/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in _atleast2d_or_sparse(X, dtype, order, copy, sparse_class, convmethod, force_all_finite)
    109     else:
    110         X = array2d(X, dtype=dtype, order=order, copy=copy,
--> 111                     force_all_finite=force_all_finite)
    112         if force_all_finite:
    113             _assert_all_finite(X)

/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in array2d(X, dtype, order, copy, force_all_finite)
     89         raise TypeError('A sparse matrix was passed, but dense data '
     90                         'is required. Use X.toarray() to convert to dense.')
---> 91     X_2d = np.asarray(np.atleast_2d(X), dtype=dtype, order=order)
     92     if force_all_finite:
     93         _assert_all_finite(X_2d)

/Users/rkasat/anaconda/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    318 
    319     """
--> 320     return array(a, dtype, copy=False, order=order)
    321 
    322 def asanyarray(a, dtype=None, order=None):

ValueError: setting an array element with a sequence.

(<class 'scipy.sparse.csr.csr_matrix'>, <type 'numpy.float64'>)

2 Answers


You probably do not need the for loop in your code at all. The following code may do what you want:

X_train, Y_train = load_svmlight_file(pathToTrainData)

from sklearn import svm
clf = svm.SVC(kernel='linear')
# Slice the first 5 rows directly so X stays a single matrix
clf.fit(X_train[:5, :], Y_train[:5])
answered 2014-04-23T17:25:33.467

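@tanemaki is right, but it is worth explaining why this fixes the problem. X_train is a scipy sparse matrix (the type printed in the question is scipy.sparse.csr.csr_matrix), not a plain numpy array. Indexing it with an integer (X_train[i]) returns the entire i-th row as a one-row sparse matrix, so X ends up as a Python list of sparse matrices. The fit method expects a single matrix, and numpy cannot convert that list into one, hence the ValueError. If you only want to train on the first 5 rows, slice as @tanemaki already demonstrated: X_train[:5, :] and Y_train[:5] (the labels are one-dimensional, so they take a single index).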
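To make the difference concrete, here is a minimal sketch. It does not read the asker's file; the small matrix and labels below are made-up stand-ins for what load_svmlight_file returns (a CSR sparse matrix plus a 1-D label array). It shows the direct slice, and also scipy.sparse.vstack as one possible way to recover a single matrix if the rows have already been collected in a list.

```python
import numpy as np
from scipy import sparse
from sklearn import svm

# Synthetic stand-in for the output of load_svmlight_file (an assumption,
# not the asker's data): a 6x4 CSR feature matrix and a 1-D label array.
X_train = sparse.csr_matrix(np.array([
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 1.0],
    [2.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 3.0],
]))
Y_train = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])

# Option 1: slice the sparse matrix directly; the result is still one matrix.
X_first5 = X_train[:5, :]
y_first5 = Y_train[:5]

# Option 2: if the rows are already in a list, stack them back into a
# single sparse matrix before calling fit.
rows = [X_train[i] for i in range(5)]
X_stacked = sparse.vstack(rows).tocsr()

clf = svm.SVC(kernel='linear')
clf.fit(X_first5, y_first5)
print(X_first5.shape, X_stacked.shape)
```

Both options produce the same 5-row matrix. Slicing is simpler when the rows are contiguous; stacking is mainly useful when the rows are picked one at a time.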

answered 2014-04-23T17:37:03.150