python - How to fix the wrong dimensions of a numpy array

Question

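I am solving a binary image classification problem with supervised machine learning, using an SVM classifier. First, I built a numpy array X of normalized color images with shape (17500, 32, 32, 3). After splitting the data, X_train has shape (14000, 32, 32, 3), so 4 dimensions, and y_train has shape (14000, 2), so 2 dimensions.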

clf.fit(X_train,y_train)

Running this code raises a ValueError: Found array with dim 4. Estimator expected <= 2.

Thanks in advance!

score 2 · Accepted Answer

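If you are using the scikit-learn SVM classifier, its fit function expects the training data as a 2D array of shape (n_samples, n_features).

The dataset you are passing in is a 4D array, so you need to reshape it into a 2D array.

Example: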

from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# To apply a classifier, flatten each image so the data becomes a
# (n_samples, n_features) matrix. Assuming data is a numpy array of
# shape (17500, 32, 32, 3), this converts it to shape (17500, 3072).
n_samples = len(data)
data_reshape = data.reshape((n_samples, -1))

# Split data into train and test subsets
X_train, X_test, y_train, y_test = train_test_split(data_reshape, labels,
                                                    test_size=0.2)

# Create the classifier and fit it on the flattened training data
clf = SVC()
clf.fit(X_train, y_train)
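
For a version that runs on its own, here is a minimal sketch of the same reshape. It uses random synthetic arrays in place of the real images. It also assumes that the (14000, 2) labels from the question are one-hot encoded, which the question does not confirm. SVC expects a 1D label array, so under that assumption the labels are collapsed with argmax before fitting. The names X_flat, y_onehot and class_ids are illustrative only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumption): 200 RGB images, 32x32.
rng = np.random.default_rng(0)
X = rng.random((200, 32, 32, 3))

# Assumed one-hot labels of shape (n_samples, 2), mirroring the question.
class_ids = rng.integers(0, 2, size=200)
y_onehot = np.eye(2)[class_ids]

# Flatten each image: (200, 32, 32, 3) -> (200, 3072).
X_flat = X.reshape(len(X), -1)

# Collapse one-hot rows into 1D class ids: (200, 2) -> (200,).
y = y_onehot.argmax(axis=1)

X_train, X_test, y_train, y_test = train_test_split(
    X_flat, y, test_size=0.2, random_state=0
)

clf = SVC()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
```

If the labels are not one-hot (for example, two independent target columns), the argmax step does not apply and the labels need a different treatment.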

score 0 · Accepted Answer

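This technique is called dimensionality reduction: mapping data from a high-dimensional space to a lower-dimensional space. The most commonly used technique is principal component analysis (PCA). You can read about these methods at the following links: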

https://towardsdatascience.com/feature-selection-and-dimensionality-reduction-f488d1a035de
https://www.quora.com/What-dimensionality-reduction-methods-would-you-recommend

This link explains the reduction with an example similar to your dataset: https://www.datacamp.com/community/tutorials/principal-component-analysis-in-python
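
As a rough sketch of how PCA could be combined with the SVM, the example below chains the two in a scikit-learn pipeline. Note that PCA in scikit-learn also expects a 2D (n_samples, n_features) array, so the images still have to be flattened first; PCA then reduces the number of feature columns. The synthetic data, the 1D labels, and n_components=50 are assumptions chosen for illustration, not values from the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for the real data (assumption): 200 RGB images, 32x32.
rng = np.random.default_rng(0)
X = rng.random((200, 32, 32, 3))
y = rng.integers(0, 2, size=200)  # assumed 1D binary labels

# PCA needs 2D input as well, so flatten first: (200, 32, 32, 3) -> (200, 3072).
X_flat = X.reshape(len(X), -1)

# Reduce 3072 features to 50 components (arbitrary choice), then fit the SVM.
model = make_pipeline(PCA(n_components=50), SVC())
model.fit(X_flat, y)

# Inspect the reduced representation produced by the fitted PCA step.
X_reduced = model.named_steps["pca"].transform(X_flat)
```

Using a pipeline keeps the PCA projection and the classifier together, so the same projection learned on the training data is applied whenever model.predict is called.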
