I am using SMOTE-NC to oversample categorical data. I have only 1 feature and 10500 samples.
When I run the code below, I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-151-a261c423a6d8> in <module>()
16 print(X_new.shape) # (10500, 1)
17 print(X_new)
---> 18 sm.fit_sample(X_new, Y_new)
~\AppData\Local\Continuum\Miniconda3\envs\data-science\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
81 )
82
---> 83 output = self._fit_resample(X, y)
84
85 y_ = (label_binarize(output[1], np.unique(y))
~\AppData\Local\Continuum\Miniconda3\envs\data-science\lib\site-packages\imblearn\over_sampling\_smote.py in _fit_resample(self, X, y)
926
927 X_continuous = X[:, self.continuous_features_]
--> 928 X_continuous = check_array(X_continuous, accept_sparse=["csr", "csc"])
929 X_minority = _safe_indexing(
930 X_continuous, np.flatnonzero(y == class_minority)
~\AppData\Local\Continuum\Miniconda3\envs\data-science\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
592 " a minimum of %d is required%s."
593 % (n_features, array.shape, ensure_min_features,
--> 594 context))
595
596 if warn_on_dtype and dtype_orig is not None and array.dtype != dtype_orig:
ValueError: Found array with 0 feature(s) (shape=(10500, 0)) while a minimum of 1 is required.
Code:
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.over_sampling import SMOTENC
sm = SMOTENC(random_state=27, categorical_features=[0])  # column 0 is the only feature and is marked categorical
X_new = np.array(X_train.values.tolist())
Y_new = np.array(y_train.values.tolist())
print(X_new.shape) # (10500,)
print(Y_new.shape) # (10500,)
X_new = np.reshape(X_new, (-1, 1)) # SMOTE requires a 2-D array, hence changing the shape of X_new
print(X_new.shape) # (10500, 1)
print(X_new)
sm.fit_sample(X_new, Y_new)  # raises the ValueError shown above
If I understand correctly, the shape of X_new should be (n_samples, n_features), i.e. 10500 x 1. I don't know why the ValueError reports it as shape=(10500, 0).
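For what it's worth, I can reproduce the (10500, 0) shape with plain NumPy by mimicking the slicing step shown at line 927 of the traceback. This is only a sketch. It assumes SMOTENC treats every column not listed in categorical_features as continuous, which I have not confirmed from the library source. The names X_demo and continuous_features are stand-ins of my own, not taken from imblearn.

```python
import numpy as np

# Stand-in data with the same shape as X_new: 10500 samples, 1 feature.
X_demo = np.zeros((10500, 1))

# Assumption: the continuous columns are all columns that are not categorical.
categorical_features = [0]
continuous_features = np.setdiff1d(np.arange(X_demo.shape[1]), categorical_features)

# Same kind of slicing as X[:, self.continuous_features_] in the traceback.
X_continuous = X_demo[:, continuous_features]

print(continuous_features.shape)  # (0,)
print(X_continuous.shape)         # (10500, 0)
```

If that assumption holds, marking my only column as categorical leaves no continuous columns, which would explain the 0 in the error message.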
Can anyone help me here?