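I am using SMOTE because my dataset is imbalanced, but I get the error message shown below. I saw a post on this forum about the same problem; it suggested that the error is caused by duplicate column names. I checked my dataset and it has no duplicate column names, yet I still get the error. My dataset has categorical variables, all of which have been converted to 1s and 0s.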
from imblearn.over_sampling import SMOTE
sm = SMOTE(random_state = 2)
X_train_res, y_train_res = sm.fit_resample(X_train, y_train)
Here is the error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-147-9ac314f8e551> in <module>
1 sm = SMOTE(random_state = 2)
----> 2 X_train_res, y_train_res = sm.fit_resample(X_train, y_train)
~\Anaconda3\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
86 if binarize_y else output[1])
87
---> 88 X_, y_ = arrays_transformer.transform(output[0], y_)
89 return (X_, y_) if len(output) == 2 else (X_, y_, output[2])
90
~\Anaconda3\lib\site-packages\imblearn\utils\_validation.py in transform(self, X, y)
38
39 def transform(self, X, y):
---> 40 X = self._transfrom_one(X, self.x_props)
41 y = self._transfrom_one(y, self.y_props)
42 return X, y
~\Anaconda3\lib\site-packages\imblearn\utils\_validation.py in _transfrom_one(self, array, props)
57 import pandas as pd
58 ret = pd.DataFrame(array, columns=props["columns"])
---> 59 ret = ret.astype(props["dtypes"])
60 elif type_ == "series":
61 import pandas as pd
~\Anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
5681 if col_name in dtype:
5682 results.append(
-> 5683 col.astype(dtype=dtype[col_name], copy=copy, errors=errors)
5684 )
5685 else:
~\Anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
5664 if self.ndim == 1: # i.e. Series
5665 if len(dtype) > 1 or self.name not in dtype:
-> 5666 raise KeyError(
5667 "Only the Series name can be used for "
5668 "the key in Series dtype mappings."
KeyError: 'Only the Series name can be used for the key in Series dtype mappings.'
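
For reference, the duplicate-column check mentioned above can be done programmatically rather than by eye, which may be more reliable after categorical variables have been expanded into 0/1 columns. The following is only a minimal sketch: the helper name duplicate_names and the example column list are hypothetical and do not come from the original dataset.

```python
from collections import Counter


def duplicate_names(columns):
    """Return the names that occur more than once, in first-seen order."""
    counts = Counter(columns)
    return [name for name, n in counts.items() if n > 1]


# Hypothetical column list standing in for list(X_train.columns).
example_columns = ["age", "gender_M", "gender_F", "age"]
print(duplicate_names(example_columns))
```

With the real data, the call would presumably be duplicate_names(X_train.columns); an empty list means no name is repeated. pandas offers the same check directly through X_train.columns.duplicated().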