python-3.x - Upsampling with SMOTE in Python

Translated from: https://stackoverflow.com/questions/57308037 2019-08-01T11:30:14.290

368 views

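I am trying to use SMOTE in Python to handle a highly imbalanced dataset. After splitting the dataset into training and test sets, I use SMOTE to generate synthetic samples, and then fit an xgboost model on the SMOTE-resampled data. The model output I need is the predicted probability for each row of the original dataset. However, after applying SMOTE the number of samples has increased. How do I get back to the original dataset to predict probabilities? The code is as follows: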

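from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Hold out 10% for testing; only the training split is oversampled
X_train, X_test, y_train, y_test = train_test_split(X_final, Y_final, test_size=0.1, random_state=27)
sm = SMOTE(random_state=27, sampling_strategy=1.0)  # named `ratio` in older imbalanced-learn releases
X_final_sm, Y_final_sm = sm.fit_resample(X_train, y_train)  # named `fit_sample` in older releases
smote_xgb = XGBClassifier().fit(X_final_sm, Y_final_sm)
# Both predictions below are made on the resampled rows, not on the original data
smote_pred = smote_xgb.predict(X_final_sm)
smote_pred_prob = smote_xgb.predict_proba(X_final_sm)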
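
To make the row-count issue concrete, here is a minimal, standard-library-only sketch of SMOTE-style oversampling. It is a simplified stand-in written for illustration, not the imbalanced-learn implementation: the function `smote_sketch`, its parameters, and the toy data are all hypothetical. It shows that resampling returns a new, larger training set while the original `X_train` is left untouched.

```python
import random


def smote_sketch(X, y, minority_label, k=2, seed=27):
    """Simplified SMOTE-style oversampling for a binary problem (hypothetical helper).

    Returns NEW lists; the inputs X and y are not modified.
    Assumes the minority class has at least two rows.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_needed = (len(y) - len(minority)) - len(minority)  # rows needed for a 1:1 balance

    X_res, y_res = list(X), list(y)  # original rows come first, in order
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        # Synthetic point lies on the segment between `base` and `other`
        X_res.append([a + gap * (b - a) for a, b in zip(base, other)])
        y_res.append(minority_label)
    return X_res, y_res


# Toy imbalanced training split: six majority rows, two minority rows
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
           [0.2, 0.4], [0.4, 0.1], [2.0, 2.1], [2.2, 1.9]]
y_train = [0, 0, 0, 0, 0, 0, 1, 1]

X_train_sm, y_train_sm = smote_sketch(X_train, y_train, minority_label=1, k=1)

print(len(X_train), len(X_train_sm))  # 8 12
```

Under these assumptions the resampled data is only needed for fitting. The original arrays still exist unchanged, so a fitted classifier such as `smote_xgb` can be given `X_test` (or the full `X_final`) in `predict_proba` to obtain one probability row per original sample, as long as the columns match those used in training.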

0 answers