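I have a LightGBM multiclass classification model and I want to create a confusion matrix for it. As a first step, I just want to plot predicted versus actual values in a df. My question is whether lightgbm's predict returns predictions in the same order as the rows of the dataset you give it.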

If you follow the code below, does my "Predictions" step correctly match each row of the test dataset with its corresponding prediction row?

This is how I create the test and training sets:

# split train and test into X and Y
X_train = train_data[:, 0:(model.shape[1]-2)]  # python starts counting at 0
Y_train = train_data[:, model.shape[1]-1]
X_test = test_data[:, 0:(model.shape[1]-2)]
Y_test = test_data[:, model.shape[1]-1]

#training and eval dataset
lgb_train = lgb.Dataset(data = X_train, label = Y_train)
lgb_test = lgb.Dataset(data = X_test, label = Y_test)

Running the model:

#run model
bst_model = lgb.train(params = parameters, train_set = lgb_train, num_boost_round = 1000, 
                      valid_sets = [lgb_train,lgb_test], early_stopping_rounds = 7) 
                      #categorical_feature = categoricals_vec)

Then the predictions:

#Predictions
preds = bst_model.predict(X_test)
preds_df =  pd.DataFrame(preds, columns = ['0','1','2'])
preds_df['pred'] = preds_df.idxmax(axis=1)
preds_df['actual'] = boost_data_set.iloc[0:breakpoint,boost_data_set.shape[1]-1]

1 Answer

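Yes. predict returns the predictions in the same order as the rows you pass in, so row i of preds corresponds to row i of X_test.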
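To make that concrete, here is a minimal sketch of going from the probability matrix to a confusion matrix. The `preds` and `Y_test` arrays below are small made-up stand-ins (not the asker's data): in the question, `preds` would be the output of `bst_model.predict(X_test)` and `Y_test` the label column of the test split.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for illustration only. In the question, `preds`
# would come from bst_model.predict(X_test) and `Y_test` from the split.
preds = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
])
Y_test = np.array([0, 1, 2, 1, 2])

# Row i of preds corresponds to row i of X_test, so a positional argmax
# gives the predicted class for each test row.
pred_class = preds.argmax(axis=1)

# Build the comparison frame from plain arrays so that pandas does not
# realign rows by index labels.
preds_df = pd.DataFrame({"actual": Y_test.astype(int), "pred": pred_class})

# Confusion matrix: rows are actual classes, columns are predicted classes.
confusion = pd.crosstab(preds_df["actual"], preds_df["pred"])
print(confusion)
```

One thing to watch in the question's last line: assigning a pandas Series to `preds_df['actual']` aligns on index labels, not on position. If `boost_data_set` does not have a default 0..n-1 index covering exactly the test rows, the labels can end up misaligned or NaN even though the predictions themselves are in order. Using `Y_test` directly (or `.to_numpy()` on the Series) avoids depending on the index.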

Answered 2018-07-20T03:05:00.763