
I used xgboost to fit a logistic regression. I followed the steps described here, but ran into two problems. The dataset is located here.

First, when I run the following code:

bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param)

I get this output:

 [0]train-rmse:0.350006
 [1]train-rmse:0.245008
 [2]train-rmse:0.171518
 [3]train-rmse:0.120065
 [4]train-rmse:0.084049
 [5]train-rmse:0.058835
 [6]train-rmse:0.041185
 [7]train-rmse:0.028830
 [8]train-rmse:0.020182
 [9]train-rmse:0.014128
[10]train-rmse:0.009890
[11]train-rmse:0.006923
[12]train-rmse:0.004846
[13]train-rmse:0.003392
[14]train-rmse:0.002375
[15]train-rmse:0.001662
[16]train-rmse:0.001164
[17]train-rmse:0.000815
[18]train-rmse:0.000570
[19]train-rmse:0.000399
[20]train-rmse:0.000279
[21]train-rmse:0.000196
[22]train-rmse:0.000137
[23]train-rmse:0.000096
[24]train-rmse:0.000067
[25]train-rmse:0.000047
[26]train-rmse:0.000033
[27]train-rmse:0.000023
[28]train-rmse:0.000016
[29]train-rmse:0.000011
[30]train-rmse:0.000008
[31]train-rmse:0.000006
[32]train-rmse:0.000004
[33]train-rmse:0.000003
[34]train-rmse:0.000002
[35]train-rmse:0.000001
[36]train-rmse:0.000001
[37]train-rmse:0.000001
[38]train-rmse:0.000000

train-rmse ends up at exactly 0! Is this normal? As far as I know, train-rmse should not normally reach 0. So where is my mistake?

Second, when I run

importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)

I get an error:

Error in eval(expr, envir, enclos) : object 'Yes' not found

I don't know what this means. Maybe the first problem causes the second one.

library(data.table)
train_x<-fread("train_x.csv")
str(train_x)
train_y<-fread("train_y.csv")
str(train_y)
train<-merge(train_y,train_x,by="uid")
train$uid<-NULL
test<-fread("test_x.csv")
require(xgboost)
require(Matrix)
sparse_matrix <- sparse.model.matrix(y~.-1, data = train)
head(sparse_matrix)
output_vector = train[,y] == "Marked"
param <- list(objective = "binary:logistic", booster = "gblinear",
          nthread = 2, alpha = 0.0001,max.depth = 4,eta=1,lambda = 1)
bst <- xgboost(data = sparse_matrix, label = output_vector,nrounds = 39,param)
importance <- xgb.importance(sparse_matrix@Dimnames[[2]], model = bst)
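One detail of the training call above may be worth checking: `param` is passed without a name after the named arguments. The sketch below uses a hypothetical stand-in function, `fake_xgboost`, whose leading formals are assumed to match those of `xgboost()` in the R package of that era (`data`, `label`, `missing`, `params`, `nrounds`). Under that assumption, R's argument matching binds the unnamed list to `missing` rather than to `params`, so the requested objective and `eta` would never be applied. That would be consistent with the log reporting `train-rmse`, which is not the metric one would normally expect for `binary:logistic`.

```r
# Hypothetical stand-in: only the order of the formals matters here.
# The order is an assumption about xgboost(), not taken from the post.
fake_xgboost <- function(data = NULL, label = NULL, missing = NULL,
                         params = list(), nrounds, ...) {
  # Return what each formal ended up bound to
  list(missing = missing, params = params, nrounds = nrounds)
}

param <- list(objective = "binary:logistic", eta = 1)

# Same calling pattern as in the question: param is unnamed
positional <- fake_xgboost(data = 1, label = 0, nrounds = 39, param)

# Naming the argument sends the list where it was intended to go
named <- fake_xgboost(data = 1, label = 0, nrounds = 39, params = param)

length(positional$params)             # 0: params kept its empty default
identical(positional$missing, param)  # TRUE: the list landed in `missing`
identical(named$params, param)        # TRUE
```

If this is what happens in the real call, writing `params = param` explicitly would be the first thing to try.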

2 Answers


I had the same problem (Error in eval(expr, envir, enclos) : object 'Yes' not found). The cause was the following:

I was trying to run

dt = data.table(x = runif(10), y = 1:10, z = 1:10)
label = as.logical(dt$z)
train = dt[, z := NULL]
trainAsMatrix = as.matrix(train)
label = as.matrix(label)

bst <- xgboost(data = trainAsMatrix, label = label, max.depth = 8,
               eta = 0.3, nthread = 2, nround = 50, objective = "reg:linear")
bst$featureNames = names(train)
xgb.importance(model = bst)

The problem comes from the line

label = as.logical(dt$z)

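I had put that line there because the last time I used xgboost I wanted to predict a categorical variable. Now that I want to do regression, it should be: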

label = dt$z

Maybe something similar is causing your problem?
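To make the effect of that line concrete, here is a minimal base R sketch. It uses the same `z = 1:10` values as the example above; the variable names `label_logical` and `label_numeric` are only illustrative. Because `as.logical()` turns every non-zero number into `TRUE`, the converted label is a single constant value, whereas the untouched column keeps all ten distinct targets.

```r
z <- 1:10

# as.logical() maps every non-zero number to TRUE
label_logical <- as.logical(z)

# Keeping the column as-is preserves the regression target
label_numeric <- z

n_logical <- length(unique(label_logical))  # 1 distinct value: a constant label
n_numeric <- length(unique(label_numeric))  # 10 distinct values

all(label_logical)  # TRUE
```

A quick `length(unique(label))` or `table(label)` before training is a cheap way to catch this kind of accidental conversion.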

answered 2016-09-02T14:14:38.903

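Maybe this is of some help. I often get the same error when the label has zero variance. This is with the current CRAN version of xgboost, which is already somewhat old (0.4.4). xgb.train happily accepts such a label (showing 0.50 AUC), but the error then appears when xgb.importance is called.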

Cheers,

Otto

[0] train-auc:0.500000  validate-auc:0.500000
[1] train-auc:0.500000  validate-auc:0.500000
[2] train-auc:0.500000  validate-auc:0.500000
[3] train-auc:0.500000  validate-auc:0.500000
[4] train-auc:0.500000  validate-auc:0.500000

[1] "XGB error: Error in eval(expr, envir, enclos): object 'Yes' not found\n"
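A rough way to test the zero-variance explanation against the question's own log is sketched below. The helper `check_label` is hypothetical, not part of xgboost. The arithmetic assumes a starting prediction of 0.5 and a shrinkage of 0.3 (the xgboost defaults, as far as I know) together with a label that is 0 for every row. None of this is confirmed by the original poster. Under those assumptions the training RMSE should shrink by a factor of about 0.7 per round, and the first logged values follow that pattern closely.

```r
# Hypothetical guard: refuse to train on a label with a single distinct value
check_label <- function(label) {
  n_distinct <- length(unique(label))
  if (n_distinct < 2) {
    stop("label is constant: ", format(label[1]))
  }
  invisible(n_distinct)
}

# Assumed setup: start at 0.5, shrinkage 0.3, all labels equal to 0,
# so the residual is multiplied by roughly (1 - 0.3) each round
rounds <- 1:5
predicted_rmse <- 0.5 * 0.7^rounds

# First five train-rmse values copied from the log in the question
logged_rmse <- c(0.350006, 0.245008, 0.171518, 0.120065, 0.084049)

# Largest gap between the assumed decay and the logged values
max_gap <- max(abs(predicted_rmse - logged_rmse))
max_gap < 1e-4  # TRUE
```

If the pattern holds, `output_vector` in the question would be worth inspecting with `table(output_vector)`; calling `check_label(output_vector)` before training would stop with an error whenever the label is constant.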
answered 2016-11-28T11:56:05.353