I have the following setup for fitting a random forest to my dataset:

set.seed(123)
split <- initial_split(data_num, prop = 0.8, strata = positive)
train_data <- training(split)
test_data <- testing(split)



rf_rec <- recipe(positive ~., data = train_data) %>%
  step_upsample(positive, over_ratio =  1)

rf_prep <- prep(rf_rec)
juiced <- juice(rf_prep)

juiced <- janitor::clean_names(juiced)
test_data <- janitor::clean_names(test_data)

X <- juiced[which(names(juiced) != "positive")]
predictor <- Predictor$new(model, data = X, y = juiced$positive)

When I run Shapley on my dataset, I get the following error:

shapley <- Shapley$new(predictor, x.interest = X[1, ])
Error in colMeans(self$predictor$predict(private$sampler$get.x())) : 'x' must be numeric

Does anyone know why I am getting this error?

The dataset I am using, data_num:

f1  f2  f3  ...  target
0   0   1        1
1   0   0        0
1   1   1        1

1 Answer

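Try fitting the rf model first, before creating the Predictor (in the code shown, model is never created before it is passed to Predictor$new). That means: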

library(randomForest)
library(iml)
# Fit the model first, then wrap it in an iml Predictor
model <- randomForest(YourY ~ ., importance = TRUE, data = YourData)
mod_rf <- Predictor$new(model, data = X)
shapley_rf <- Shapley$new(predictor = mod_rf, x.interest = X[1, ])
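
For what it's worth, the message itself comes from base R's colMeans(), which needs numeric columns. One plausible cause (an assumption on my part, since the model isn't shown) is that predict() is returning class labels as a factor rather than numeric probabilities. Here is a minimal base-R sketch of that behaviour; pred_class and pred_prob are made-up stand-ins, not your actual model output:

```r
# Hypothetical stand-in for predictions returned as class labels (a factor).
pred_class <- data.frame(.pred_class = factor(c("0", "1", "1")))

# colMeans() turns the data frame into a character matrix and then fails;
# capture the error message instead of stopping the script.
err <- tryCatch(
  colMeans(pred_class),
  error = function(e) conditionMessage(e)
)
print(err)

# Hypothetical stand-in for predictions returned as numeric class probabilities.
pred_prob <- data.frame(
  `0` = c(0.9, 0.2, 0.1),
  `1` = c(0.1, 0.8, 0.9),
  check.names = FALSE
)

# With numeric columns, colMeans() returns one average per class.
avg <- colMeans(pred_prob)
print(avg)
```

If that is what's happening in your case, iml's Predictor$new() accepts a predict.function argument, so passing something like function(model, newdata) predict(model, newdata, type = "prob") might be worth trying with a randomForest classifier. Treat that as a suggestion to test rather than a confirmed fix.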
Answered 2021-07-25T07:50:20.770