I would like to know why DALEX's model_performance() and tidymodels' collect_metrics() do not report the same accuracy. Do they use different measures, or different methods? I put together the following example code:
library(tidymodels)
library(parsnip)
library(DALEXtra)
set.seed(1)
x1 <- rbinom(1000, 5, .1)
x2 <- rbinom(1000, 5, .4)
x3 <- rbinom(1000, 5, .9)
x4 <- rbinom(1000, 5, .6)
id <- c(1:1000)
y <- as.factor(rbinom(1000, 5, .5))
df <- tibble(y, x1, x2, x3, x4, id)
# create training and test set
set.seed(20)
split_dat <- initial_split(df, prop = 0.8)
train <- training(split_dat)
test <- testing(split_dat)
# cross-validation folds (note: created on the full df, and not actually used below)
kfolds <- vfold_cv(df)
# recipe
rec_pca <- recipe(y ~ ., data = train) %>%
  update_role(id, new_role = "id variable") %>%
  step_center(all_predictors()) %>%
  step_scale(all_predictors()) %>%
  # when threshold is set, step_pca() ignores num_comp
  step_pca(x1, x2, x3, threshold = 0.9, num_comp = 1)
# parsnip model specification
boost_model <- boost_tree() %>%
  set_mode("classification") %>%
  set_engine("xgboost")
# create workflow
boosted_wf <- workflow() %>%
  add_model(boost_model) %>%
  add_recipe(rec_pca)
# fit on the training split and evaluate on the held-out test split
boosted_res <- last_fit(boosted_wf, split_dat)
collect_metrics(boosted_res)
The output of collect_metrics() shows an accuracy of 0.31:
# A tibble: 2 × 4
.metric .estimator .estimate .config
<chr> <chr> <dbl> <chr>
1 accuracy multiclass 0.31 Preprocessor1_Model1
2 roc_auc hand_till 0.512 Preprocessor1_Model1
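As a sanity check, here is a minimal sketch of my understanding (the manual refit below is my own addition, not part of the pipeline above): last_fit() fits the workflow on the training split and computes its metrics on the held-out test split only, so predicting the test set by hand should give roughly the same accuracy.
# my own check: refit on the training split and score the test split,
# which is what last_fit() does internally
manual_fit <- fit(boosted_wf, train)
predict(manual_fit, test) %>%
  bind_cols(test["y"]) %>%
  accuracy(truth = y, estimate = .pred_class)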
Continuing on, I prepare the DALEX model explainer:
# fit the workflow on the full data set
final_boosted <- generics::fit(boosted_wf, df)
# create an explanation object
explainer_xgb <- DALEXtra::explain_tidymodels(final_boosted,
                                              data = df[, -1],
                                              y = df$y)
perf <- model_performance(explainer_xgb)
perf
Now, this gives the following output for the overall fit:
Measures for: multiclass
micro_F1 : 0.43
macro_F1 : 0.5743392
w_macro_F1 : 0.4775901
accuracy : 0.43
w_macro_auc: 0.7064296
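For comparison, a minimal sketch (again my own check, not DALEX code) that scores final_boosted on the full df, i.e. the same data/y pair that was handed to explain_tidymodels():
# my own check: score the full-data fit on the same full data set
# that explain_tidymodels() received
predict(final_boosted, df) %>%
  bind_cols(df["y"]) %>%
  accuracy(truth = y, estimate = .pred_class)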
Note that the accuracy is 0.43 using model_performance() but 0.31 using collect_metrics(). Does anyone know why that is?