I am trying to follow

Link 1 - Sparse matrices https://www.tidyverse.org/blog/2020/11/tidymodels-sparse-support/

Link 2 - Workflow_sets https://www.tmwr.org/workflow-sets.html

I am having trouble incorporating a blueprint into a workflow set.

In the example from Link 2 that defines a workflow_set

no_pre_proc <- 
   workflow_set(
      preproc = list(simple = model_vars), 
      models = list(MARS = mars_spec, CART = cart_spec, CART_bagged = bag_cart_spec,
                    RF = rf_spec, boosting = xgb_spec, Cubist = cubist_spec)
   )

and the way a blueprint is added to a workflow in Link 1

wf_sparse <- 
  workflow() %>%
  add_recipe(text_rec, blueprint = sparse_bp) %>%
  add_model(lasso_spec)
  
wf_default <- 
  workflow() %>%
  add_recipe(text_rec) %>%
  add_model(lasso_spec)

Where and how do I add the `blueprint = sparse_bp` option in the workflow_set above?

My attempt was

no_pre_proc <- 
   workflow_set(
      preproc = list(simple = model_vars), 
      models = list(MARS = mars_spec, CART = cart_spec, CART_bagged = bag_cart_spec,
                    RF = rf_spec, boosting = xgb_spec, Cubist = cubist_spec)) %>% 
  option_add(update_blueprint(blueprint = sparse_bp))

Running the race tuning gave me this error

Error: Problem with `mutate()` column `option`.
i `option = purrr::map(option, append_options, dots)`.
x All options should be named.
Run `rlang::last_error()` to see where the error occurred

<error/rlang_error>
There were 9 workflows that had no results.
Backtrace:
 1. ggplot2::autoplot(...)
 2. workflowsets:::autoplot.workflow_set(...)
 3. workflowsets:::rank_plot(...)
 4. workflowsets:::pick_metric(object, rank_metric, metric)
 6. workflowsets:::collect_metrics.workflow_set(x)
 7. workflowsets:::check_incompete(x, fail = TRUE)
 8. workflowsets:::halt(msg)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/rlang_error>
There were 9 workflows that had no results.
Backtrace:
    x
 1. +-ggplot2::autoplot(...)
 2. \-workflowsets:::autoplot.workflow_set(...)
 3.   \-workflowsets:::rank_plot(...)
 4.     \-workflowsets:::pick_metric(object, rank_metric, metric)
 5.       +-tune::collect_metrics(x)
 6.       \-workflowsets:::collect_metrics.workflow_set(x)
 7.         \-workflowsets:::check_incompete(x, fail = TRUE)
 8.           \-workflowsets:::halt(msg)
> 

Thanks,

1 Answer

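Thanks for asking this question; we definitely don't support this use case (passing non-default arguments to a recipe or model) very well right now. We have opened an issue where you can track our work on this.

In the meantime, you could try a somewhat hacky workaround: manually call `update_recipe()` on the workflows you are interested in: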

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip

data(parabolic)
set.seed(1)
split <- initial_split(parabolic)
train_set <- training(split)
test_set <- testing(split)

glmnet_spec <- 
  logistic_reg(penalty = 0.1, mixture = 0) %>%
  set_engine("glmnet")

rec <-
  recipe(class ~ ., data = train_set) %>%
  step_YeoJohnson(all_numeric_predictors())

sparse_bp <- hardhat::default_recipe_blueprint(composition = "dgCMatrix")

wfs_orig <-
  workflow_set(
    preproc = list(yj = rec, 
                   norm = rec %>% step_normalize(all_numeric_predictors())),
    models = list(regularized = glmnet_spec)
  ) 

new_wf <- 
  wfs_orig %>% 
  extract_workflow("yj_regularized") %>% 
  update_recipe(rec, blueprint = sparse_bp)

Created on 2021-12-09 by the reprex package (v2.0.1)

Then (I know this feels hacky right now) manually insert `new_wf` into the `wfs_orig$info[[1]]$workflow` slot to replace what is there.
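That last step could look roughly like the sketch below. It repeats the setup from the reprex so it can run on its own. It assumes that each element of the `info` column is a one-row tibble whose `workflow` column is a list holding the workflow; that is an internal detail of workflowsets and may change between versions. The names `idx` and `wfs_sparse` are made up for illustration.

```r
library(tidymodels)

# Same setup as in the reprex above
data(parabolic)
set.seed(1)
train_set <- training(initial_split(parabolic))

glmnet_spec <-
  logistic_reg(penalty = 0.1, mixture = 0) %>%
  set_engine("glmnet")

rec <-
  recipe(class ~ ., data = train_set) %>%
  step_YeoJohnson(all_numeric_predictors())

sparse_bp <- hardhat::default_recipe_blueprint(composition = "dgCMatrix")

wfs_orig <-
  workflow_set(
    preproc = list(yj = rec,
                   norm = rec %>% step_normalize(all_numeric_predictors())),
    models = list(regularized = glmnet_spec)
  )

new_wf <-
  wfs_orig %>%
  extract_workflow("yj_regularized") %>%
  update_recipe(rec, blueprint = sparse_bp)

# Look up the row by id instead of hard-coding position 1
idx <- which(wfs_orig$wflow_id == "yj_regularized")

# Work on a copy so the original workflow set stays untouched
wfs_sparse <- wfs_orig

# Assumption: info[[idx]] is a one-row tibble and `workflow` is a list column
wfs_sparse$info[[idx]]$workflow[[1]] <- new_wf

# Pull the workflow back out to check that the swap took effect
extract_workflow(wfs_sparse, "yj_regularized")
```

Repeat the `update_recipe()` call and the assignment for every workflow id that should use the sparse blueprint; workflows you leave alone keep the default blueprint.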

Answered 2021-12-09T17:54:21.930