Question:

I am following Julia Silge's tutorial (link here) on using tidymodels and recipes. I can work through most of it without trouble, but when I call the fit_resamples() function I get the error: Error: The first argument to [fit_resamples()] should be either a model or workflow.

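I am copying the code from the tutorial character for character, and everything runs fine, including printing validation_splits. However, as soon as I call fit_resamples() I get the error above (link to the relevant part of the tutorial). In case it is useful, the output of rlang::last_error() is: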

<error/rlang_error>

The first argument to [fit_resamples()] should be either a model or workflow.
Backtrace:
 
     1. tune::fit_resamples(...)
     2. tune:::fit_resamples.default(...)

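Does anyone know what is happening here, and how I can fix it? My understanding is that the first argument I pass to fit_resamples() is a model, namely children ~ ., and I have passed this same model to other functions earlier in the script without problems. See below for the code (and data) that produces the error on my machine, along with my sessionInfo().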

Reproducible example:

library(tidyverse)

## Bring in data
hotels <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-02-11/hotels.csv')

hotel_stays <- hotels %>% 
  filter(is_canceled == 0) %>% 
  mutate(children = case_when(children + babies > 0 ~ 'children',
                              TRUE ~ 'none'),
         required_car_parking_spaces = case_when(required_car_parking_spaces > 0 ~ 'parking', 
                                                 TRUE ~ 'none')) %>% 
  select(-is_canceled, -reservation_status, -babies)

hotels_df <- hotel_stays %>% 
  select(children, hotel, arrival_date_month, meal, adr, adults, 
         required_car_parking_spaces, total_of_special_requests, 
         stays_in_week_nights, stays_in_weekend_nights) %>% 
  mutate_if(is.character, factor)

## Build models
library(tidymodels)

set.seed(1234)
hotel_split <- initial_split(hotels_df)
hotel_train <- training(hotel_split)
hotel_test <- testing(hotel_split)

hotel_rec <- recipe(children ~ ., data = hotel_train) %>% 
  step_downsample(children) %>% 
  step_dummy(all_nominal(), -all_outcomes()) %>% 
  step_zv(all_numeric()) %>% 
  step_normalize(all_numeric()) %>% 
  prep()

test_proc <- bake(hotel_rec, new_data = hotel_test)

knn_spec <- nearest_neighbor() %>% 
  set_engine('kknn') %>% 
  set_mode('classification')
knn_fit <- knn_spec %>% 
  fit(children ~ ., 
      data=juice(hotel_rec))
knn_fit

## Evaluate models
set.seed(1234)
validation_splits <- mc_cv(juice(hotel_rec), prop = 0.9, strata = children)
validation_splits

## This is where I get the error
knn_res <- fit_resamples(
  children ~ ., 
  knn_spec,
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)

My sessionInfo():

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] GGally_2.1.2.9000  skimr_2.1.3        silgelib_0.1.1     forcats_0.5.1     
 [5] stringr_1.4.0      readr_1.4.0        tidyverse_1.3.1    knitr_1.33        
 [9] yardstick_0.0.8    workflowsets_0.0.2 workflows_0.2.2    tune_0.1.5        
[13] tidyr_1.1.3        tibble_3.1.2       rsample_0.1.0      recipes_0.1.16    
[17] purrr_0.3.4        parsnip_0.1.6      modeldata_0.1.0    infer_0.5.4       
[21] ggplot2_3.3.5      dplyr_1.0.7        dials_0.0.9        scales_1.1.1      
[25] broom_0.7.6        tidymodels_0.1.3  

loaded via a namespace (and not attached):
 [1] colorspace_2.0-1   ellipsis_0.3.2     class_7.3-19       base64enc_0.1-3   
 [5] fs_1.5.0           rstudioapi_0.13    listenv_0.8.0      furrr_0.2.3       
 [9] farver_2.1.0       prodlim_2019.11.13 fansi_0.5.0        lubridate_1.7.10  
[13] xml2_1.3.2         codetools_0.2-18   splines_4.1.0      jsonlite_1.7.2    
[17] pROC_1.17.0.1      dbplyr_2.1.1       shiny_1.6.0        compiler_4.1.0    
[21] httr_1.4.2         backports_1.2.1    assertthat_0.2.1   Matrix_1.3-3      
[25] fastmap_1.1.0      cli_2.5.0          later_1.2.0        htmltools_0.5.1.1 
[29] prettyunits_1.1.1  tools_4.1.0        igraph_1.2.6       gtable_0.3.0      
[33] glue_1.4.2         Rcpp_1.0.6         cellranger_1.1.0   DiceDesign_1.9    
[37] vctrs_0.3.8        iterators_1.0.13   timeDate_3043.102  gower_0.2.2       
[41] xfun_0.23          globals_0.14.0     rvest_1.0.0        mime_0.10         
[45] lifecycle_1.0.0    kknn_1.3.1         future_1.21.0      MASS_7.3-54       
[49] ipred_0.9-11       hms_1.1.0          promises_1.2.0.1   parallel_4.1.0    
[53] RColorBrewer_1.1-2 yaml_2.2.1         curl_4.3.1         rpart_4.1-15      
[57] reshape_0.8.8      stringi_1.6.2      foreach_1.5.1      lhs_1.1.1         
[61] lava_1.6.9         repr_1.1.3         rlang_0.4.11       pkgconfig_2.0.3   
[65] evaluate_0.14      lattice_0.20-44    htmlwidgets_1.5.3  labeling_0.4.2    
[69] tidyselect_1.1.1   parallelly_1.26.0  plyr_1.8.6         magrittr_2.0.1    
[73] R6_2.5.0           generics_0.1.0     DBI_1.1.1          pillar_1.6.1      
[77] haven_2.4.1        withr_2.4.2        survival_3.2-11    nnet_7.3-16       
[81] modelr_0.1.8       crayon_1.4.1       utf8_1.2.1         rmarkdown_2.8     
[85] progress_1.2.2     grid_4.1.0         readxl_1.3.1       reprex_2.0.0      
[89] digest_0.6.27      xtable_1.8-4       httpuv_1.6.1       GPfit_1.0-8       
[93] munsell_0.5.0 

1 Answer

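The blog post you are looking at is fairly old, and fit_resamples() was changed some time ago so that the workflow or model now has to come first. Hence the error message: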

The first argument to [fit_resamples()] should be either a model or workflow.
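
The backtrace in the question ends in tune:::fit_resamples.default, which suggests the call is dispatched on the class of the first argument and that a formula has no method of its own. Below is a minimal base R sketch of that mechanism, assuming ordinary S3 dispatch. The names my_fit_resamples and fake_spec are hypothetical stand-ins, not tune's actual code:

```r
# Hypothetical generic that mimics how the error arises; not tune's real source.
my_fit_resamples <- function(object, ...) {
  UseMethod("my_fit_resamples")
}

# Fallback method: reached when the first argument has no matching method,
# for example when it is a formula.
my_fit_resamples.default <- function(object, ...) {
  stop("The first argument should be either a model or workflow.", call. = FALSE)
}

# Method for objects that carry the class "model_spec".
my_fit_resamples.model_spec <- function(object, ...) {
  "dispatched on a model specification"
}

# Stand-in object with class "model_spec" (not a real parsnip specification).
fake_spec <- structure(list(), class = "model_spec")

# Model first: the model_spec method is chosen.
my_fit_resamples(fake_spec, children ~ .)

# Formula first: no formula method exists, so the default method raises the error.
try(my_fit_resamples(children ~ ., fake_spec))
```

Swapping the first two arguments changes which method is selected, which is why the same pieces work in one order and fail in the other.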

The fix is to put your model or workflow as the first argument, like this:

knn_res <- fit_resamples(
  knn_spec,
  children ~ ., 
  validation_splits,
  control = control_resamples(save_pred = TRUE)
)
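
The error message also mentions a workflow as an acceptable first argument. As a sketch of that alternative, the example below bundles a model and a formula into a workflow and passes it to fit_resamples(). To keep it self-contained it uses the built-in mtcars data and a linear model rather than the hotel data and kknn, so lm_spec, lm_wf, folds and the formula mpg ~ wt + hp are illustrative assumptions, not part of the original question:

```r
library(tidymodels)

# Resamples on a small built-in data set (illustrative only).
set.seed(1234)
folds <- vfold_cv(mtcars, v = 5)

# A simple model specification; the question uses nearest_neighbor() instead.
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Bundle the model and the formula into a single workflow object.
lm_wf <- workflow() %>%
  add_model(lm_spec) %>%
  add_formula(mpg ~ wt + hp)

# The workflow is the first argument; the resamples are passed by name.
lm_res <- fit_resamples(
  lm_wf,
  resamples = folds,
  control = control_resamples(save_pred = TRUE)
)

collect_metrics(lm_res)
```

With the objects from the question, the equivalent would presumably be workflow() %>% add_model(knn_spec) %>% add_formula(children ~ .), passed together with resamples = validation_splits.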
Answered 2021-06-26T16:03:09.087