
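What is the most elegant way to transform an outcome column (mpg in this example) that was transformed by a recipe back to its original scale? The solution can be generic (if one exists) or work only for the log and normalize steps (shown below).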

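Possibly useful links: a generic solution is discussed here, but I don't think it has been implemented yet. A solution for the R function scale is provided here, but I'm not sure it helps in this case.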

library(recipes)

data <- tibble(mtcars) %>% 
    select(cyl, mpg)

rec <- recipe(mpg ~ ., data = data) %>%
    step_log(all_numeric()) %>%
    step_normalize(all_numeric()) %>%
    prep()

data_baked <- bake(rec, new_data = data)

# model fitting, predictions, etc...

# how to invert/transform back predictions (estimates) and true outcomes


1 Answer


The way to get any values you need out of a recipe's transformations is to tidy() the recipe and then use dplyr verbs to pull out what you need.

library(recipes)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> 
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#> 
#>     step

data <- tibble(mtcars) %>% 
  select(cyl, mpg)

rec <- recipe(mpg ~ ., data = data) %>%
  step_log(all_numeric()) %>%
  step_normalize(all_numeric(), id = "normalize_num") %>%
  prep()

There are two ways to pick out a recipe step, by number or by id, and you pass either one to tidy() as an argument:

## notice that you can identify steps by `number` or `id`
tidy(rec)
#> # A tibble: 2 x 6
#>   number operation type      trained skip  id           
#>    <int> <chr>     <chr>     <lgl>   <lgl> <chr>        
#> 1      1 step      log       TRUE    FALSE log_LYuaY    
#> 2      2 step      normalize TRUE    FALSE normalize_num

## choose by number
tidy(rec, number = 1)
#> # A tibble: 2 x 3
#>   terms  base id       
#>   <chr> <dbl> <chr>    
#> 1 cyl    2.72 log_LYuaY
#> 2 mpg    2.72 log_LYuaY
## choose by id, which we set above (otherwise the step gets a random id such as log_LYuaY)
tidy(rec, id = "normalize_num")
#> # A tibble: 4 x 4
#>   terms statistic value id           
#>   <chr> <chr>     <dbl> <chr>        
#> 1 cyl   mean      1.78  normalize_num
#> 2 mpg   mean      2.96  normalize_num
#> 3 cyl   sd        0.309 normalize_num
#> 4 mpg   sd        0.298 normalize_num

Once we know which step we want, we can use dplyr verbs to extract exactly the values needed to transform back, for example for mpg.

## extract out value
tidy(rec, id = "normalize_num") %>%
  filter(terms == "mpg", statistic == "mean") %>%
  pull(value)
#>      mpg 
#> 2.957514

Created on 2021-01-25 by the reprex package (v0.3.0)
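
As a minimal sketch of how the extracted values could be used for the inversion itself: the recipe applies a log and then a normalize step, so the reverse is to multiply by the sd, add the mean, and then raise the log base to that power. The block below rebuilds the recipe from above, reads the mean, sd, and log base for mpg from tidy(), and checks the round trip on the baked data. The helper name invert_mpg is hypothetical and not part of recipes, and the sketch assumes model predictions are on the same scale as the baked mpg column.

```r
library(recipes)

data <- tibble(mtcars) %>%
  select(cyl, mpg)

rec <- recipe(mpg ~ ., data = data) %>%
  step_log(all_numeric()) %>%
  step_normalize(all_numeric(), id = "normalize_num") %>%
  prep()

# Statistics learned by the normalize step for mpg (on the log scale)
norm_stats <- tidy(rec, id = "normalize_num") %>%
  filter(terms == "mpg")

mpg_mean <- norm_stats %>%
  filter(statistic == "mean") %>%
  pull(value) %>%
  unname()

mpg_sd <- norm_stats %>%
  filter(statistic == "sd") %>%
  pull(value) %>%
  unname()

# Base used by the log step (step 1); the default is exp(1)
log_base <- tidy(rec, number = 1) %>%
  filter(terms == "mpg") %>%
  pull(base) %>%
  unname()

# Hypothetical helper: undo the steps in reverse order,
# first the normalization, then the log
invert_mpg <- function(x) {
  log_base^(x * mpg_sd + mpg_mean)
}

# Round trip: baked outcome values mapped back to the original scale
data_baked <- bake(rec, new_data = data)
mpg_back <- invert_mpg(data_baked$mpg)

all.equal(mpg_back, data$mpg)
```

The same helper could be applied to a vector of predictions, for example invert_mpg(preds), where preds is a placeholder for whatever the fitted model returns on the transformed scale.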

Answered 2021-01-26T01:30:51.323