forecasting - The "augment()" function in fabletools

Question

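I am trying to extract forecast residuals using the fabletools package. I know that I can use the augment() function to extract the residuals of a fitted model, but I don't know how it works for forecast values, and I get the same results as for the fitted-model residuals. Here is an example: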

library(fable)
library(tsibble)
lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

## fitted model residuals
lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  augment()
# A tsibble: 120 x 7 [1M]
# Key:       key, .model [2]
   key     .model    index value .fitted  .resid   .innov
   <chr>   <chr>     <mth> <dbl>   <dbl>   <dbl>    <dbl>
 1 fdeaths ets    1974 Jan   901    837.   64.0   0.0765 
 2 fdeaths ets    1974 Feb   689    877. -188.   -0.214  
 3 fdeaths ets    1974 Mar   827    795.   31.7   0.0399 
 4 fdeaths ets    1974 Apr   677    624.   53.2   0.0852 
 5 fdeaths ets    1974 May   522    515.    7.38  0.0144 
 6 fdeaths ets    1974 Jun   406    453.  -47.0  -0.104  
 7 fdeaths ets    1974 Jul   441    431.    9.60  0.0223 
 8 fdeaths ets    1974 Aug   393    388.    4.96  0.0128 
 9 fdeaths ets    1974 Sep   387    384.    2.57  0.00668
10 fdeaths ets    1974 Oct   582    480.  102.    0.212  
# ... with 110 more rows

## forecast residuals
test <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")

## defining newdata
Data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan"))

augment(test, newdata = Data, type.predict = 'response')
# A tsibble: 120 x 7 [1M]
# Key:       key, .model [2]
   key     .model    index value .fitted  .resid   .innov
   <chr>   <chr>     <mth> <dbl>   <dbl>   <dbl>    <dbl>
 1 fdeaths ets    1974 Jan   901    837.   64.0   0.0765 
 2 fdeaths ets    1974 Feb   689    877. -188.   -0.214  
 3 fdeaths ets    1974 Mar   827    795.   31.7   0.0399 
 4 fdeaths ets    1974 Apr   677    624.   53.2   0.0852 
 5 fdeaths ets    1974 May   522    515.    7.38  0.0144 
 6 fdeaths ets    1974 Jun   406    453.  -47.0  -0.104  
 7 fdeaths ets    1974 Jul   441    431.    9.60  0.0223 
 8 fdeaths ets    1974 Aug   393    388.    4.96  0.0128 
 9 fdeaths ets    1974 Sep   387    384.    2.57  0.00668
10 fdeaths ets    1974 Oct   582    480.  102.    0.212  
# ... with 110 more rows

Any suggestions would be much appreciated.

score 0 · Accepted Answer

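I think you probably want forecast errors: the difference between the observed values and the forecasts. See https://otexts.com/fpp3/accuracy.html for a discussion. Quoting from that chapter: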

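Note that forecast errors are different from residuals in two ways. First, residuals are calculated on the training set while forecast errors are calculated on the test set. Second, residuals are based on one-step forecasts while forecast errors can involve multi-step forecasts.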
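
To make the two differences concrete, here is a minimal base-R sketch. It assumes a naive (last-value) forecast in place of the ETS model, purely so the arithmetic stays visible; the variable names are only illustrative.

```r
# Assumption: a naive (last-value) forecast stands in for the ETS model.
train <- window(fdeaths, end = c(1978, 12))   # training set: 1974 Jan to 1978 Dec
test  <- window(fdeaths, start = c(1979, 1))  # test set: 1979 Jan to 1979 Dec

# Residuals: one-step errors on the training set.
# Each fitted value is the previous observation, so the residual is y[t] - y[t-1].
resid_train <- diff(train)

# Forecast errors: multi-step errors on the test set.
# Every horizon h = 1, ..., 12 is forecast from the last training observation.
last_obs <- as.numeric(train)[length(train)]
fc_error <- as.numeric(test) - last_obs

length(resid_train)  # 59 (the first observation has no one-step fitted value)
length(fc_error)     # 12 (one error per forecast horizon)
```

The residuals never look beyond one step ahead and never touch the 1979 data, whereas the forecast errors are all measured against 1979 observations that the model did not see.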

Here is some code to compute the forecast errors in your example.

library(fable)
library(tsibble)
library(dplyr)

lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

## forecasts
fcast <- lung_deaths %>%
  dplyr::filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year") 

## defining newdata
new_data <- lung_deaths %>%
  dplyr::filter(index >= yearmonth("1979 Jan")) %>%
  rename(actual = value)

# Compute forecast errors
fcast %>%
  left_join(new_data) %>%
  mutate(error = actual - .mean)
#> Joining, by = c("key", "index")
#> # A fable: 24 x 7 [1M]
#> # Key:     key, .model [2]
#>    key     .model    index        value .mean actual error
#>    <chr>   <chr>     <mth>       <dist> <dbl>  <dbl> <dbl>
#>  1 fdeaths ets    1979 Jan N(783, 8522)  783.    821  37.5
#>  2 fdeaths ets    1979 Feb N(823, 9412)  823.    785 -38.4
#>  3 fdeaths ets    1979 Mar N(742, 7639)  742.    727 -14.8
#>  4 fdeaths ets    1979 Apr N(570, 4516)  570.    612  41.7
#>  5 fdeaths ets    1979 May N(461, 2951)  461.    478  16.9
#>  6 fdeaths ets    1979 Jun N(400, 2216)  400.    429  29.5
#>  7 fdeaths ets    1979 Jul N(378, 1982)  378.    405  27.1
#>  8 fdeaths ets    1979 Aug N(335, 1553)  335.    379  44.5
#>  9 fdeaths ets    1979 Sep N(331, 1520)  331.    393  62.1
#> 10 fdeaths ets    1979 Oct N(427, 2527)  427.    411 -15.7
#> # … with 14 more rows

Created on 2020-11-03 by the reprex package (v0.3.0)
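
If summary measures are enough and you don't need the individual errors, the accuracy() function from fabletools (the topic of the chapter linked above) should do the matching of forecasts to observations for you. A sketch under the same setup as the example, where `acc` is just an illustrative name:

```r
library(fable)
library(tsibble)
library(dplyr)

lung_deaths <- as_tsibble(cbind(mdeaths, fdeaths))

# Same model and forecasts as in the example above
fcast <- lung_deaths %>%
  filter(index < yearmonth("1979 Jan")) %>%
  model(
    ets = ETS(value ~ error("M") + trend("A") + season("A"))
  ) %>%
  forecast(h = "1 year")

# Pass the full data set: accuracy() aligns the 1979 observations with the
# forecasts, and the earlier observations are used for scaled measures
# such as MASE.
acc <- accuracy(fcast, lung_deaths)
acc
```

This returns one row per series with measures such as RMSE and MAE computed from the same forecast errors as the manual join.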
