r - Hierarchical forecasting using R

Question

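I am using the fable package to forecast a hierarchical time series in which the nodes are not all at the same depth. The use case is forecasting contacts at the country -> state -> district level. When aggregated, the forecasts must add up to the country level (the lower-level forecasts must sum to the higher-level forecasts).
https://robjhyndman.com/papers/Foresight-hts-final.pdf
Below is the code I tried for forecasting on the test data.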

library(fable)
library(tsibble)
library(tsibbledata)
library(lubridate)
library(dplyr)

# selecting train data
train_df <- tourism %>%
  filter(year(Quarter) <= 2014 & Region %in% c("MacDonnell", "Melbourne"))

# selecting test data
test_df <- tourism %>%
  filter(year(Quarter) > 2014 & Region %in% c("MacDonnell", "Melbourne"))

# fitting ets model with reconciliation
ets_fit <- train_df %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips)) %>%
  model(ets=ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets))
# forecasting on test data
fcasts_test <- forecast(ets_fit, test_df)

I get the following error:

Error: Provided data contains a different key structure to the models.
Run `rlang::last_error()` to see where the error occurred.

How can I fix this?

score 2 · Accepted Answer

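You changed the key structure with aggregate_key() before fitting the model, so the key structure of the models no longer matches that of the test set. The test set needs to be created after calling aggregate_key().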

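However, you can't filter on one of the keys after creating the aggregates, because the aggregation information would then be incomplete. Filter by Region first, then aggregate.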
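To see the mismatch for yourself, a minimal sketch (the names `raw` and `agg` are only illustrative) is to count the series before and after aggregate_key(). The aggregated tsibble contains the extra "total" series that the models are fitted to, which the unaggregated test set lacks.

```r
library(fable)
library(tsibble)
library(tsibbledata)
library(dplyr)

# Bottom-level series only: 2 regions x 4 purposes
raw <- tourism %>%
  filter(Region %in% c("MacDonnell", "Melbourne"))

# Bottom-level series plus every aggregate implied by the key specification
agg <- raw %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips))

n_keys(raw)
#> [1] 8
n_keys(agg)
#> [1] 25
```

Any new data passed to forecast() has to carry the same 25 series as the aggregated training data, which is why the train/test split should happen after aggregation.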

Here is an example that does what you want.

library(fable)
library(tsibble)
library(tsibbledata)
library(lubridate)
library(dplyr)

# Aggregate data as required
agg_tourism <- tourism %>%
  filter(Region %in% c("MacDonnell", "Melbourne")) %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips))

# Select training data
train_df <- agg_tourism %>%
  filter(year(Quarter) <= 2014)

# Select test data
test_df <- agg_tourism %>%
  filter(year(Quarter) > 2014)

# Fit ets model with reconciliation
ets_fit <- train_df %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets))
# forecasting on test data
fcasts_test <- forecast(ets_fit, test_df)

fcasts_test
#> # A fable: 600 x 7 [1Q]
#> # Key:     Purpose, State, Region, .model [50]
#>    Purpose  State              Region     .model Quarter      Trips .mean
#>    <chr*>   <chr*>             <chr*>     <chr>    <qtr>     <dist> <dbl>
#>  1 Business Northern Territory MacDonnell ets    2015 Q1 N(5.1, 21)  5.12
#>  2 Business Northern Territory MacDonnell ets    2015 Q2 N(5.1, 21)  5.12
#>  3 Business Northern Territory MacDonnell ets    2015 Q3 N(5.1, 21)  5.12
#>  4 Business Northern Territory MacDonnell ets    2015 Q4 N(5.1, 21)  5.12
#>  5 Business Northern Territory MacDonnell ets    2016 Q1 N(5.1, 21)  5.12
#>  6 Business Northern Territory MacDonnell ets    2016 Q2 N(5.1, 21)  5.12
#>  7 Business Northern Territory MacDonnell ets    2016 Q3 N(5.1, 21)  5.12
#>  8 Business Northern Territory MacDonnell ets    2016 Q4 N(5.1, 21)  5.12
#>  9 Business Northern Territory MacDonnell ets    2017 Q1 N(5.1, 21)  5.12
#> 10 Business Northern Territory MacDonnell ets    2017 Q2 N(5.1, 21)  5.12
#> # … with 590 more rows

fcasts_test %>%
  filter(Region == "Melbourne", Purpose == "Visiting") %>%
  autoplot(agg_tourism)
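Since the question asks for forecasts that add up across levels, one way to check this is to compare the top-level point forecast with the sum of the bottom-level point forecasts for each model. This is a sketch rather than part of the original answer: the helper names (`means`, `top`, `bottom`, `coherence`) are made up for illustration, and it only checks the `.mean` column, not the full forecast distributions.

```r
library(fable)
library(tsibble)
library(tsibbledata)
library(lubridate)
library(dplyr)

agg_tourism <- tourism %>%
  filter(Region %in% c("MacDonnell", "Melbourne")) %>%
  aggregate_key(Purpose * (State / Region), Trips = sum(Trips))

ets_fit <- agg_tourism %>%
  filter(year(Quarter) <= 2014) %>%
  model(ets = ETS(Trips)) %>%
  reconcile(ets_adjusted = min_trace(ets))

fcasts_test <- forecast(ets_fit, filter(agg_tourism, year(Quarter) > 2014))

# Keep only the point forecasts, as a plain tibble
means <- fcasts_test %>%
  as_tibble() %>%
  select(Purpose, State, Region, .model, Quarter, .mean)

# Top of the hierarchy: every key is aggregated
top <- means %>%
  filter(is_aggregated(Purpose), is_aggregated(State), is_aggregated(Region)) %>%
  select(.model, Quarter, top_mean = .mean)

# Bottom of the hierarchy: no key is aggregated
# (Region is nested in State, so a non-aggregated Region implies a non-aggregated State)
bottom <- means %>%
  filter(!is_aggregated(Purpose), !is_aggregated(Region)) %>%
  group_by(.model, Quarter) %>%
  summarise(bottom_sum = sum(.mean), .groups = "drop")

# gap should be numerically zero for ets_adjusted, and generally non-zero for ets
coherence <- left_join(top, bottom, by = c(".model", "Quarter")) %>%
  mutate(gap = abs(top_mean - bottom_sum))

coherence
```

The unreconciled `ets` forecasts are produced independently for each series, so they are not expected to add up; only the `ets_adjusted` rows should show a gap of (numerically) zero.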

Created on 2020-12-26 by the reprex package (v0.3.0)
