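I am trying to estimate a series of ARIMA models in a loop, passing in a different dependent variable from a list on each iteration. I am trying to do this in R with the fable package, but I cannot seem to pass the variable names from the list into the dplyr pipeline.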

I have a tsibble that looks like this:

# A tsibble: 320 x 5 [1Q]
# Key:       age, sex [4]
   quarter   age   sex     var var_log
   <yearqtr> <fct> <fct> <dbl>   <dbl>
 1 1990 Q1   18-25 male   50      3.91
 2 1990 Q2   18-25 male   49.9    3.91
 3 1990 Q3   18-25 male   51.1    3.93
 4 1990 Q4   18-25 male   52.6    3.96
 5 1991 Q1   18-25 male   52.1    3.95
 6 1991 Q2   18-25 male   51.4    3.94
 7 1991 Q3   18-25 male   52.0    3.95
 8 1991 Q4   18-25 male   51.2    3.94
 9 1992 Q1   18-25 male   50.8    3.93
10 1992 Q2   18-25 male   51.7    3.95
# ... with 310 more rows

The data was generated with the following code:

library(zoo)     # as.yearqtr()
library(tsibble) # as_tsibble()

set.seed(42)

quarter <- as.yearqtr(seq(as.Date("1990-01-01"), by="quarter", length.out = 80), format = "%Y-%m-%d")
age <- c('18-25', 'Over 25')
sex <- c('male', 'female')

df <- expand.grid(quarter, age, sex)
names(df) <- c('quarter', 'age', 'sex')
df$var <- NA

df[df$age=='18-25' & df$sex== 'male', ]$var <- cumsum(c(50, rnorm(n=nrow(df[df$age=='18-25' & df$sex== 'male', ])-1, mean =.1)))
df[df$age=='18-25' & df$sex== 'female', ]$var <- cumsum(c(60, rnorm(n=nrow(df[df$age=='18-25' & df$sex== 'female', ])-1, mean =.2)))
df[df$age=='Over 25' & df$sex== 'male', ]$var <- cumsum(c(50, rnorm(n=nrow(df[df$age=='Over 25' & df$sex== 'male', ])-1, mean = (-.1))))
df[df$age=='Over 25' & df$sex== 'female', ]$var <- cumsum(c(60, rnorm(n=nrow(df[df$age=='Over 25' & df$sex== 'female', ])-1, mean = (-.2))))

df$var_log <- log(df$var)

df <- as_tsibble(df, index=quarter, key=c('age', 'sex'))

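I am trying to write a function that takes a list of model specifications as its input and loops over them to estimate the models repeatedly, as follows: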

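library(dplyr)
library(fable)

select <- dplyr::select

estimate_models <- 
  function(mdl_list, # A list of a list of model specifications
           mdl_data) # The tsibble to fit the models to (assumed argument; the body refers to mdl_data)
  {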
    # This function is a single loop for estimating models. 
    for (i in 1:length(mdl_list)) {
      # Extract model information -----------------------------------------------

      mdl_name <- mdl_list[[i]][["mdl"]] # Name of model
      mdl_type <- sub("_.*","",mdl_name) # Type of model,
      mdl_vars_ari <- mdl_list[[i]][["ari"]] # Contains the dependent variables in ARIMA models


        # ARIMA model estimation --------------------------------------------------

        print(paste0("Estimating ", mdl_name, "..."))
        # Estimate the ARIMA model
        mdl_vars_ari_enquo <- enquo(mdl_vars_ari)
        mdl <- mdl_data %>%
          model(arima = ARIMA(!!mdl_vars_ari_enquo)) %>%
          forecast(h=28) %>% # Forecast 28 periods ahead
          fortify() %>% # Extracts the forecast as a dataframe
          filter(.level==95) %>% # Filter results where the confidence level is 95%
          mutate(mcv_fnl = exp(!!mdl_vars_ari_enquo), quarter = as.Date(quarter, format="%y%m%d")) %>% # Take the exponent and set type of the quarter column to 'Date'
          select(-contains(".")) %>% # Remove extra columns
          rbind(mdl_data) # Rbind fitted values to the model data
      }
    }

mdl_list, which contains the specification details, looks like this:

mdl_list <- list(list(mdl = "arima_model", ari = "var_log", d = "df"))

When I try to run the code, I get the following error:

model(mdl_data, arima=ARIMA(!!mdl_vars_ari_enquo))
Warning: 4 errors (1 unique) encountered for arima
[4] Could not find an appropriate ARIMA model.

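This seems to be related to how ARIMA(!!mdl_vars_ari_enquo) resolves the variable-name argument. Passing in var_log directly works fine, but passing in mdl_vars_ari does not, which I think is down to dplyr's non-standard evaluation.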

I have read Hadley Wickham's guide here: https://dplyr.tidyverse.org/articles/programming.html, but neither quo() nor enquo() seemed to work. I also tried as.name(), to no avail.
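To make the string-versus-symbol distinction concrete, here is a minimal sketch that only builds the calls with rlang and does not fit any model. It assumes the rlang package is installed; var_name is a hypothetical stand-in for mdl_vars_ari, and ARIMA is never evaluated here, only quoted.

```r
library(rlang)

# Hypothetical stand-in for mdl_vars_ari: the column name stored as a string
var_name <- "var_log"

# Unquoting the string directly splices a character constant into the call
call_with_string <- expr(ARIMA(!!var_name))

# Converting the string to a symbol first splices a column reference instead
call_with_symbol <- expr(ARIMA(!!sym(var_name)))

print(call_with_string) # ARIMA("var_log")
print(call_with_symbol) # ARIMA(var_log)
```

If this is indeed the cause, then something like model(arima = ARIMA(!!sym(mdl_vars_ari))) might be worth trying inside the loop, though I have not confirmed that it resolves the error above.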

Please let me know if you need any more details to answer my question.
