
I have built a large weekly time-series workflow in drake for 4273 time series (4273 * 10 models).

Initially I tried to create the whole workflow with the fable package. It is very convenient for training models on a grouped tsibble, but after several trials I ran into serious memory-management problems: my RStudio server with 32 cores and 244 GB of RAM crashed repeatedly when I tried to serialize the models.

So I broke my workflow apart completely to track down the bottleneck, going from:

[drake workflow graph]

to:

[drake workflow graph]

then to:

[drake workflow graph]

and finally to:

[drake workflow graph]

In my training code (e.g. prophet_multiplicative) I use the future package to train these multiple fable models, then compute the accuracy metrics and save them. But I don't know how to remove this object from the drake workflow afterwards:

  • Should I just remove the object with rm?
  • Does drake offer a way to give each workflow component its own environment?
  • Is that the right solution at all?

My idea is to run each individual technique serially while training the 4273 models of one given technique in parallel. That way I hope not to crash the server; once all the models are trained, I can join the accuracy metrics, pick the best model for each of my time series, and then prune each individual binary to generate the forecasts.
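The scheduling I have in mind (serial across techniques, parallel within one technique) can be sketched roughly like this; trainModels and processed_data stand in for my actual helpers and their full argument lists:

```r
library(drake)

# One target per technique. make(jobs = 1) runs the targets one at a
# time, while each trainModels() call fans out over the 4273 series
# internally via future::plan(multisession).
plan <- drake_plan(
  prophet_multiplicative = trainModels(processed_data, model_type = "prophet_multiplicative"),
  auto_arima             = trainModels(processed_data, model_type = "auto_arima")
)

make(plan, jobs = 1, garbage_collection = TRUE)
```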

Any advice on my approach is very welcome. Note that my hardware resources are quite limited, so a bigger server is not an option.

BR/E


3 Answers


There is always a trade-off between memory and speed. To save memory, we have to unload some targets from the session, which usually costs time later reading them back from storage. drake's default behavior favors speed. So in your case, I would set memory_strategy = "autoclean" and garbage_collection = TRUE in make() and related functions. The user manual has a chapter dedicated to memory management: https://books.ropensci.org/drake/memory.html.

Also, I recommend returning small targets wherever possible. Instead of returning an entire fitted model, you could return a small data frame of model summaries, which is friendlier to both memory and storage. On top of that, you can choose one of the specialized storage formats at https://books.ropensci.org/drake/plans.html#special-data-formats-for-targets for even more efficiency.
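A minimal sketch of both suggestions combined (train_all_models is a hypothetical stand-in for your fitting step): keep the heavy mable in drake's qs storage format and expose only a small accuracy table as a separate target:

```r
library(drake)

plan <- drake_plan(
  # Heavy fitted-model target: stored via the fast qs format
  # (requires the qs package to be installed).
  models = target(
    train_all_models(processed_data),
    format = "qs"
  ),
  # Small target: only the accuracy summary travels between sessions.
  accuracy_table = fabletools::accuracy(models)
)

make(plan, memory_strategy = "autoclean", garbage_collection = TRUE)
```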

Answered 2020-07-04T16:45:25.267

garbage_collection = TRUE was already set. I will try adding autoclean. Regarding the file format, I save the models as .qs files with the qs library, using the saveModels function below:

saveModels <- function(models, directory_out, max_forecasting_horizon, max_multisession_cores) {
  print("Saving the all-mighty mable")
  # qs::qsave() writes the .qs format; base save() has no nthreads argument
  qs::qsave(x = models,
            file = paste0(directory_out, attributes(models)$model, "_horizon_",
                          max_forecasting_horizon, ".qs"),
            nthreads = max_multisession_cores)
  #saveRDS(object = models, file = paste0(directory_out, "ts_models_horizon_", max_forecasting_horizon, ".rds"))
  print("End workflow")
}

In my plan, this is used as:

  prophet_multiplicative = trainModels(input_data = processed_data,
                                       max_forecast_horizon = argument_parser$horizon,
                                       max_multisession_cores = 6,
                                       model_type = "prophet_multiplicative"),
  accuracy_prophet_multiplicative = accuracy_explorer(type = "train", models = prophet_multiplicative,
                                                      max_forecast_horizon = argument_parser$horizon,
                                                      directory_out = "/data1/my_folder/"),
  saving_prophet_multiplicative = saveModels(models = prophet_multiplicative,
                                             directory_out = "/data1/my_folder/", # closing quote was missing here
                                             max_forecasting_horizon = argument_parser$horizon,
                                             max_multisession_cores = 6)

Following your suggestion, my make() call now looks like this:

make(plan = plan, verbose = 2, 
     log_progress = TRUE,
     recover = TRUE,
     lock_envir = FALSE,
     garbage_collection = TRUE,
     memory_strategy = "autoclean")

Any suggestions?

BR/E

Answered 2020-07-04T18:12:50.077

Thanks for the quick answer, I really appreciate it. Now I am facing another problem: I left the script running overnight via nohup and found the following in the logs:

[1] "DB PROD Connected"
[1] "DB PROD Connected"
[1] "Getting RAW data"
[1] "Maximum forecasting horizon is 52, fetching weekly data"
[1] "Removing duplicates if we have them"
[1] "Original data has 1860590 rows"
[1] "Data without duplicates has 1837995 rows"
`summarise()` regrouping output by 'A', 'B' (override with `.groups` argument)
[1] "Removing non active customers"
[1] "Data without duplicates and without active customers has 1654483 rows"
0.398 sec elapsed
[1] "Removing customers with last data older than 1.5 years"
[1] "Data without duplicates, customers that are not active and old customers has 1268610 rows"
0.223 sec elapsed
[1] "Augmenting data"
12.103 sec elapsed
[1] "Creating tsibble"
7.185 sec elapsed
[1] "Filling gaps for not breaking groups"
9.568 sec elapsed
[1] "Training theta models for forecasting horizon 52"
[1] "Using 12 sessions from as future::plan()"
Repacking large object
[1] "Training auto_arima models for forecasting horizon 52"
[1] "Using 12 sessions from as future::plan()"
Error: target auto_arima failed.
diagnose(auto_arima)error$message:
  object 'ts_models' not found
diagnose(auto_arima)error$calls:
  1. └─global::trainModels(...)
In addition: Warning message:
9 errors (2 unique) encountered for theta
[3] function cannot be evaluated at initial parameters
[6] Not enough data to estimate this ETS model.

Execution halted
            

The object ts_models is created inside my training script; it is basically the object returned by my function trainModels. Could it be that the input data argument got cleaned, and that is why it fails?

Another question: for some reason my models were not saved after the theta models finished training. Is there a way to tell drake not to jump to the next model before it has computed the accuracy of one model and saved its .qs file?

My training function is below:

trainModels <- function(input_data, max_forecast_horizon, model_type, max_multisession_cores) {

  options(future.globals.maxSize = 1500000000)
  future::plan(multisession, workers = max_multisession_cores) #breaking infrastructure once again ;)
  set.seed(666) # reproducibility
  
    if(max_forecast_horizon <= 104) {
      
      print(paste0("Training ", model_type, " models for forecasting horizon ", max_forecast_horizon))
      print(paste0("Using ", max_multisession_cores, " sessions from as future::plan()"))
      
      if(model_type == "prophet_multiplicative") {
        
        ts_models <- input_data %>% model(prophet = fable.prophet::prophet(snsr_val_clean ~ season("week", 2, type = "multiplicative") + 
                                                                             season("month", 2, type = "multiplicative")))
        
      } else if(model_type == "prophet_additive") {
        
        ts_models <- input_data %>% model(prophet = fable.prophet::prophet(snsr_val_clean ~ season("week", 2, type = "additive") + 
                                                                             season("month", 2, type = "additive")))
        
      } else if(model_type == "auto.arima") {
        
        ts_models <- input_data %>% model(auto_arima = ARIMA(snsr_val_clean))
        
      } else if(model_type == "arima_with_yearly_fourier_components") {
        
        ts_models <- input_data %>% model(auto_arima_yf = ARIMA(snsr_val_clean ~ fourier("year", K = 2)))
        
      } else if(model_type == "arima_with_monthly_fourier_components") {
        
        ts_models <- input_data %>% model(auto_arima_mf = ARIMA(snsr_val_clean ~ fourier("month", K=2)))
        
      } else if(model_type == "regression_with_arima_errors") {
        
        ts_models <- input_data %>% model(auto_arima_mf_reg = ARIMA(snsr_val_clean ~ month + year  + quarter + qday + yday + week))
        
      } else if(model_type == "tslm") {
    
        ts_models <- input_data %>% model(tslm_reg_all = TSLM(snsr_val_clean ~ year  + quarter + month + day + qday + yday + week + trend()))
     
      } else if(model_type == "theta") {
        
        ts_models <- input_data %>% model(theta = THETA(snsr_val_clean ~ season()))
        
      } else if(model_type == "ensemble") {
        
        # Note: the THETA() and TSLM() components were previously nested
        # inside the prophet() call by mistake; each belongs as its own
        # argument to combination_model().
        ts_models <- input_data %>%
          model(ensemble = combination_model(
            ARIMA(snsr_val_clean),
            ARIMA(snsr_val_clean ~ fourier("month", K = 2)),
            fable.prophet::prophet(snsr_val_clean ~ season("week", 2, type = "multiplicative") +
                                     season("month", 2, type = "multiplicative")),
            THETA(snsr_val_clean ~ season()),
            TSLM(snsr_val_clean ~ year + quarter + month + day + qday + yday + week + trend())
          ))
        
      }
      
    } 
  
    else if(max_forecast_horizon > 104) {
      
        print(paste0("Training ", model_type, " models for forecasting horizon ", max_forecast_horizon))
        print(paste0("Using ", max_multisession_cores, " sessions from as future::plan()"))
        
        
        if(model_type == "prophet_multiplicative") {
          
          ts_models <- input_data %>% model(prophet = fable.prophet::prophet(snsr_val_clean ~ season("month", 2, type = "multiplicative") + 
                                                                               season("year", 2, type = "multiplicative"))) # second term was a duplicated "month" season
          
        } else if(model_type == "prophet_additive") {
          
          ts_models <- input_data %>% model(prophet = fable.prophet::prophet(snsr_val_clean ~ season("month", 2, type = "additive") + 
                                                                               season("year", 2, type = "additive")))
          
        } else if(model_type == "auto.arima") {
          
          ts_models <- input_data %>% model(auto_arima = ARIMA(snsr_val_clean))
          
        } else if(model_type == "arima_with_yearly_fourier_components") {
          
          ts_models <- input_data %>% model(auto_arima_yf = ARIMA(snsr_val_clean ~ fourier("year", K = 2)))
          
        } else if(model_type == "arima_with_monthly_fourier_components") {
          
          ts_models <- input_data %>% model(auto_arima_mf = ARIMA(snsr_val_clean ~ fourier("month", K=2)))
          
        } else if(model_type == "regression_with_arima_errors") {
          
          ts_models <- input_data %>% model(auto_arima_mf_reg = ARIMA(snsr_val_clean ~ month + year  + quarter + qday + yday))
          
        } else if(model_type == "tslm") {
          
          ts_models <- input_data %>% model(tslm_reg_all = TSLM(snsr_val_clean ~ year  + quarter + month + day + qday + yday + trend()))
          
        } else if(model_type == "theta") {
          
          ts_models <- input_data %>% model(theta = THETA(snsr_val_clean ~ season()))
          
        } else if(model_type == "ensemble") {
          
          # Note: the THETA() and TSLM() components were previously nested
          # inside the prophet() call by mistake; each belongs as its own
          # argument to combination_model().
          ts_models <- input_data %>%
            model(ensemble = combination_model(
              ARIMA(snsr_val_clean),
              ARIMA(snsr_val_clean ~ fourier("month", K = 2)),
              fable.prophet::prophet(snsr_val_clean ~ season("month", 2, type = "multiplicative") +
                                       season("year", 2, type = "multiplicative")),
              THETA(snsr_val_clean ~ season()),
              TSLM(snsr_val_clean ~ year + quarter + month + day + qday + yday + trend())
            ))
          
        }
    }
  
  return(ts_models)
}
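A side note on the function above: the two horizon branches differ only in their formulas, so the if/else ladder could be collapsed into a lookup table of model definitions (fable quotes its formulas lazily, so the specs can be built up front). A sketch with an abbreviated, hypothetical spec list; the !!! splicing assumes fabletools' tidy-dots support:

```r
# Hypothetical refactor: declare each model spec once, keyed by model_type.
model_specs <- list(
  auto_arima = fable::ARIMA(snsr_val_clean),
  theta      = fable::THETA(snsr_val_clean ~ season()),
  tslm       = fable::TSLM(snsr_val_clean ~ year + quarter + month + trend())
)

trainModels <- function(input_data, model_type, max_multisession_cores) {
  future::plan(future::multisession, workers = max_multisession_cores)
  spec <- model_specs[[model_type]]
  # Failing fast here would also have turned the opaque
  # "object 'ts_models' not found" error into a clear message.
  if (is.null(spec)) stop("Unknown model_type: ", model_type)
  fabletools::model(input_data, !!!stats::setNames(list(spec), model_type))
}
```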

BR/E

Answered 2020-07-05T08:51:18.483