我正在尝试在 Python 中使用 htsprophet 包。我正在使用下面的示例代码。此示例来自https://github.com/CollinRooney12/htsprophet/blob/master/htsprophet/runHTS.py。我得到的问题是 ValueError "holidays must be a DataFrame with 'ds' and 'holiday' column
。我想知道是否有解决方法,因为我显然有一个数据框假期,其中包含 ds 和假期两列。我相信错误来自预测器文件中 fbprophet 的依赖包之一。我想知道是否有任何我需要添加的东西,或者是否有人添加了一些东西来解决这个问题。
import pandas as pd
from htsprophet.hts import hts, orderHier, makeWeekly
from htsprophet.htsPlot import plotNode, plotChild, plotNodeComponents
import numpy as np
#%% Random data (Change this to whatever data you want)
date = pd.date_range("2015-04-02", "2017-07-17")
date = np.repeat(date, 10)
medium = ["Air", "Land", "Sea"]
businessMarket = ["Birmingham","Auburn","Evanston"]
platform = ["Stone Tablet","Car Phone"]
mediumDat = np.random.choice(medium, len(date))
busDat = np.random.choice(businessMarket, len(date))
platDat = np.random.choice(platform, len(date))
sessions = np.random.randint(1000,10000,size=(len(date),1))
data = pd.DataFrame(date, columns = ["day"])
data["medium"] = mediumDat
data["platform"] = platDat
data["businessMarket"] = busDat
data["sessions"] = sessions
#%% Run HTS
##
# Make the daily data weekly (optional)
##
data1 = makeWeekly(data)
##
# Put the data in the format to run HTS, and get the nodes input (a list of list that describes the hierarchical structure)
##
data2, nodes = orderHier(data, 1, 2, 3)
##
# load in prophet inputs (Running HTS runs prophet, so all inputs should be gathered beforehand)
# Made up holiday data
##
holidates = pd.date_range("12/25/2013","12/31/2017", freq = 'A')
holidays = pd.DataFrame(["Christmas"]*5, columns = ["holiday"])
holidays["ds"] = holidates
holidays["lower_window"] = [-4]*5
holidays["upper_window"] = [0]*5
##
# Run hts with the CVselect function (this decides which hierarchical aggregation method to use based on minimum mean Mean Absolute Scaled Error)
# h (which is 12 here) - how many steps ahead you would like to forecast. If youre using daily data you don't have to specify freq.
#
# NOTE: CVselect takes a while, so if you want results in minutes instead of half-hours pick a different method
##
myDict = hts(data2, 52, nodes, holidays = holidays, method = "FP", transform = "BoxCox")
##