
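I have nearly 2 years of daily time series data for the available space of a cluster (in GB). I am trying to use Facebook's Prophet to forecast future values. Some of the forecasts are negative. Since negative values make no sense here, I found that using the carrying capacity of the logistic growth model, with cap values, helps eliminate negative forecasts. I'm not sure whether this applies to my case, or how to obtain the cap value for my time series. Please help, as I'm new to this and quite confused. I'm using Python 3.6.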

import csv
import os
import sys

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from fbprophet import Prophet

plt.style.use('fivethirtyeight')




df = pd.read_excel("Data_Per_day.xlsx")
df1=df.filter(['cluster_guid','date','avail_capacity'],axis=1)
uniquevalues = np.unique(df1[['cluster_guid']].values)

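for cluster_id in uniquevalues:
    # Keep only the rows that belong to this cluster
    newdf = df1[df1['cluster_guid'] == cluster_id]

    # Sum the available capacity per cluster and day
    newdf1 = newdf.groupby(['cluster_guid', 'date'], as_index=False)['avail_capacity'].sum()
    #newdf11=newdf.groupby(['cluster_guid','date'],as_index=False)['total_capacity'].sum()
    #cap[cluster_id]=newdf11['total_capacity'].max()
    #print(cap[cluster_id])
    newdf1.set_index('cluster_guid', inplace=True)

    # Append this cluster's rows to the shared CSV; the header row is added afterwards
    newdf1.to_csv('my_csv.csv', mode='a', header=False)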
with open('my_csv.csv',newline='') as f:
    r = csv.reader(f)
    data = [line for line in r]
with open('my_csv.csv','w',newline='') as f:
    w = csv.writer(f)
    w.writerow(['cluster_guid','DATE_TAKEN','avail_capacity'])
    w.writerows(data)





in_df = pd.read_csv('my_csv.csv', parse_dates=True, index_col='DATE_TAKEN' )

in_df.to_csv('my_csv.csv')

dfs= pd.read_csv('my_csv.csv')
uni=dfs.cluster_guid.unique()

while True:
    try:
        print("Press Ctrl+C to exit, or enter the cluster guid to be forecasted")
        i = input('Please enter the cluster guid: ')
        if i not in uni:
            print('Please enter a valid cluster guid')
            continue

        # Select this cluster's rows from dfs (the frame read back from my_csv.csv)
        dfs1 = dfs.loc[dfs['cluster_guid'] == i].drop('cluster_guid', axis=1)
        dfs1.to_csv('dataframe'+i+'.csv', index=False)
        dfs2 = pd.read_csv('dataframe'+i+'.csv')
        dfs2['DATE_TAKEN'] = pd.DatetimeIndex(dfs2['DATE_TAKEN'])
        # Prophet expects the columns to be named ds (date) and y (value)
        dfs2 = dfs2.rename(columns={'DATE_TAKEN': 'ds', 'avail_capacity': 'y'})
        my_model = Prophet(interval_width=0.99)
        my_model.fit(dfs2)
        future_dates = my_model.make_future_dataframe(periods=30, freq='D')
        forecast = my_model.predict(future_dates)
        print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])
        my_model.plot(forecast, uncertainty=True)
        my_model.plot_components(forecast)
        plt.show()

        os.remove('dataframe'+i+'.csv')
        # my_csv.csv may already be gone after an earlier pass through the loop
        if os.path.exists('my_csv.csv'):
            os.remove('my_csv.csv')

    except KeyboardInterrupt:
        # Clean up the temporary CSV before exiting
        try:
            os.remove('my_csv.csv')
        except OSError:
            pass
        sys.exit(0)

1 Answer


A Box-Cox transform of order 0 (that is, a natural log transform) did the trick. Here are the steps:

1. Add 1 to each value (so as to avoid log(0))
2. Take the natural log of each value
3. Make forecasts
4. Take the exponent and subtract 1
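
As a rough illustration of the four steps, here is a minimal sketch using only NumPy. The function name `forecast_with_log_transform`, the sample `history` series, and the straight-line trend used in step 3 are all assumptions made for the example; the straight line is only a stand-in for whatever model you actually fit (Prophet, in the question).

```python
import numpy as np


def forecast_with_log_transform(y, horizon):
    """Forecast `horizon` steps ahead, working on the log(1 + y) scale."""
    y = np.asarray(y, dtype=float)

    # Steps 1-2: add 1 and take the natural log (np.log1p does both at once).
    z = np.log1p(y)

    # Step 3: make forecasts on the transformed scale.
    # Stand-in model (assumption): a straight-line trend fitted by least squares.
    t = np.arange(len(z))
    slope, intercept = np.polyfit(t, z, 1)
    t_future = np.arange(len(z), len(z) + horizon)
    z_hat = intercept + slope * t_future

    # Step 4: exponentiate and subtract 1 (np.expm1 does both at once).
    return np.expm1(z_hat)


# Hypothetical series: available space shrinking steadily from 500 GB to 20 GB.
history = np.linspace(500.0, 20.0, 60)
forecast = forecast_with_log_transform(history, horizon=30)
print(forecast.min())
```

With Prophet the same idea would apply: fit on a frame whose `y` column holds `np.log1p` of the original values, then pass `yhat`, `yhat_lower` and `yhat_upper` through `np.expm1` before reading them.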

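This way you won't get meaningfully negative forecasts: exp(x) - 1 is always greater than -1, so the back-transformed values can never fall below that. The log also has the nice property of converting multiplicative seasonality into additive form.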
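
To spell out the seasonality point under a simplifying assumption: suppose the series is purely multiplicative, a trend T_t times a seasonal factor S_t (both symbols are notation introduced here for illustration), and the values are large enough that the +1 shift is negligible. Then taking logs turns the product into a sum, which is the additive form that an additive model expects:

```latex
y_t = T_t \cdot S_t
\quad\Longrightarrow\quad
\log y_t = \log T_t + \log S_t
```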

answered 2018-09-03T12:30:00.820