I recently came across pycausalimpact.
https://pypi.org/project/pycausalimpact/
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import ArmaProcess
from causalimpact import CausalImpact
np.random.seed(12345)
ar = np.r_[1, 0.9]
ma = np.array([1])
arma_process = ArmaProcess(ar, ma)
X = 100 + arma_process.generate_sample(nsample=100)
y = 1.2 * X + np.random.normal(size=100)
y[70:] += 5
data = pd.DataFrame({'y': y, 'X': X}, columns=['y', 'X'])
pre_period = [0, 69]
post_period = [70, 99]
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
This generic code runs fine on the example provided. Now I am trying to run my own data through the same pycausalimpact example, as shown below.
import sys
import os
import numpy as np
import pandas as pd
from IPython.core.pylabtools import figsize
import statsmodels as sm
from statsmodels.tsa.statespace.structural import UnobservedComponents
from statsmodels.tsa.arima_process import ArmaProcess
from matplotlib import pyplot as plt
from causalimpact import CausalImpact
import warnings
y = fus['days']
X = fus[['market_cat',
         'mmepool_cat',
         'submarket_cat',
         'local_market_cat',
         'project_type_cat',
         'site_status_cat',
         'city_cat',
         'state_cat']]
The part I am struggling with is here:
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
How do I need to prepare "data", "pre_period", and "post_period" so that this works with my particular dataset? Here is some of my actual data.
y =
X =
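Basically, I want to see whether there is some kind of causal relationship between the independent variables and the dependent variable (days). Or... is there a better or alternative way to determine and measure causality? Thanks.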
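For reference, here is a minimal sketch of how the three inputs might be assembled. It assumes that the rows are ordered in time and that there is a known row where an intervention starts; neither is established by the data shown here, and without them a pre/post comparison may not be meaningful. pycausalimpact treats the first column of the DataFrame as the response and the remaining columns as covariates. The helper name prepare_causalimpact_inputs, the intervention_start parameter, and the toy frame standing in for fus are all hypothetical.

```python
import pandas as pd


def prepare_causalimpact_inputs(df, response, covariates, intervention_start):
    """Build (data, pre_period, post_period) from a time-ordered DataFrame.

    intervention_start is the positional index of the first
    post-intervention row (a hypothetical parameter for this sketch).
    """
    # Put the response first; the remaining columns act as covariates.
    data = df[[response] + list(covariates)].reset_index(drop=True)
    n = len(data)
    if not 0 < intervention_start < n:
        raise ValueError("intervention_start must fall strictly inside the data")
    # Both periods are inclusive [start, end] pairs of row positions.
    pre_period = [0, intervention_start - 1]
    post_period = [intervention_start, n - 1]
    return data, pre_period, post_period


# Toy stand-in for `fus` (made-up values, for illustration only).
toy = pd.DataFrame({
    "days": [10.0, 11.0, 9.5, 10.5, 14.0, 15.0],
    "market_cat": [1, 1, 2, 2, 1, 2],
    "state_cat": [3, 3, 3, 4, 4, 4],
})

data, pre_period, post_period = prepare_causalimpact_inputs(
    toy, "days", ["market_cat", "state_cat"], intervention_start=4
)
print(list(data.columns), pre_period, post_period)
```

The resulting data, pre_period, and post_period could then be passed to CausalImpact(data, pre_period, post_period) exactly as in the generic example, provided the time-ordering assumption holds.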