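I'm an avid R user, but I've recently been switching to Python for a few different reasons. However, I'm having some trouble running a vector AR model in Python with statsmodels.
Q#1. I get an error when I run this, and I suspect it has something to do with the type of my vector.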
import numpy as np
import statsmodels.tsa.api
from statsmodels import datasets
import datetime as dt
import pandas as pd
from pandas import Series
from pandas import DataFrame
import os
df = pd.read_csv('myfile.csv')
speedonly = DataFrame(df['speed'])
results = statsmodels.tsa.api.VAR(speedonly)
Traceback (most recent call last):
File "<pyshell#14>", line 1, in <module>
results = statsmodels.tsa.api.VAR(speedonly)
File "C:\Python27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 336, in __init__
super(VAR, self).__init__(endog, None, dates, freq)
File "C:\Python27\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 40, in __init__
self._init_dates(dates, freq)
File "C:\Python27\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 54, in _init_dates
raise ValueError("dates must be of type datetime")
ValueError: dates must be of type datetime
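For reference, the traceback shows the error coming from the date handling in tsa_model.py (_init_dates), so one thing to check is whether the DataFrame has a datetime index rather than the default 0, 1, 2, ... integer index. Below is a minimal sketch of attaching one with pandas. The values, the start date, and the daily frequency are all made up for illustration, and it assumes the observations are evenly spaced; it is not confirmed to be the fix for the error above.

```python
import pandas as pd

# Hypothetical stand-in for the 'speed' column read from myfile.csv.
speedonly = pd.DataFrame({"speed": [559.984, 559.984, 560.1, 561.3, 560.7]})

# Assumption: evenly spaced observations. The start date and the daily
# frequency are invented here; use whatever matches the real data.
speedonly.index = pd.date_range(start="2011-01-01", periods=len(speedonly), freq="D")

# The index is now a DatetimeIndex instead of the default integer index.
print(type(speedonly.index))
print(speedonly.head())
```

If the CSV already contains a timestamp column, parsing that column and using it as the index would probably be preferable to generating dates like this.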
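Now, interestingly, when I run the VAR example from here https://github.com/statsmodels/statsmodels/blob/master/docs/source/vector_ar.rst#id5 it works fine.
I also tried the VAR model with a third, shorter vector, ts, from page 293 of Wes McKinney's "Python for Data Analysis", and it doesn't work.
OK, so now I'm thinking it's because the vectors are different types: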
>>> speedonly.head()
speed
0 559.984
1 559.984
2 559.984
3 559.984
4 559.984
>>> type(speedonly)
<class 'pandas.core.frame.DataFrame'> #DOESN'T WORK
>>> type(data)
<type 'numpy.ndarray'> #WORKS
>>> ts
2011-01-02 -0.682317
2011-01-05 1.121983
2011-01-07 0.507047
2011-01-08 -0.038240
2011-01-10 -0.890730
2011-01-12 -0.388685
>>> type(ts)
<class 'pandas.core.series.TimeSeries'> #DOESN'T WORK
So I converted speedonly to an ndarray... and it still doesn't work. But this time I get a different error:
>>> nda_speedonly = np.array(speedonly)
>>> results = statsmodels.tsa.api.VAR(nda_speedonly)
Traceback (most recent call last):
File "<pyshell#47>", line 1, in <module>
results = statsmodels.tsa.api.VAR(nda_speedonly)
File "C:\Python27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 345, in __init__
self.neqs = self.endog.shape[1]
IndexError: tuple index out of range
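The last frame of this traceback reads self.endog.shape[1], which only exists when the array is 2-D (rows are observations, columns are variables). Below is a minimal sketch of that failure mode with plain NumPy, using made-up values. It only illustrates the shape issue and assumes the array reaching VAR is 1-D; it does not call statsmodels.

```python
import numpy as np

# Made-up values standing in for a single series.
one_d = np.array([559.984, 559.984, 560.1])
print(one_d.shape)  # (3,)

# A 1-D array has only one entry in its shape tuple, so index 1 fails.
try:
    one_d.shape[1]
except IndexError as exc:
    print("IndexError:", exc)

# Reshaping to a single column gives a 2-D array with shape (n, 1).
two_d = one_d.reshape(-1, 1)
print(two_d.shape)  # (3, 1)
```

Even with a 2-D shape, a single column is one series, and a VAR is normally fit to two or more series side by side, so the shape is probably not the only thing to look at.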
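Any suggestions?
Q#2. I have exogenous feature variables in my data set that appear to be useful for predictions. Is the above model from statsmodels even the best one to use?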