python - Panel data regression with fixed effects using Python

Question

I have the following panel stored in df:

	state	district	year	y	constant	x1	x2	time
0	01	01001	2009	12	1	0.956007	639673	1
1	01	01001	2010	20	1	0.972175	639673	2
2	01	01001	2011	22	1	0.988343	639673	3
3	01	01002	2009	0	1	0	33746	1
4	01	01002	2010	1	1	0.225071	33746	2
5	01	01002	2011	5	1	0.450142	33746	3
6	01	01003	2009	0	1	0	45196	1
7	01	01003	2010	5	1	0.427477	45196	2
8	01	01003	2011	9	1	0.854955	45196	3

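y is the number of protests in each district
constant is a column of ones
x1 is the share of the district's area covered by a mobile network provider
x2 is the population of each district (note that it is fixed over time)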

How can I run the following model in Python?

Here is what I tried:

import pandas as pd

# Transform `x2` to match the model (population multiplied by the time index)
df['x2'] = df['x2'].multiply(df['time'], axis=0)
# District fixed effects
df['delta'] = pd.Categorical(df['district'])
# State-time fixed effects
df['eta'] = pd.Categorical(df['state'] + df['year'].astype(str))
# Set the (entity, time) index; set_index returns a new frame, so assign it back
df = df.set_index(['district','year'])

from linearmodels.panel import PanelOLS
m = PanelOLS(dependent=df['y'], exog=df[['constant','x1','x2','delta','eta']])

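ValueError: exog does not have full column rank. If you wish to proceed with model estimation irrespective of the numerical accuracy of coefficient estimates, you can set rank_check=False.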

What am I doing wrong?

score 4 · Accepted Answer

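I dug through the documentation, and the solution turned out to be quite simple.

After setting the index and converting the fixed-effects columns to the pandas.Categorical type (see the question above):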

# Import model
from linearmodels.panel import PanelOLS

# Model
m = PanelOLS(dependent=df['y'],
             exog=df[['constant','x1','x2']],
             entity_effects=True,
             time_effects=False,
             other_effects=df['eta'])
m.fit(cov_type='clustered', cluster_entity=True)

In other words, do not pass your fixed-effects columns to exog.
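One plausible reason the original call failed its rank check is that the two sets of fixed effects overlap. Within a state, the district dummies and the state-year dummies both add up to the same state indicator, so the columns are linearly dependent. The sketch below reproduces that with plain pandas and NumPy on a made-up panel. The `toy` frame and its values are hypothetical, and the dummy expansion is done by hand here rather than by linearmodels.

```python
import numpy as np
import pandas as pd

# Hypothetical toy panel: 2 states x 2 districts each x 3 years
toy = pd.DataFrame(
    [(s, s + d, y)
     for s in ("01", "02")
     for d in ("001", "002")
     for y in (2009, 2010, 2011)],
    columns=["state", "district", "year"],
)
toy["eta"] = toy["state"] + toy["year"].astype(str)

# Expand both fixed effects into dummy columns, dropping one level of each
district_d = pd.get_dummies(toy["district"], drop_first=True, dtype=float)
eta_d = pd.get_dummies(toy["eta"], drop_first=True, dtype=float)
const = pd.Series(1.0, index=toy.index, name="constant")

# Design matrix: constant + district dummies + state-year dummies
X = pd.concat([const, district_d, eta_d], axis=1).to_numpy()
n_cols = X.shape[1]
rank = np.linalg.matrix_rank(X)

print(n_cols, rank)  # 9 8 -> one column is redundant
```

Dropping one level per factor is not enough here, because the redundancy comes from the nesting of districts and state-years inside states, not from the constant alone.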

You should pass them via entity_effects (boolean), time_effects (boolean), or other_effects (pandas.Categorical).
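To see why a boolean flag can stand in for a whole set of dummy columns, recall that entity fixed effects are conventionally absorbed by demeaning each variable within its entity. The following sketch uses only pandas and NumPy, not linearmodels, and checks on made-up data that explicit district dummies and within-district demeaning give the same slope. The `toy` frame, the `alpha` offsets, and the true slope of 2.0 are all hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy panel: 4 districts observed over 3 periods
toy = pd.DataFrame({
    "district": np.repeat(["a", "b", "c", "d"], 3),
    "x1": rng.normal(size=12),
})
# District-specific intercepts (the "fixed effects") plus a little noise
alpha = toy["district"].map({"a": 0.0, "b": 1.0, "c": -2.0, "d": 3.0})
toy["y"] = 2.0 * toy["x1"] + alpha + rng.normal(scale=0.1, size=12)

# (a) Explicit dummies: regress y on x1 and one dummy per district
dummies = pd.get_dummies(toy["district"], dtype=float).to_numpy()
X_lsdv = np.column_stack([toy["x1"].to_numpy(), dummies])
beta_lsdv = np.linalg.lstsq(X_lsdv, toy["y"].to_numpy(), rcond=None)[0][0]

# (b) Within transformation: subtract district means, then regress
groups = toy.groupby("district")
y_w = (toy["y"] - groups["y"].transform("mean")).to_numpy()
x_w = (toy["x1"] - groups["x1"].transform("mean")).to_numpy()
beta_within = float(np.dot(x_w, y_w) / np.dot(x_w, x_w))

print(np.isclose(beta_lsdv, beta_within))  # True
```

Because the effect is removed rather than estimated as extra columns, the regressor matrix passed as exog only needs the substantive variables.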
