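I'm used to doing linear regression models in Stata or R, but I'm moving more of my workflow over to Python.
The useful thing about both of those programs is that they intuitively know you do not care about all of the entity or time fixed effects in a linear model, so when estimating panel models they drop multicollinear dummies from the model (and report which ones they dropped).
While I understand that estimating models this way is not ideal and that one should be careful about which regressions to run (etc.), it is useful in practice, because it means you can see the results first and worry about some of the nuances of the dummies later (especially since you don't care about the dummies in a fully saturated fixed-effects model).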
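Let me give an example. The following requires linearmodels; it loads a dataset and tries to run a panel regression. It is a modified version of an example from their documentation.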
# Load the data (requires statsmodels and linearmodels)
import statsmodels.api as sm
from linearmodels.datasets import wage_panel
import pandas as pd
data = wage_panel.load()
year = pd.Categorical(data.year)
data = data.set_index(['nr', 'year'])
data['year'] = year
print(wage_panel.DESCR)
print(data.head())
# Run the panel regression
from linearmodels.panel import PanelOLS
exog_vars = ['exper','union','married']
exog = sm.add_constant(data[exog_vars])
mod = PanelOLS(data.lwage, exog, entity_effects=True, time_effects=True)
fe_te_res = mod.fit()
print(fe_te_res)
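This gives the following error:
AbsorbingEffectError: The model cannot be estimated. The included effects have fully absorbed one or more of the variables. This occurs when one or more of the dependent variable is perfectly explained using the effects included in the model.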
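As far as I can tell, the variable being absorbed is exper: I believe it rises by exactly one each year for every worker, so it is a worker-specific constant plus a year term, which entity and time effects together explain completely. Here is a minimal sketch of that on made-up data (the three workers and their starting experience are hypothetical, not taken from wage_panel), assuming a balanced panel so the two-way within transformation can be done by simple demeaning:

```python
import pandas as pd

# Synthetic balanced panel: 3 workers observed over 4 years (made-up data).
years = [1980, 1981, 1982, 1983]
start_exper = {1: 2, 2: 5, 3: 9}  # hypothetical experience in the first year
rows = [
    {"nr": nr, "year": y, "exper": e0 + (y - years[0])}
    for nr, e0 in start_exper.items()
    for y in years
]
df = pd.DataFrame(rows)

# Two-way within transformation for a balanced panel:
# subtract entity means and year means, then add back the grand mean.
exper = df["exper"].astype(float)
resid = (
    exper
    - df.groupby("nr")["exper"].transform("mean")
    - df.groupby("year")["exper"].transform("mean")
    + exper.mean()
)

# Largest remaining deviation after removing both sets of effects.
max_abs_resid = float(resid.abs().max())
print(max_abs_resid)
```

If the residual is numerically zero, there is no variation left in exper to identify a coefficient, which is consistent with the error message above.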
However, if you export the same data to Stata and estimate the model there, by first running:
data.drop(columns='year').to_stata('data.dta')
and then running the equivalent in your Stata do-file (after loading the data):
xtset nr year
xtreg lwage exper union married i.year, fe
Stata produces the following:
. xtreg lwage exper union married i.year, fe
note: 1987.year omitted because of collinearity

Fixed-effects (within) regression               Number of obs      =      4360
Group variable: nr                              Number of groups   =       545

R-sq:  within  = 0.1689                         Obs per group: min =         8
       between = 0.0000                                        avg =       8.0
       overall = 0.0486                                        max =         8

                                                F(9,3806)          =     85.95
corr(u_i, Xb)  = -0.1747                        Prob > F           =    0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0638624   .0032594    19.59   0.000     .0574721    .0702527
       union |   .0833697   .0194393     4.29   0.000     .0452572    .1214821
     married |   .0583372   .0183688     3.18   0.002     .0223235    .0943509
             |
        year |
       1981  |   .0496865   .0200714     2.48   0.013     .0103348    .0890382
       1982  |   .0399445    .019123     2.09   0.037     .0024521    .0774369
       1983  |   .0193513    .018662     1.04   0.300    -.0172373    .0559398
       1984  |   .0229574   .0186503     1.23   0.218    -.0136081    .0595229
       1985  |   .0081499   .0191359     0.43   0.670    -.0293677    .0456674
       1986  |   .0036329   .0200851     0.18   0.856    -.0357456    .0430115
       1987  |          0  (omitted)
             |
       _cons |   1.169184   .0231221    50.57   0.000     1.123851    1.214517
-------------+----------------------------------------------------------------
     sigma_u |  .40761229
     sigma_e |  .35343397
         rho |  .57083029   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Notice that Stata arbitrarily dropped 1987 from the regression, but still ran. Is there a way to get similar functionality in linearmodels or statsmodels?
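To be concrete about the behaviour I'm after, here is a rough hand-rolled sketch. `drop_collinear` is a hypothetical helper of my own, not part of linearmodels or statsmodels, and the panel is made up. It assumes entity effects are removed by demeaning within worker, and it greedily keeps each column only if it raises the rank of the design matrix, reporting the ones it drops:

```python
import numpy as np
import pandas as pd

def drop_collinear(X: pd.DataFrame):
    """Hypothetical helper: keep columns that raise the rank, report the rest."""
    kept, dropped = [], []
    for col in X.columns:
        trial = X[kept + [col]].to_numpy(dtype=float)
        if np.linalg.matrix_rank(trial) == len(kept) + 1:
            kept.append(col)
        else:
            dropped.append(col)
    return X[kept], dropped

# Made-up balanced panel in which exper rises by one per year for each worker.
years = [1980, 1981, 1982, 1983]
panel = pd.DataFrame(
    [(nr, y, e0 + y - 1980) for nr, e0 in [(1, 2), (2, 5), (3, 9)] for y in years],
    columns=["nr", "year", "exper"],
)

# Year dummies with the first year as the base category.
dummies = pd.get_dummies(panel["year"], prefix="year", drop_first=True, dtype=float)
X = pd.concat([panel[["exper"]].astype(float), dummies], axis=1)

# Entity fixed effects via the within transformation (demean by worker).
X_within = X - X.groupby(panel["nr"]).transform("mean")

X_kept, dropped = drop_collinear(X_within)
print("omitted because of collinearity:", dropped)
```

Because the columns are checked in order, the last redundant year dummy is the one that goes, which is the same kind of arbitrary choice Stata made above. I would much rather use a built-in option than maintain something like this myself.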