r - How to compare 2 models in R using the plm package?

Question

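I am running fixed-effects models with the plm package in R, and I would like to know how to compare two models to see which one fits better.

For example, here is the code for the two models I built: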

library(plm)

eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core, 
               data=euro, 
               model="within")

eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core, 
                data=euro,
                model="within")

I know that with a regular lm call I can compare two models by running an ANOVA test, but that does not seem to work here. I always get the following error:

Error in UseMethod("anova") : 
  no applicable method for 'anova' applied to an object of class "c('plm', 'panelmodel')"

Does anyone know how to handle this with the plm package? Would a Wald test be appropriate?

score 5 · Accepted Answer

The following code answers a similar question on Cross Validated, which was also about testing a (joint) hypothesis in plm. Applying the code to your problem should be straightforward.

library(plm)  # Use plm
library(car)  # Use F-test in command linearHypothesis
library(tidyverse)
data(egsingle, package = 'mlmRev')
dta <- egsingle %>% mutate(Female = recode(female, .default = 0L, `Female` = 1L))
plm1 <- plm(math ~ Female * (year), data = dta, index = c('childid', 'year', 'schoolid'), model = 'within')

# Output from `summary(plm1)` --- I deleted a few lines to save space.
# Coefficients:
#                 Estimate Std. Error t-value Pr(>|t|)    
# year-1.5          0.8842     0.1008    8.77   <2e-16 ***
# year-0.5          1.8821     0.1007   18.70   <2e-16 ***
# year0.5           2.5626     0.1011   25.36   <2e-16 ***
# year1.5           3.1680     0.1016   31.18   <2e-16 ***
# year2.5           3.9841     0.1022   38.98   <2e-16 ***
# Female:year-1.5  -0.0918     0.1248   -0.74     0.46    
# Female:year-0.5  -0.0773     0.1246   -0.62     0.53    
# Female:year0.5   -0.0517     0.1255   -0.41     0.68    
# Female:year1.5   -0.1265     0.1265   -1.00     0.32    
# Female:year2.5   -0.1465     0.1275   -1.15     0.25    
# ---

xnames <- names(coef(plm1)) # names of all coefficients in 'plm1'
# Use 'grepl' to build a logical vector that is TRUE for every coefficient
# whose name starts with 'Female:'. The approach is generic: to pick up
# every coefficient whose name starts with 'year' instead, just write
# 'grepl('^year+', xnames)'.
picked <- grepl('^Female:+', xnames)
linearHypothesis(plm1, xnames[picked])

# Hypothesis:
# Female:year - 1.5 = 0
# Female:year - 0.5 = 0
# Female:year0.5 = 0
# Female:year1.5 = 0
# Female:year2.5 = 0
# 
# Model 1: restricted model
# Model 2: math ~ Female * (year)
# 
#   Res.Df Df Chisq Pr(>Chisq)
# 1   5504                    
# 2   5499  5  6.15       0.29
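
For the case in the question, where the two models differ by a single regressor, the same `linearHypothesis` call can test that one coefficient directly. The sketch below is only an illustration: because the `euro` data are not available, it assumes the `Grunfeld` data set shipped with plm, with `capital` standing in for `fx`. The model name `fe_full` is made up for this example.

```r
library(plm)  # panel models
library(car)  # linearHypothesis()

# Stand-in data: Grunfeld ships with plm (assumption, replaces 'euro').
data("Grunfeld", package = "plm")

# Larger model; the smaller model would simply drop 'capital'.
fe_full <- plm(inv ~ value + capital, data = Grunfeld,
               index = c("firm", "year"), model = "within")

# Wald test of H0: the coefficient on 'capital' is zero, i.e. the
# restricted (smaller) model is adequate.
wald <- linearHypothesis(fe_full, "capital = 0")
print(wald)
```

A small p-value in the second row speaks against dropping the variable. With a single restriction, the reported chi-square statistic is the squared t-statistic of that coefficient.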

score 1 · Accepted Answer

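I struggled with this too, but eventually came up with the following solution (with the help of a friend who has a PhD). Using your example, see the sample solution below.

Use the AIC criterion to compare the panel models, as follows: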

library(plm)

eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core, 
               data=euro, 
               model="within")

eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core, 
                data=euro,
                model="within")

# AIC = log(RSS/N) + 2K/N  for linear models
# AIC = log(RSS/n) + 2K/n  for panel models

Sum1 <- summary(eurofix)
RSS1 <- sum(Sum1$residuals^2)
K1 <- max(eurofix$assign)
N1 <- length(eurofix$residuals)
n1 <- N1 - K1 - eurofix$df.residual

AIC_eurofix = log(RSS1/n1) + (2*K1)/n1

Sum2 <- summary(eurofix2)
RSS2 <- sum(Sum2$residuals^2)
K2 <- max(eurofix2$assign)
N2 <- length(eurofix2$residuals)
n2 <- N2 - K2 - eurofix2$df.residual

AIC_eurofix2 = log(RSS2/n2) + (2*K2)/n2

The model with the lower AIC value is the preferred one!
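
The calculation above repeats the same steps for each model, so it can be wrapped in a small helper. This is a minimal sketch that reuses the formula from this answer, not a likelihood-based AIC. The function name `panel_aic` is hypothetical, the `Grunfeld` data from plm stand in for `euro`, and K is taken as the number of estimated slope coefficients (`length(coef(model))`) instead of `max(model$assign)`; the two counts can differ when a term expands into several coefficients.

```r
library(plm)

# Stand-in data: Grunfeld ships with plm (assumption, replaces 'euro').
data("Grunfeld", package = "plm")

fe_full  <- plm(inv ~ value + capital, data = Grunfeld,
                index = c("firm", "year"), model = "within")
fe_small <- plm(inv ~ value, data = Grunfeld,
                index = c("firm", "year"), model = "within")

# Hypothetical helper implementing AIC = log(RSS/n) + 2K/n as above.
panel_aic <- function(model) {
  res <- as.numeric(residuals(model))
  rss <- sum(res^2)                          # residual sum of squares
  k   <- length(coef(model))                 # number of slope coefficients
  n   <- length(res) - k - model$df.residual # as in the answer's n1 / n2
  log(rss / n) + 2 * k / n
}

aic_full  <- panel_aic(fe_full)
aic_small <- panel_aic(fe_small)
c(full = aic_full, small = aic_small)
```

Both models must be fitted on the same observations for the comparison to make sense.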

score 0 · Accepted Answer

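Have you tried calling anova() on the plm objects? I can't tell from the way you wrote your question. If you haven't, give it a try.

If your question is more a statistical one about choosing a method to help you judge between models than a technical one, the answer really depends on how you define "fit". If the only difference between the two models is that fx is included in the first one, several statistics can assess the extent to which your model minimizes squared error (e.g., R^2) or falls short because of a non-random residual distribution (e.g., VIF).

If you want to know whether including fx yields a model that fits your data while remaining reasonably resistant to overfitting, consider using BIC. I usually favor BIC because it penalizes additional parameters more aggressively than other model-fit statistics (such as AIC). The model with the lowest BIC tends to be the best-fitting model (though IMO you should also confirm with a Wald test/F-test, especially since nested models like yours are their ideal use case). You should be able to get BIC values for plm model objects as follows: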

anova(model1, model2)

If that doesn't work, I have found this function, which I use alongside the lme4 package, useful:

BIC(model1, model2)

If I have misunderstood the question, please let me know - and let us know what you find!
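
If neither call returns a value for plm objects, a BIC can also be computed by hand from the residual sum of squares. The sketch below is an assumption-laden illustration, not plm's own method: it assumes Gaussian errors, drops additive constants that are identical for both models, counts the absorbed individual effects as parameters, and uses the `Grunfeld` data from plm in place of `euro`. The helper name `gaussian_bic` is made up.

```r
library(plm)

# Stand-in data: Grunfeld ships with plm (assumption, replaces 'euro').
data("Grunfeld", package = "plm")

fe_full  <- plm(inv ~ value + capital, data = Grunfeld,
                index = c("firm", "year"), model = "within")
fe_small <- plm(inv ~ value, data = Grunfeld,
                index = c("firm", "year"), model = "within")

# Hypothetical helper: BIC = N * log(RSS / N) + p * log(N), up to a constant.
gaussian_bic <- function(model) {
  res   <- as.numeric(residuals(model))
  n_obs <- length(res)
  # Parameters used up: slopes plus the absorbed fixed effects.
  n_par <- n_obs - model$df.residual
  n_obs * log(sum(res^2) / n_obs) + n_par * log(n_obs)
}

bic_full  <- gaussian_bic(fe_full)
bic_small <- gaussian_bic(fe_small)
c(full = bic_full, small = bic_small)
```

Only the difference between the two values is meaningful here, and only when both models use the same observations.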
