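I am trying to program something (I think) very simple in R, but I can't seem to get it right. I have a dataset with 50 countries (numbered 1 to 50), 15 years per country, and about 20 variables per country. For now I am only testing one variable (OS) on my dependent variable (SMD). I want to do this in a loop, country by country, so that I get output for each country rather than one overall output.
I thought it would be sensible to create a subset first, so I can look at country 1 on its own; later my loop should increment the country number and test country 2. I expected the regression at the bottom of my code to give me the output for country 1 only, not for the whole dataset. However, I keep getting these errors: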
> pdata <- plm.data(newdata, index=c("Country","Date"))
series are constants and have been removed
> pooling <- plm(Y ~ X, data=pdata, model= "pooling")
series Country, xRegion are constants and have been removed
Error in model.matrix.pFormula(formula, data, rhs = 1, model = model, :
NA in the individual index variable
> summary(pooling)
Error in summary(pooling) : object 'pooling' not found
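I may be going about this all wrong, but I believe there is no point in programming the loop itself until this part works. Any advice on fixing my errors, or on other ways to program the loop, is much appreciated.
My code: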
rm(list = ls())
mydata <- read.table(file = file.choose(), header = TRUE, dec = ",")
names(mydata)
attach(mydata)
Y <- cbind(SMD)
X <- cbind(OS)
newdata <- subset(mydata, Country %in% c(1))
newdata
pdata <- plm.data(newdata, index=c("Country","Date"))
pooling <- plm(Y ~ X, data=pdata, model= "pooling")
summary(pooling)
EDIT: a data sample for the first 2 countries, which produces the same error
dput(mydata)
structure(list(Region = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("NAF", "SAME"),
class = "factor"), Country = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Date = c(1995L, 1996L, 1997L, 1998L,
1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L,
2010L, 2011L, 2012L, 2013L, 2014L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L,
2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L,
2012L, 2013L, 2014L), OS = structure(c(..., 22L, 20L, 23L, 9L, 7L, 5L, 2L,
1L, 4L, 3L, 6L, 10L, 11L, 13L, 11L, 8L, 26L, 25L, 31L, 29L, 28L, 21L, 30L,
24L, 24L, 16L, 11L, 14L, 12L, 17L, 18L, 29L, 32L, 32L, 33L, 34L), .Label =
c("51.5", "52.2", "55.6", "56.4", "56.7", "57.7", "57.8", "58.3", "59",
"59.2", "59.6", "59.9", "60.2", "60.4", "61.1", "61.2", "62.2", "62.3",
"62.8", "63.2", "63.3", "63.8", "63.9", "64.2", "64.3", "64.5", "64.7",
"65.3", "65.5", "65.6", "66.4", "68", "69.6", "70.7"), class = "factor"),
SMD = structure(c(7L, 12L, 20L, 21L, 17L, 15L, 13L, 10L, 14L, 22L, 23L, 33L,
1L, 32L, 29L, 34L, 28L, 25L, NA, NA, 9L, 6L, 8L, 4L, 2L, 35L, 3L, 36L, 5L,
11L, 16L, 18L, 24L, 19L, 26L, 31L, 27L, 30L, NA, NA), .Label =
c("100.3565662", "13.44788845", "13.45858747", "13.56815534", "15.05892471",
..., "18.3101351", "19.34226196", "21.25530884", "21.54423145", "23.75898948",
"24.08770926", "26.39817342", "29.44079001", "31.40605191", "34.46667996",
"34.52913657", "35.66070947", "36.4419931", "39.16875621", "44.0126137",
"45.72949566", "49.13062679", "54.83730247", "56.87886311", "59.80971583",
"60.5658962", "69.20148901", "70.91362874", "72.64845214", ...), class =
"factor")), .Names = c("Region", "Country", "Date", "OS", "SMD"), class =
"data.frame", row.names = c(NA, -40L))
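To make the goal concrete, here is a minimal sketch of the per-country output I am after. It rests on several assumptions that are not part of my real setup: it uses plain `lm()` instead of `plm()` (assuming an ordinary regression per country is acceptable), the data frame `toy` and all of its values are invented for illustration, and the factor-to-numeric conversion is there only because the dput output shows OS and SMD stored as factors.

```r
# Toy stand-in for mydata; all values below are invented for illustration.
set.seed(1)
toy <- data.frame(
  Country = rep(1:2, each = 10),
  Date    = rep(2001:2010, times = 2),
  OS      = factor(round(runif(20, 50, 70), 1)),
  SMD     = factor(round(runif(20, 10, 100), 2))
)

# OS and SMD are factors here, as in the dput output; convert via character
# so the numeric values (not the internal level codes) are recovered.
toy$OS  <- as.numeric(as.character(toy$OS))
toy$SMD <- as.numeric(as.character(toy$SMD))

# Fit one ordinary least-squares regression per country.
# Variables are referenced through `data = d`, not through attach().
fits <- lapply(split(toy, toy$Country), function(d) lm(SMD ~ OS, data = d))

# Print the coefficient table for each country.
for (id in names(fits)) {
  cat("Country", id, "\n")
  print(coef(summary(fits[[id]])))
}
```

The `split()`/`lapply()` pair plays the role of the loop: each element of `fits` is the model for one country, so `fits[["1"]]` would be the result for country 1 only.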