I have an unbalanced panel data set, and I need to include all of the missing observations. For example, I have something like this:

         YEAR    VAR
 FIRM.1  YEAR.1  x.1
 FIRM.1  YEAR.3  x.2
 FIRM.2  YEAR.2  x.3
 FIRM.2  YEAR.3  x.4

I would like to add the missing rows, filled with NA:

         YEAR    VAR
 FIRM.1  YEAR.1  x.1
 FIRM.1  YEAR.2  NA
 FIRM.1  YEAR.3  x.2
 FIRM.2  YEAR.1  NA
 FIRM.2  YEAR.2  x.3
 FIRM.2  YEAR.3  x.4  

What is the most convenient way to do this?

1 Answer

I would use expand.grid and merge.

Assuming your data look like this:

mydf <- structure(list(FIRM = c("FIRM.1", "FIRM.1", "FIRM.2", "FIRM.2"),
    YEAR = c("YEAR.1", "YEAR.3", "YEAR.2", "YEAR.3"), VAR = c("x.1", "x.2",
    "x.3", "x.4")), .Names = c("FIRM", "YEAR", "VAR"),
    class = "data.frame", row.names = c(NA, -4L))
mydf
#     FIRM   YEAR VAR
# 1 FIRM.1 YEAR.1 x.1
# 2 FIRM.1 YEAR.3 x.2
# 3 FIRM.2 YEAR.2 x.3
# 4 FIRM.2 YEAR.3 x.4

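Use expand.grid to create the "complete" set of "FIRM" and "YEAR" combinations, and then merge it with the original data. Setting all.y = TRUE keeps every combination, so the ones missing from mydf get NA in "VAR".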

merge(mydf, expand.grid(FIRM = unique(mydf$FIRM), 
                        YEAR = unique(mydf$YEAR)), 
      all.y = TRUE)
#     FIRM   YEAR  VAR
# 1 FIRM.1 YEAR.1  x.1
# 2 FIRM.1 YEAR.2 <NA>
# 3 FIRM.1 YEAR.3  x.2
# 4 FIRM.2 YEAR.1 <NA>
# 5 FIRM.2 YEAR.2  x.3
# 6 FIRM.2 YEAR.3  x.4
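
For readers who prefer to see the two steps separately, here is a minimal sketch of the same idea. The names full_grid and balanced are only illustrative, and the explicit by, stringsAsFactors = FALSE, and final sort are my additions rather than part of the answer above. Note that unique(mydf$YEAR) only picks up years that appear for at least one firm, so a year missing for every firm would have to be supplied by hand.

```r
# Same toy data as above, rebuilt with data.frame() for brevity.
mydf <- data.frame(
  FIRM = c("FIRM.1", "FIRM.1", "FIRM.2", "FIRM.2"),
  YEAR = c("YEAR.1", "YEAR.3", "YEAR.2", "YEAR.3"),
  VAR  = c("x.1", "x.2", "x.3", "x.4"),
  stringsAsFactors = FALSE
)

# Step 1: every FIRM x YEAR combination (2 firms x 3 years = 6 rows).
full_grid <- expand.grid(
  FIRM = unique(mydf$FIRM),
  YEAR = unique(mydf$YEAR),
  stringsAsFactors = FALSE  # keep the keys as character, matching mydf
)

# Step 2: keep every grid row; combinations absent from mydf get NA in VAR.
balanced <- merge(mydf, full_grid, by = c("FIRM", "YEAR"), all.y = TRUE)

# Step 3: sort explicitly instead of relying on merge()'s default ordering.
balanced <- balanced[order(balanced$FIRM, balanced$YEAR), ]
rownames(balanced) <- NULL
```

Naming the key columns in by makes the join explicit, which may help if the real data have other columns in common with the grid.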

Answered 2013-10-01T09:34:58.127