r - Fill in missing data and replicate other data in R

Question

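My data is yearly, and not every cluster has data for all years from 1990 to 2010. First, I want to add the missing years for every id. Then I want to fill in the remaining fields for the years I added, and create NA values for the other fields that I want to predict. How can I do this in R?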

LAT      LONG      Cluster_ID  year
13.5330  -15.4180  1           1990
13.5330  -15.4180  1           1992
13.5330  -15.4180  1           1995
13.5330  -15.4180  1           2010
13.5330  -15.4170  2           1995
13.5330  -15.4170  2           1997
13.5330  -15.4170  2           2005
13.5340  -14.9350  3           2005
13.5340  -14.9350  3           2006
13.5340  -15.9170  4           2010
13.3670  -14.6190  5           2006

score 1 · Accepted Answer

You can simply create an additional data frame that contains every possible combination of Cluster_ID and year, like this:

mycomb <- expand.grid(Cluster_ID = unique(mydat$Cluster_ID),
          year = 1990:2010)

With that, you can do a merge:

merge(mydat,mycomb,all=TRUE)

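to get the desired result. Rows for years that were missing in mydat will have NA in LAT and LONG. See also ?expand.grid and ?merge.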

Test code:

zz <- textConnection('LAT        LONG    Cluster_ID year
13.5330 -15.4180   1            1990
13.5330 -15.4180   1            1992
13.5330 -15.4180   1            1995
13.5330 -15.4180   1            2010
13.5330 -15.4170   2            1995
13.5330 -15.4170   2            1997
13.5330 -15.4170   2             2005
13.5340 -14.9350   3             2005
13.5340 -14.9350   3             2006
13.5340 -15.9170   4             2010
13.3670 -14.6190   5             2006')

mydat <- read.table(zz,header=TRUE)
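
The merge above leaves LAT and LONG as NA for the added years, while the question also asks for those fields to be carried over. Here is a minimal sketch of one way to do that, assuming LAT and LONG never change within a Cluster_ID (which holds for the sample data). The score column is a made-up placeholder for whatever field is to be predicted; it is not part of the original data.

# Sample data from the question
mydat <- read.table(text = "LAT LONG Cluster_ID year
13.5330 -15.4180 1 1990
13.5330 -15.4180 1 1992
13.5330 -15.4180 1 1995
13.5330 -15.4180 1 2010
13.5330 -15.4170 2 1995
13.5330 -15.4170 2 1997
13.5330 -15.4170 2 2005
13.5340 -14.9350 3 2005
13.5340 -14.9350 3 2006
13.5340 -15.9170 4 2010
13.3670 -14.6190 5 2006", header = TRUE)

# Hypothetical measured field (placeholder values, for illustration only)
mydat$score <- seq_len(nrow(mydat))

# Every Cluster_ID x year combination, as in the answer
mycomb <- expand.grid(Cluster_ID = unique(mydat$Cluster_ID),
                      year = 1990:2010)

# One row per cluster with its coordinates
# (assumes coordinates are constant within a cluster)
coords <- unique(mydat[, c("Cluster_ID", "LAT", "LONG")])

# Attach the coordinates to every cluster-year row
filled <- merge(mycomb, coords, by = "Cluster_ID")

# Bring in the measured field; all.x = TRUE keeps the added years
# and leaves score as NA there
result <- merge(filled, mydat[, c("Cluster_ID", "year", "score")],
                by = c("Cluster_ID", "year"), all.x = TRUE)

head(result)

If the coordinates can vary within a cluster, coords will contain more than one row per Cluster_ID and the first merge will duplicate cluster-years, so check nrow(coords) against the number of clusters before relying on this.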
