I have imputed missing values using Amelia, creating 5 multiply imputed datasets. Now I would like to split this set of imputed datasets by year, e.g. one set for year >= 1990 and one set for year <= 1990. Any ideas how I can do this? Many thanks!

library(Amelia)
data(freetrade)
freetrade$year  # splitting variable

# Impute the missing data
a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country")

# How can the resulting imputed datasets be split?
1 Answer

Amelia returns an object that contains a list of data frames (one per imputation). You can inspect the structure of this object with str():

> library(Amelia)
> data(freetrade)
> 
> a.out <- amelia(freetrade, m=5, ts="year", cs="country")
-- Imputation 1 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 2 --

  1  2  3  4  5  6  7  8  9 10 11 12 13

-- Imputation 3 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

-- Imputation 4 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

-- Imputation 5 --

  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20


> str(a.out)
List of 12
 $ imputations:List of 5
  ..$ imp1:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 30.6 22.4 41.3 26.8 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp2:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 33.6 59.7 41.3 18.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp3:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 48.5 32.9 41.3 47.2 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp4:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 18.4 45.5 41.3 16.9 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..$ imp5:'data.frame':    171 obs. of  10 variables:
  .. ..$ year    : int [1:171] 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 ...
  .. ..$ country : chr [1:171] "SriLanka" "SriLanka" "SriLanka" "SriLanka" ...
  .. ..$ tariff  : num [1:171] 15.3 44.4 41.3 40.1 31 ...
  .. ..$ polity  : num [1:171] 6 5 5 5 5 5 5 5 5 5 ...
  .. ..$ pop     : num [1:171] 14988000 15189000 15417000 15599000 15837000 ...
  .. ..$ gdp.pc  : num [1:171] 461 474 489 508 526 ...
  .. ..$ intresmi: num [1:171] 1.94 1.96 1.66 2.8 2.26 ...
  .. ..$ signed  : num [1:171] 0 0 1 0 0 0 0 1 0 0 ...
  .. ..$ fiveop  : num [1:171] 12.4 12.5 12.3 12.3 12.3 ...
  .. ..$ usheg   : num [1:171] 0.259 0.256 0.266 0.299 0.295 ...
  ..- attr(*, "class")= chr [1:2] "mi" "list"
 $ m          : num 5
 $ missMatrix : logi [1:171, 1:10] FALSE FALSE FALSE FALSE FALSE FALSE ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:10] "year" "country" "tariff" "polity" ...
 $ overvalues : NULL
 $ theta      : num [1:9, 1:9, 1:5] -1 -0.08456 -0.03404 -0.00193 0.06483 ...
 $ mu         : num [1:8, 1:5] -0.08456 -0.03404 -0.00193 0.06483 -0.11178 ...
 $ covMatrices: num [1:8, 1:8, 1:5] 0.7881 -0.1869 -0.0531 0.2121 -0.0819 ...
 $ code       : num 1
 $ message    : chr "Normal EM convergence."
 $ iterHist   :List of 5
  ..$ : num [1:15, 1:3] 44 34 25 28 26 25 24 22 20 14 ...
  ..$ : num [1:13, 1:3] 44 27 24 22 22 21 18 17 14 11 ...
  ..$ : num [1:19, 1:3] 44 34 29 27 26 26 25 24 23 21 ...
  ..$ : num [1:15, 1:3] 44 34 27 28 23 24 23 23 19 19 ...
  ..$ : num [1:20, 1:3] 44 32 30 27 24 23 23 23 23 21 ...
 $ arguments  :List of 22
  ..$ idvars      : NULL
  ..$ logs        : NULL
  ..$ ts          : num 1
  ..$ cs          : num 2
  ..$ empri       : NULL
  ..$ tolerance   : num 1e-04
  ..$ polytime    : NULL
  ..$ splinetime  : NULL
  ..$ lags        : NULL
  ..$ leads       : NULL
  ..$ intercs     : logi FALSE
  ..$ sqrts       : NULL
  ..$ lgstc       : NULL
  ..$ noms        : NULL
  ..$ ords        : NULL
  ..$ priors      : NULL
  ..$ autopri     : num 0.05
  ..$ bounds      : NULL
  ..$ max.resample: num 100
  ..$ startvals   : num 0
  ..$ overimp     : NULL
  ..$ emburn      : num [1:2] 0 0
  ..- attr(*, "class")= chr [1:2] "ameliaArgs" "list"
 $ orig.vars  : chr [1:10] "year" "country" "tariff" "polity" ...
 - attr(*, "class")= chr "amelia"

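From this you can see that the "imputations" element of the a.out object holds your data frames, so you can reference each imputation from there. For example, a.out$imputations[[1]]$year gives you the year column of the first imputation. If you want to do this across every imputation, you can use an apply function or a loop. To illustrate, consider: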

> sapply(a.out$imputations,function(x) head(x$year))
     imp1 imp2 imp3 imp4 imp5
[1,] 1981 1981 1981 1981 1981
[2,] 1982 1982 1982 1982 1982
[3,] 1983 1983 1983 1983 1983
[4,] 1984 1984 1984 1984 1984
[5,] 1985 1985 1985 1985 1985
[6,] 1986 1986 1986 1986 1986
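
The loop alternative mentioned above could look roughly like the sketch below. To keep it runnable without Amelia, it uses a small made-up list called imps as a stand-in for a.out$imputations; the values and the summary chosen (mean tariff per imputation) are illustrative assumptions, not output from the real data.

```r
# Toy stand-in for a.out$imputations: a named list of data frames.
# The numbers are invented for illustration only.
imps <- list(
  imp1 = data.frame(year = 1988:1992, tariff = c(30.1, 22.4, 41.3, 26.8, 31.0)),
  imp2 = data.frame(year = 1988:1992, tariff = c(33.6, 59.7, 41.3, 18.2, 31.0))
)

# Loop form: compute one summary value per imputation
mean_tariff <- numeric(length(imps))
names(mean_tariff) <- names(imps)
for (nm in names(imps)) {
  mean_tariff[nm] <- mean(imps[[nm]]$tariff)
}

# Equivalent apply form; vapply also checks that each result is one number
mean_tariff2 <- vapply(imps, function(x) mean(x$tariff), numeric(1))
```

With a real Amelia result you would replace imps with a.out$imputations; both forms return one value per imputation, named imp1, imp2, and so on.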

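Edit: I just re-read your question and see that you are actually looking for something more specific. You can adapt the above to subset each data frame with something like lapply(a.out$imputations, function(x) x[x$year > 1990, ]). Note that with the strict comparisons > and <, rows where year equals 1990 end up in neither subset; use >= or <= on one side if you want to keep them. I am not sure how you want to combine these imputed datasets (split by year greater/less than 1990), but if you simply want to append all the rows together, rbind() will do it (if not, let me know how you would like them combined and I may be able to recommend a solution):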

> df1 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year > 1990,]))
> df2 <- do.call(rbind,lapply(a.out$imputations,function(x) x[x$year < 1990,]))
> head(df1)
        year  country  tariff polity      pop   gdp.pc intresmi   signed fiveop     usheg
imp1.11 1991 SriLanka 26.9000      5 17247000 597.6987 2.285213 1.000000   12.8 0.2589872
imp1.12 1992 SriLanka 25.0000      5 17405000 618.3329 2.877877 0.515665   13.1 0.2623017
imp1.13 1993 SriLanka 24.2000      5 17628420 652.6205 4.280361 0.000000   13.2 0.2812928
imp1.14 1994 SriLanka 26.0000      5 17865000 680.0408 4.389912 0.000000   13.2 0.2783585
imp1.15 1995 SriLanka 20.0000      5 18112000 707.6591 3.995919 0.000000   13.2 0.2627195
imp1.16 1996 SriLanka 20.5646      5 18300000 727.0039 3.676763 0.000000   13.2 0.2681700
> head(df2)
       year  country   tariff polity      pop   gdp.pc intresmi signed fiveop     usheg
imp1.1 1981 SriLanka 30.56693      6 14988000 461.0236 1.937347      0   12.4 0.2593112
imp1.2 1982 SriLanka 22.39382      5 15189000 473.7634 1.964430      0   12.5 0.2558008
imp1.3 1983 SriLanka 41.30000      5 15417000 489.2266 1.663936      1   12.3 0.2655022
imp1.4 1984 SriLanka 26.81580      5 15599000 508.1739 2.797462      0   12.3 0.2988009
imp1.5 1985 SriLanka 31.00000      5 15837000 525.5609 2.259116      0   12.3 0.2952431
imp1.6 1986 SriLanka 17.76314      5 16117000 538.9237 1.832549      0   12.5 0.2886563
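
If you would rather keep the imputations separate after splitting, instead of stacking them right away, one possible variant is sketched below. It again uses a made-up list imps in place of a.out$imputations, assumes that 1990 belongs to the later period, and introduces a hypothetical column named imp to record where each stacked row came from; none of this comes from the Amelia API itself.

```r
# Toy stand-in for a.out$imputations (invented values, illustration only)
imps <- list(
  imp1 = data.frame(year = 1988:1992, tariff = c(30.1, 22.4, 41.3, 26.8, 31.0)),
  imp2 = data.frame(year = 1988:1992, tariff = c(33.6, 59.7, 41.3, 18.2, 31.0))
)

# Split each imputation at 1990; >= keeps the 1990 rows in the later period
early <- lapply(imps, function(x) x[x$year <  1990, ])
late  <- lapply(imps, function(x) x[x$year >= 1990, ])

# Optional: stack one period into a single data frame, tagging each row
# with the imputation it came from (column name "imp" is an assumption)
late_stacked <- do.call(
  rbind,
  Map(function(x, nm) cbind(imp = nm, x), late, names(late))
)
```

Here early and late are still lists with one data frame per imputation, so any analysis can be run on each imputation separately and the results combined afterwards, while late_stacked is the rbind() version with the source of every row kept.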
answered 2013-05-24T16:15:27.330