I am working with time series data. The data set is:
datALL <- read.table(header=TRUE, text="
ID Year Align
A01 2017 329
A01 2016 NA
A01 2015 NA
A01 2014 314
A01 2013 NA
A01 2012 NA
A01 2011 432
A02 2017 4536
A02 2016 NA
A02 2015 NA
A02 2014 2345
A02 2013 NA
A02 2012 NA
A02 2011 1932
")
datALL
ID Year Align
1 A01 2017 329
2 A01 2016 NA
3 A01 2015 NA
4 A01 2014 314
5 A01 2013 NA
6 A01 2012 NA
7 A01 2011 432
8 A02 2017 4536
9 A02 2016 NA
10 A02 2015 NA
11 A02 2014 2345
12 A02 2013 NA
13 A02 2012 NA
14 A02 2011 1932
I want to use the imputeTS package to impute the missing values. The package works well for a single ID.
datA01 <- read.table(header=TRUE, text="
ID Year Align
A01 2017 329
A01 2016 NA
A01 2015 NA
A01 2014 314
A01 2013 NA
A01 2012 NA
A01 2011 432
")
datA01
ID Year Align
1 A01 2017 329
2 A01 2016 NA
3 A01 2015 NA
4 A01 2014 314
5 A01 2013 NA
6 A01 2012 NA
7 A01 2011 432
### install.packages("imputeTS")
library(imputeTS)
datA01$Year <- ts(datA01[, c(2)])
datA01$Align1 <- na_kalman(datA01$Align)
datA01
ID Year Align Align1
1 A01 2017 329 329.0000
2 A01 2016 NA 318.9847
3 A01 2015 NA 312.7852
4 A01 2014 314 314.0000
5 A01 2013 NA 347.2150
6 A01 2012 NA 387.7720
7 A01 2011 432 432.0000
For A02 it also works perfectly:
datA02 <- read.table(header=TRUE, text="
ID Year Align
A02 2017 4536
A02 2016 NA
A02 2015 NA
A02 2014 2345
A02 2013 NA
A02 2012 NA
A02 2011 1932
")
datA02$Year <- ts(datA02[, c(2)])
datA02$Align1 <- na_kalman(datA02$Align)
datA02
ID Year Align Align1
1 A02 2017 4536 4536.000
2 A02 2016 NA 3510.613
3 A02 2015 NA 3168.817
4 A02 2014 2345 2345.000
5 A02 2013 NA 2485.226
6 A02 2012 NA 2143.431
7 A02 2011 1932 1932.000
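For the full data set it does not work, because na_kalman treats all 14 rows as one continuous time series instead of a separate 7-year series per ID. I need help writing a loop or function that imputes each ID separately.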
datALL$Year <- ts(datALL[, c(2)])
datALL$Align1 <- na_kalman(datALL$Align)
#### WRONG IMPUTATION DUE TO FAILURE IN SEPARATING YEARS BY ID
datALL
ID Year Align Align1
1 A01 2017 329 329.0000
2 A01 2016 NA 808.8287
3 A01 2015 NA 968.7716
4 A01 2014 314 314.0000
5 A01 2013 NA 1288.6573
6 A01 2012 NA 1448.6002
7 A01 2011 432 432.0000
8 A02 2017 4536 4536.0000
9 A02 2016 NA 1928.4289
10 A02 2015 NA 2088.3718
11 A02 2014 2345 2345.0000
12 A02 2013 NA 2408.2575
13 A02 2012 NA 2568.2004
14 A02 2011 1932 1932.0000
The correct output should look like this:
ID Year Align Align1
1 A01 2017 329 329.0000
2 A01 2016 NA 318.9847
3 A01 2015 NA 312.7852
4 A01 2014 314 314.0000
5 A01 2013 NA 347.2150
6 A01 2012 NA 387.7720
7 A01 2011 432 432.0000
8 A02 2017 4536 4536.000
9 A02 2016 NA 3510.613
10 A02 2015 NA 3168.817
11 A02 2014 2345 2345.000
12 A02 2013 NA 2485.226
13 A02 2012 NA 2143.431
14 A02 2011 1932 1932.000
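To make the goal concrete, here is a minimal sketch of the kind of per-ID imputation I am after. It assumes base R's ave() is an acceptable substitute for an explicit loop, and that each ID has enough non-missing values for na_kalman to fit its model (three per ID in this data). I have not confirmed that this is the recommended approach for imputeTS.

```r
library(imputeTS)

datALL <- read.table(header = TRUE, text = "
ID Year Align
A01 2017 329
A01 2016 NA
A01 2015 NA
A01 2014 314
A01 2013 NA
A01 2012 NA
A01 2011 432
A02 2017 4536
A02 2016 NA
A02 2015 NA
A02 2014 2345
A02 2013 NA
A02 2012 NA
A02 2011 1932
")

# ave() splits Align by ID, applies na_kalman to each group's values
# separately, and writes the results back in the original row order.
# as.numeric() keeps the column type consistent with the imputed values.
datALL$Align1 <- ave(as.numeric(datALL$Align), datALL$ID, FUN = na_kalman)

datALL
```

Because each group is passed to na_kalman in the same row order as in the single-ID examples above, the result should match the per-ID output. Note that na_kalman only receives the Align values here, so converting Year with ts() is not needed for this step.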