First, here is the solution I came up with. I'm a beginner, so any help improving my function is welcome:
library(xts)
library(timeSeries)
#' Fills the gaps of timeseries (timeSeries, xts) - ignores trailing and leading NAs.
#'
#' Suppose a timeseries object that looks as follows:
#'
#' 2017-01-01 1 10 NA NA NA 100000 1000000 NA
#' 2017-01-02 2 NA 200 2000 20000 200000 2000000 NA
#' 2017-01-03 3 NA 300 3000 NA NA 3000000 30000000
#' 2017-01-04 4 40 400 4000 40000 400000 4000000 NA
#' 2017-01-05 5 50 500 NA NA NA NA NA
#'
#' Leading and trailing NAs stay in place, whereas NAs
#' within the "data section" are filled by carrying the last observation forward (na.locf).
#' The result of the function call would be:
#'
#' 2017-01-01 1 10 NA NA NA 100000 1000000 NA
#' 2017-01-02 2 10 200 2000 20000 200000 2000000 NA
#' 2017-01-03 3 10 300 3000 20000 200000 3000000 30000000
#' 2017-01-04 4 40 400 4000 40000 400000 4000000 NA
#' 2017-01-05 5 50 500 NA NA NA NA NA
#'
#' @param ts_obj xts or timeSeries object to fill
#'
#' @return An object of the same class as the input (xts or timeSeries)
#' @export
#'
#' @examples
#' library(xts)
#' library(timeSeries)
#' test_matrix <- cbind(
#' c( 1, 2, 3, 4, 5),
#' c( 10, NA, NA, 40, 50),
#' c( NA, 200, 300, 400, 500),
#' c( NA, 2000, 3000, 4000, NA),
#' c( NA, 20000, NA, 40000, NA),
#' c( 100000, 200000, NA, 400000, NA),
#' c( 1000000, 2000000, 3000000, 4000000, NA),
#' c( NA, NA, 30000000, NA, NA)
#' )
#' dates <- as.Date('2017-01-01') + 0:4
#' test_xts <- xts(test_matrix, dates)
#' print(test_xts)
#' print(fill_ts(test_xts))
#'
#' test_ts <- as.timeSeries(test_xts)
#' print(fill_ts(test_ts))
fill_ts <- function(ts_obj) {
  # Forward fill: leading NAs stay NA (there is no earlier value to carry forward)
  filled_from_first_date <- na.locf(ts_obj, fromLast = FALSE)
  # Backward fill: trailing NAs stay NA (there is no later value to carry backward)
  filled_from_last_date <- na.locf(ts_obj, fromLast = TRUE)
  # Keep a forward-filled value only where both fills are non-NA
  filled_from_first_date[is.na(filled_from_first_date) | is.na(filled_from_last_date)] <- NA
  return(filled_from_first_date)
}
test_matrix <- cbind(
  c(      1,       2,        3,       4,   5),
  c(     10,      NA,       NA,      40,  50),
  c(     NA,     200,      300,     400, 500),
  c(     NA,    2000,     3000,    4000,  NA),
  c(     NA,   20000,       NA,   40000,  NA),
  c( 100000,  200000,       NA,  400000,  NA),
  c(1000000, 2000000,  3000000, 4000000,  NA),
  c(     NA,      NA, 30000000,      NA,  NA)
)
dates <- as.Date('2017-01-01') + 0:4
test_xts <- xts(test_matrix, dates)
print(test_xts)
result <- fill_ts(test_xts)
print(result)
test_ts <- as.timeSeries(test_xts)
result <- fill_ts(test_ts)
print(result)
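This function fills (xts, timeSeries) time series while leaving leading and trailing NAs untouched. It is even reasonably fast. Still, this is a standard problem, and I'm sure there is a standard (and hopefully more efficient) solution that I have not found.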
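To make the intended behaviour concrete without any packages, here is a minimal base R sketch of the same idea on a single plain numeric vector. The helper name `fill_interior` is hypothetical and not part of xts, zoo, or timeSeries; it is only an illustration of "fill interior gaps, keep leading and trailing NAs", not a replacement for `fill_ts`.

```r
# Hypothetical helper (illustration only): carry the last observation forward
# inside the data section of a plain vector, leaving leading/trailing NAs alone.
fill_interior <- function(x) {
  ok <- !is.na(x)
  if (!any(ok)) {
    return(x)                      # nothing observed, nothing to fill
  }
  # idx[i] = number of non-NA values seen up to position i (0 before the first one),
  # i.e. the position within x[ok] of the most recent observation
  idx <- cumsum(ok)
  filled <- x
  has_prev <- idx > 0
  filled[has_prev] <- x[ok][idx[has_prev]]   # forward fill; leading NAs untouched
  # Re-blank everything after the last real observation (trailing NAs)
  last_obs <- max(which(ok))
  filled[seq_along(x) > last_obs] <- NA
  filled
}

print(fill_interior(c(NA, 1, NA, NA, 4, NA)))
```

Applied to each column of a matrix (for example with `apply(m, 2, fill_interior)`), this should give the same pattern as the expected output in the documentation above, though I have only reasoned it through on small vectors.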
Apologies if this has already been asked and answered a thousand times; I could not find a suitable entry on Stack Overflow.