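Most models available in {fable} require the observations to be regular, and many also require that the data contain no gaps. One example of a model that supports irregular data is fable::TSLM().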
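The example data above would typically be considered "regular", but with gaps. This is because the data has a common interval of 1 month, yet some months are missing. Here is how you would produce a tsibble from this data: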
DF <- structure(list(station = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L), Time = structure(c(1L, 2L, 3L, 5L, 7L, 1L, 2L, 4L, 6L, 8L
), .Label = c("01-01-1974", "01-02-1974", "01-03-1974", "01-04-1974",
"01-05-1974", "01-06-1974", "01-07-1974", "01-08-1974"), class = "factor"),
WaterTemp = c(5, 5, 8.6000004, 8.1333332, 12.7999999, 5,
5, 8.6000004, 8.1333332, 12.7999999)), .Names = c("station",
"Time", "WaterTemp"), class = "data.frame", row.names = c(NA,
-10L))
# Fix $Time to a valid yearmonth index variable
library(tsibble)
library(dplyr)
DF <- DF %>%
  mutate(Time = yearmonth(as.Date(format(Time), format = "%d-%m-%Y")))
DF
#> station Time WaterTemp
#> 1 1 1974 Jan 5.000000
#> 2 1 1974 Feb 5.000000
#> 3 1 1974 Mar 8.600000
#> 4 1 1974 May 8.133333
#> 5 1 1974 Jul 12.800000
#> 6 2 1974 Jan 5.000000
#> 7 2 1974 Feb 5.000000
#> 8 2 1974 Apr 8.600000
#> 9 2 1974 Jun 8.133333
#> 10 2 1974 Aug 12.800000
# Create a 'regular' tsibble (with gaps)
as_tsibble(DF, key = "station", index = "Time")
#> # A tsibble: 10 x 3 [1M]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 May 8.13
#> 5 1 1974 Jul 12.8
#> 6 2 1974 Jan 5
#> 7 2 1974 Feb 5
#> 8 2 1974 Apr 8.60
#> 9 2 1974 Jun 8.13
#> 10 2 1974 Aug 12.8
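Before deciding how to handle the gaps, it can help to see where they are. The following is a minimal sketch, assuming the tsibble helpers has_gaps(), scan_gaps() and count_gaps(); the names `dat` and `ts_gaps` are hypothetical and simply rebuild the same data compactly so the snippet runs on its own.

```r
library(tsibble)

# Hypothetical compact rebuild of the example data (same months and values as DF)
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
dat <- tibble::tibble(
  station   = rep(1:2, each = 5),
  Time      = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = rep(c(5, 5, 8.6000004, 8.1333332, 12.7999999), times = 2)
)
ts_gaps <- as_tsibble(dat, key = station, index = Time)

has_gaps(ts_gaps)    # one row per station with a logical .gaps column
scan_gaps(ts_gaps)   # one row per missing station/month combination
count_gaps(ts_gaps)  # contiguous runs of missing months for each station
```

By default these helpers look for gaps within each station's own time span, so a station that simply ends earlier than another is not reported as having a trailing gap.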
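To fill the gaps in this dataset, similar to what is shown in the linked question, you can use the tsibble::fill_gaps() function. This makes the data compatible with models that support missing values but do not support gaps in the data, such as fable::ARIMA().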
# Create a 'regular' tsibble (with gaps) then complete the gaps
as_tsibble(DF, key = "station", index = "Time") %>%
  fill_gaps()
#> # A tsibble: 15 x 3 [1M]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 Apr NA
#> 5 1 1974 May 8.13
#> 6 1 1974 Jun NA
#> 7 1 1974 Jul 12.8
#> 8 2 1974 Jan 5
#> 9 2 1974 Feb 5
#> 10 2 1974 Mar NA
#> 11 2 1974 Apr 8.60
#> 12 2 1974 May NA
#> 13 2 1974 Jun 8.13
#> 14 2 1974 Jul NA
#> 15 2 1974 Aug 12.8
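Note that in the output above station 1 still ends in July, because gaps are filled within each station's own time span. If every station should cover the same overall period, one option is the `.full` argument of fill_gaps(). This is a sketch under that assumption; `dat` and `ts_full` are hypothetical names, and the data is rebuilt so the snippet is self-contained.

```r
library(tsibble)

# Hypothetical compact rebuild of the example data (same months and values as DF)
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
dat <- tibble::tibble(
  station   = rep(1:2, each = 5),
  Time      = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = rep(c(5, 5, 8.6000004, 8.1333332, 12.7999999), times = 2)
)

# .full = TRUE pads every station out to the full range of the data
# (1974 Jan to 1974 Aug), so station 1 gains an extra NA row for August
ts_full <- fill_gaps(as_tsibble(dat, key = station, index = Time), .full = TRUE)
ts_full
```

Whether the padded rows are appropriate depends on the model: they are only useful if the model can handle missing values at the ends of a series.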
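An irregular time series can be created using regular = FALSE. This is often useful if you are working with event data. In that case you would rarely want to fill the gaps, as there would be many.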
# Create an 'irregular' tsibble (no concept of gaps)
as_tsibble(DF, key = "station", index = "Time", regular = FALSE)
#> # A tsibble: 10 x 3 [!]
#> # Key: station [2]
#> station Time WaterTemp
#> <int> <mth> <dbl>
#> 1 1 1974 Jan 5
#> 2 1 1974 Feb 5
#> 3 1 1974 Mar 8.60
#> 4 1 1974 May 8.13
#> 5 1 1974 Jul 12.8
#> 6 2 1974 Jan 5
#> 7 2 1974 Feb 5
#> 8 2 1974 Apr 8.60
#> 9 2 1974 Jun 8.13
#> 10 2 1974 Aug 12.8
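The `[1M]` and `[!]` markers in the headers are the visible difference between the two objects. To check this programmatically, a small sketch using tsibble's is_regular() follows; `dat`, `regular_ts` and `irregular_ts` are hypothetical names, and the data is rebuilt so the snippet runs on its own.

```r
library(tsibble)

# Hypothetical compact rebuild of the example data (same months and values as DF)
months <- c(1, 2, 3, 5, 7, 1, 2, 4, 6, 8)
dat <- tibble::tibble(
  station   = rep(1:2, each = 5),
  Time      = yearmonth(as.Date(sprintf("1974-%02d-01", months))),
  WaterTemp = rep(c(5, 5, 8.6000004, 8.1333332, 12.7999999), times = 2)
)

regular_ts   <- as_tsibble(dat, key = station, index = Time)
irregular_ts <- as_tsibble(dat, key = station, index = Time, regular = FALSE)

is_regular(regular_ts)    # TRUE: a common 1 month interval is recorded
is_regular(irregular_ts)  # FALSE: no interval, so there is no concept of gaps
```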
Created on 2021-02-09 by the reprex package (v0.3.0)