I have a dataset with 5 IDs, spanning 01-01-2010 to 12-31-2013. I first split the data by ID, which gives me a list object. I then create another list that holds 10-day intervals, each labeled by ID.
I would like to nest those intervals into the first list of IDs, based on the ID that each interval element is labeled with.
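For example: the main list is made up of the IDs as elements. [A], [B], [C], etc. are the IDs, and [1], [2], [3] are the intervals nested within each of them. All of the intervals under [A] are for A, those under [B] are for B, those under [C] are for C, and so on.

    [A]
        [1]
        [2]
        [3]
    [B]
        [1]
        [2]
        [3]
    [C]
        [1]
        [2]
        [3]
    [D]
        [1]
        [2]
        [3]
    [E]
        [1]
        [2]
        [3]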
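The code below nests the intervals into the ID list, but it nests the intervals for all of the IDs rather than only the specific ID each one should be under.
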
    set.seed(12345)
    library(lubridate)
    library(tidyverse)

    date <- rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"), 500)
    ID <- rep(c("A", "B", "C", "D", "E"), 100)

    df <- data.frame(date = date,
                     x = runif(length(date), min = 60000, max = 80000),
                     y = runif(length(date), min = 800000, max = 900000),
                     ID)

    df_ID <- split(df, df$ID)

    df_nested <- lapply(df_ID, function(x){
      x %>%
        arrange(ID) %>%
        # Creates a new column assigning the first day in the 10-day interval in which
        # the date falls under (e.g., 01-01-2010 would be in the first 10-day interval
        # so the `floor_date` assigned to it would be 01-01-2010)
        mutate(new = floor_date(date, "10 days")) %>%
        # For any months that has 31 days, the 31st day would normally be assigned its
        # own interval. The code below takes the 31st day and joins it with the
        # previous interval.
        mutate(new = if_else(day(new) == 31, new - days(10), new)) %>%
        group_by(new, .add = TRUE) %>%
        group_split()
    })
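
To make the target shape concrete, here is a minimal base-R sketch of the nesting described above, run on a small made-up data frame rather than the real data. The helper `interval_start` and the toy rows are hypothetical and only for illustration; the helper mimics the interval rule from the code comments (intervals start on the 1st, 11th, and 21st, with the 31st folded into the interval starting on the 21st) without using lubridate.

```r
# Toy data: two IDs with a handful of dates (made up for illustration)
toy <- data.frame(
  date = as.Date(c("2010-01-01", "2010-01-05", "2010-01-12",
                   "2010-01-31", "2010-01-03", "2010-01-25")),
  ID   = c("A", "A", "A", "A", "B", "B")
)

# Hypothetical helper: first day of the 10-day interval a date falls in.
# Days 1-10 -> 1st, 11-20 -> 11th, 21-31 -> 21st (the 31st joins the last interval).
interval_start <- function(d) {
  day <- as.integer(format(d, "%d"))
  start_day <- pmin((day - 1) %/% 10 * 10 + 1, 21)
  as.Date(paste0(format(d, "%Y-%m-"), sprintf("%02d", as.integer(start_day))))
}

toy$new <- interval_start(toy$date)

# Outer split by ID, inner split by interval start: a list of lists
nested <- lapply(split(toy, toy$ID), function(x) split(x, format(x$new)))

names(nested)    # "A" "B"
names(nested$A)  # "2010-01-01" "2010-01-11" "2010-01-21"
names(nested$B)  # "2010-01-01" "2010-01-21"
```

Each element of `nested` is one ID, and each element inside it holds only the rows of that ID for one interval, which is the layout shown in the [A] / [1] [2] [3] outline.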