r - Delete data with gaps

Question

I want to delete the ids whose data have gaps between their min and max time period. Each id can start and end in any time period; that is fine. I just want to keep the ids that have no missing time periods between their min and max time.

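library(data.table)
set.seed(5)
data<-data.table(y=rnorm(100))
data[sample(1:100, 40),]<-NA
# 10 ids with 10 time periods each
id = rep(1:10, each = 10)
time = seq(1,10)
data2<-data.frame(id,time)
data2$row<-1:nrow(data2)
# drop a block of rows, then 5 more rows at random, to create gaps
data2a<-subset(data2,row<55|row>61 )
data3<-data2a[-sample(nrow(data2a), 5),]
data.table(data3)
# rows remaining per id (base R, no extra package needed)
table(data3$id)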

Here is a good example: group 1 should be deleted, but group 6, for example, should not.

score 2 · Accepted Answer

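The condition you want to filter on is that there are no gaps greater than 1 in time. diff(time) gives you the gaps, so all(diff(time) == 1) checks that condition.

So you can do: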

library(dplyr)
data3 %>%
    group_by(id) %>%
    filter(all(diff(time) == 1))

In data.table, one solution (which does the same thing) is:

setDT(data3)[, .SD[all(diff(time) == 1)], id]
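
To see why the check works, here is a minimal base R sketch on a made-up table. The names toy, steps, ok, complete_ids and toy_complete are illustrative and not from the question. It assumes time is sorted in increasing order within each id; if that is not guaranteed, sort first.

```r
# Hypothetical toy data: id 1 has a gap (time 3 is missing), id 2 does not.
toy <- data.frame(id   = c(1, 1, 1, 2, 2, 2),
                  time = c(1, 2, 4, 5, 6, 7))

# diff() gives the step between consecutive times within each id.
steps <- lapply(split(toy$time, toy$id), diff)

# An id is gap-free when every step equals 1.
# all() of an empty vector is TRUE, so ids with a single row are kept.
ok <- vapply(steps, function(d) all(d == 1), logical(1))
complete_ids <- names(ok)[ok]

# Keep only the rows that belong to gap-free ids.
toy_complete <- toy[as.character(toy$id) %in% complete_ids, ]
```

Here id 1 has steps 1 and 2, so it is dropped, while id 2 has steps 1 and 1 and is kept.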

score 0 · Accepted Answer

Using dplyr:

library(dplyr)
data3 %>% group_by(id) %>%
          filter(identical(time, seq(first(time), last(time))))
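
One caveat with this approach: identical() compares types as well as values. In the question, time comes from seq(1, 10), which yields an integer vector, so the comparison behaves as intended. If time were stored as double, the check would likely reject even gap-free ids. A small base R sketch of the difference (the vector names are illustrative):

```r
t_int <- 1:3          # integer, like the time column built in the question
t_dbl <- c(1, 2, 3)   # the same values stored as double

# seq(from, to) without `by` is equivalent to from:to, which is integer
# for whole-number endpoints, so only the integer vector matches exactly.
same_int <- identical(t_int, seq(t_int[1], t_int[3]))   # TRUE
same_dbl <- identical(t_dbl, seq(t_dbl[1], t_dbl[3]))   # FALSE

# The diff()-based check does not depend on the storage type.
gap_free_dbl <- all(diff(t_dbl) == 1)                   # TRUE
```

If the type of time is not known in advance, the diff()-based condition is the safer of the two.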
