If I modify the NA_run
function from the question you linked to take a variable named v
instead of value
and return the boolean rather than the data.frame:
NA_run <- function(x, maxlen){
runs <- rle(is.na(x$v))
any(runs$lengths[runs$values] >= maxlen)
}
I can then write a wrapper function to call it twice for daytime and nighttime:
dropfun <- function(x) {
dt <- x$hr > 7 & x$hr < 19
daytime <- NA_run(x[dt,], 4)
nighttime <- NA_run(x[!dt,], 3)
any(daytime, nighttime)
}
Which gives me a data.frame of days to drop.
> ddply(A, .(day), dropfun)
day V1
1 1 TRUE
2 2 FALSE
>
We can alter the dropfun
to return the dataframe instead though:
dropfun <- function(x) {
dt <- x$hr > 7 & x$hr < 19
daytime <- NA_run(x[dt,], 4)
nighttime <- NA_run(x[!dt,], 3)
if(any(daytime, nighttime)) NULL else x
}
> ddply(A, .(day), dropfun)
day hr v
1 2 0 NA
2 2 1 NA
3 2 2 2.54899107
4 2 3 NA
5 2 4 -0.03476039
6 2 5 NA
7 2 6 0.65658846
8 2 7 0.95949406
9 2 8 NA
10 2 9 1.08444118
11 2 10 0.95949406
12 2 11 NA
13 2 12 -1.80603126
14 2 13 NA
15 2 14 NA
16 2 15 0.97291675
17 2 16 NA
18 2 17 NA
19 2 18 NA
20 2 19 -0.29429386
21 2 20 0.87820363
22 2 21 NA
23 2 22 0.56305582
24 2 23 -0.11028549
>