r - Make the "week" function biweekly

Question

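Hi, this should be a simple question, but I can't seem to figure it out. I want to break this dataset up into two-week bins so that I can look at the annual cycle at 2-week intervals. I don't want to summarize or aggregate the data. I want to do exactly what the "week" function does, but biweekly. Below is a sample of the data and code. Any help would be greatly appreciated!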

 DF<-dput(head(indiv))
    structure(list(event.id = 1142811808:1142811813, timestamp = structure(c(1323154800, 
    1323200450, 1323202141, 1323203545, 1323208151, 1323209966), class = c("POSIXct", 
    "POSIXt"), tzone = "UTC"), argos.altitude = c(43, 43, 39, 43, 
    44, 42), argos.best.level = c(0, -136, -128, -136, -126, -137
    ), argos.calcul.freq = c(0, 676813.1, 676802.4, 676813.1, 676810, 
    676811.8), argos.lat1 = c(43.857, 43.916, 43.87, 43.89, 43.891, 
    43.89), argos.lat2 = c(43.857, 35.141, 49.688, 35.254, 40.546, 
    54.928), argos.lc = structure(c(7L, 6L, 2L, 3L, 4L, 3L), .Label = c("0", 
    "1", "2", "3", "A", "B", "G", "Z"), class = "factor"), argos.lon1 = c(-77.244, 
    -77.326, -77.223, -77.21, -77.208, -77.21), argos.lon2 = c(-77.244, 
    -121.452, -46.86, -118.496, -94.12, -16.159), argos.nb.mes.identical = c(0L, 
    2L, 6L, 4L, 5L, 6L), argos.nopc = c(0L, 1L, 2L, 3L, 4L, 4L), 
        argos.sensor.1 = c(0L, 149L, 194L, 1L, 193L, 193L), argos.sensor.2 = c(0L, 
        220L, 216L, 1L, 216L, 212L), argos.sensor.3 = c(0L, 1L, 1L, 
        0L, 3L, 1L), argos.sensor.4 = c(0L, 1L, 5L, 1L, 5L, 5L), 
        tag.local.identifier = c(112571L, 112571L, 112571L, 112571L, 
        112571L, 112571L), utm.easting = c(319655.836066914, 313250.096346666, 
        321382.422921619, 322486.41178559, 322650.029658403, 322486.41178559
        ), utm.northing = c(4858437.89950188, 4865173.18448801, 4859836.18321128, 
        4862029.54057323, 4862136.31345349, 4862029.54057323), utm.zone = structure(c(7L, 
        7L, 7L, 7L, 7L, 7L), .Label = c("12N", "13N", "14N", "15N", 
        "16N", "17N", "18N", "19N", "20N", "22N", "39N"), class = "factor"), 
        study.timezone = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Eastern Daylight Time", 
        "Eastern Standard Time"), class = "factor"), study.local.timestamp = structure(c(1323154800, 
        1323200450, 1323202141, 1323203545, 1323208151, 1323209966
        ), class = c("POSIXct", "POSIXt"), tzone = "")), row.names = 1120:1125, class = "data.frame")

   weeknumber<-week(timestamps(DF))

score 0 · Accepted Answer

As I said in my comment on your previous (since-deleted) question, use seq.Date together with either cut or findInterval.

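I'll create a vector of "every other Monday" starting on January 3, 2011 (the first Monday of that year). The start is arbitrary, but make sure that it (1) falls on a day that is meaningful to you, (2) starts at a point before your earliest data, and (3) uses a length.out= that reaches beyond your latest data.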

every_other_monday <- seq(as.Date("2011-01-03"), by = "14 days", length.out = 26)
every_other_monday
#  [1] "2011-01-03" "2011-01-17" "2011-01-31" "2011-02-14" "2011-02-28" "2011-03-14" "2011-03-28" "2011-04-11" "2011-04-25"
# [10] "2011-05-09" "2011-05-23" "2011-06-06" "2011-06-20" "2011-07-04" "2011-07-18" "2011-08-01" "2011-08-15" "2011-08-29"
# [19] "2011-09-12" "2011-09-26" "2011-10-10" "2011-10-24" "2011-11-07" "2011-11-21" "2011-12-05" "2011-12-19"
every_other_monday[ findInterval(as.Date(DF$timestamp), every_other_monday) ]
# [1] "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05" "2011-12-05"

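(Starting on January 3 assumes that your real data spans a longer period. You don't need a full year of biweekly dates in every_other_monday, and they don't need to be Mondays; any base date you choose will do. As long as the vector includes at least one date before and one date after the actual DF dates, you should be covered.)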
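The cut route mentioned at the top of this answer is not shown, so here is a minimal sketch of it, next to findInterval used as a biweek counter (closer to what week() returns). The three sample dates and length.out = 30 are assumptions made for illustration; they are not taken from the question's data.

```r
# Biweekly grid as in the answer, extended (length.out = 30 is an assumption)
# so that it also covers the illustrative dates in early 2012.
every_other_monday <- seq(as.Date("2011-01-03"), by = "14 days", length.out = 30)

# Illustrative dates, not taken from the question's data.
d <- as.Date(c("2011-12-06", "2011-12-20", "2012-01-03"))

# cut() with a Date vector of breaks labels each date with the start of its
# two-week bin; dates outside the grid would become NA.
biweek_start <- as.Date(as.character(cut(d, breaks = every_other_monday)))

# findInterval() returns the position of the bin in the grid, which can serve
# as a running biweek number that does not reset at the new year.
biweek_number <- findInterval(d, every_other_monday)

data.frame(date = d, biweek_start, biweek_number)
```

Every row keeps its own label, so nothing is summarized or dropped; the new columns can be used for grouping later.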

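Alternative: round down to the week level, then shift back by one week those weeks whose day-of-year (the "Julian day", yday below) is odd. (I chose the Julian-day modulus to reduce the chance that the binning changes with slight changes in the data range.)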

weeks <- lubridate::floor_date(as.Date(DF$timestamp), unit = "weeks")
weeks
# [1] "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04" "2011-12-04"
isodd <- as.POSIXlt(weeks)$yday %% 2 == 1
weeks[isodd] <- weeks[isodd] - 7L
weeks # technically, now "biweeks"
# [1] "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27" "2011-11-27"
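One caveat with the parity approach: day-of-year restarts every January, so its parity can fail to alternate between the last week of one year and the first week of the next. A different technique that avoids this is to count whole days from a fixed origin and integer-divide by 14. The sketch below is not the method used above; the origin and the sample dates are assumptions chosen for illustration.

```r
# Assumed reference Monday; any date on or before the earliest observation works.
origin <- as.Date("2011-01-03")

# Illustrative dates spanning a year boundary (not from the question's data).
d <- as.Date(c("2011-12-06", "2011-12-20", "2012-01-03"))

# Whole days since the origin, then integer-divide by 14 for a 0-based bin.
days_since <- as.numeric(d - origin, units = "days")
biweek_index <- days_since %/% 14

# Map each bin back to the date on which it starts.
biweek_start <- origin + 14 * biweek_index

data.frame(date = d, biweek_index, biweek_start)
```

Because the bins depend only on the origin, adding or removing rows does not move any existing row into a different bin.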

score 0 · Accepted Answer

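I don't use lubridate, but here is a base R solution that subsets your data every two weeks. We keep a row if its week number, taken as a number modulo 2, is non-zero and its year-week combination has not already appeared. All of it is done with strftime.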

res <- DF[as.numeric(strftime(DF$timestamp, "%U")) %% 2 != 0 & 
            !duplicated(strftime(DF$timestamp, "%U %y")), ]
res
#               timestamp           x
# 1   2011-12-06 01:00:00  0.73178884
# 13  2011-12-18 01:00:00 -0.19310018
# 27  2012-01-01 01:00:00  1.13017531
# 41  2012-01-15 01:00:00  1.06546084
# 55  2012-01-29 01:00:00 -0.16664011
# 69  2012-02-12 01:00:00 -1.86596108
# 83  2012-02-26 01:00:00  0.59200189
# 97  2012-03-11 01:00:00  1.08327366
# 111 2012-03-25 01:00:00 -0.71291090
# 125 2012-04-08 02:00:00  0.51984052
# 139 2012-04-22 02:00:00  0.32738506
# 153 2012-05-06 02:00:00  2.50837829
# 167 2012-05-20 02:00:00  0.75116168
# 181 2012-06-03 02:00:00 -0.56359736
# 195 2012-06-17 02:00:00  0.60658448
# 209 2012-07-01 02:00:00 -0.07242813
# 223 2012-07-15 02:00:00  0.13811301
# 237 2012-07-29 02:00:00  0.19454153
# 251 2012-08-12 02:00:00  0.23119092
# 265 2012-08-26 02:00:00 -0.97278351
# 279 2012-09-09 02:00:00 -1.18143276
# 293 2012-09-23 02:00:00 -0.43294048
# 307 2012-10-07 02:00:00  0.05664472
# 321 2012-10-21 02:00:00 -0.90725782
# 335 2012-11-04 01:00:00  0.78939068
# 349 2012-11-18 01:00:00 -0.46047924
# 363 2012-12-02 01:00:00  1.45941339

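Check by taking differences. The first selected row (2011-12-06) is simply where the data starts, not the start of a week, so it is skipped here.

## check
diff(res$timestamp[-1])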
# Time differences in days
# [1] 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14 14
# [21] 14 14 14 14 14

Data:

DF <- data.frame(timestamp=as.POSIXct(seq(as.Date("2011-12-06"), as.Date("2012-12-06"), "day")),
                 x=rnorm(367))
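A note on reproducibility: strftime formats in the session's local time zone unless tz is given, while as.POSIXct on a Date yields midnight UTC, so the week numbers in the subset above can shift by a day depending on where the code runs. The sketch below repeats the same %U-based selection with an explicit tz = "UTC"; the shorter date range and the use of %Y instead of %y are choices made here for illustration.

```r
# A shorter illustrative range than the answer's data.
ts <- as.POSIXct(seq(as.Date("2011-12-06"), as.Date("2012-01-15"), by = "day"))

# Week of the year (Sunday start) and a year-week key, both computed in UTC
# so the result does not depend on the machine's time zone.
wk  <- as.integer(strftime(ts, "%U", tz = "UTC"))
key <- strftime(ts, "%U %Y", tz = "UTC")

# Keep the first timestamp of every odd-numbered week.
picked <- ts[wk %% 2 != 0 & !duplicated(key)]
format(picked, "%Y-%m-%d", tz = "UTC")
```

Week numbers restart each January, so the 14-day spacing is only expected to hold within a year; at some year boundaries two selected rows can end up 7 days apart.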

score 0 · Accepted Answer

See the example below. The function uses which.max and sapply to round the date variable to the closest Sunday on a two-week grid.

library(lubridate)

## Create Data Frame
DF <- data.frame(timestamp=as.POSIXct(seq(as.Date("2011-12-06"), as.Date("2012-12-06"), "day")))

## Create two week intervals (change the start date if you don't want to start on Sundays)
every_other_sunday <- seq(as.Date("2011-12-18"), by = "14 days", length.out = 27)

## Make the date variable
DF$date <- as.Date(DF$timestamp)

## Function to find the closest Sunday from the intervals created above
find_closest_sunday <- function(index){
  which.max(abs(every_other_sunday - DF$date[index] - 7) <= min(abs(every_other_sunday - DF$date[index] - 7)))
}

## Add the new variable to your dataset
DF$every_two_weeks <- every_other_sunday[sapply(seq_along(DF$date), function(i) find_closest_sunday(i))]

## Check that the function worked correctly
DF[,c("date", "every_two_weeks")]

## If you want the week number instead of a date, wrap the every_two_weeks variable in the week() function

week(DF$every_two_weeks)
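As far as I can tell, the "- 7" offset in find_closest_sunday means each date is assigned the first grid Sunday on or after it. Under that reading, the row-by-row sapply can be replaced by a single findInterval call. The sketch below is an assumption-based rewrite, not the answer's own code, and it cross-checks itself against the distance rule.

```r
every_other_sunday <- seq(as.Date("2011-12-18"), by = "14 days", length.out = 27)
dates <- seq(as.Date("2011-12-06"), as.Date("2012-12-06"), by = "day")

# Vectorized lookup: index of the first grid Sunday that is >= each date.
# left.open = TRUE makes a date that is itself a grid Sunday map to that Sunday.
idx <- findInterval(dates, every_other_sunday, left.open = TRUE) + 1L
every_two_weeks <- every_other_sunday[idx]

# Cross-check against the distance-based rule, one date at a time.
by_distance <- sapply(dates, function(d) {
  which.min(abs(as.numeric(every_other_sunday) - as.numeric(d) - 7))
})
stopifnot(identical(as.integer(by_distance), idx))
```

This only holds while every date lies on or before the last grid Sunday; later dates would yield NA here, whereas the distance rule would clamp them to the last Sunday.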
