I want to pass in a vector of dates and get back the closest date from a second vector of (partially matching) dates.

The function below does what I need for a single date, but I can't work out how to generalise it to the case where searchDate is a vector of dates.

closestDate <- function(searchDate, dateList, roundDown=FALSE){
  if (roundDown) {
    dist2date <- as.Date(dateList) - as.Date(searchDate)
    closest <- which(max(dist2date[dist2date<=0]) == dist2date)
  } else {
    dist2date <- as.Date(dateList) - as.Date(searchDate)
    closest <- which(min(dist2date[dist2date>=0]) == dist2date)
  }
  return(dateList[closest])
}

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]

closestDate('2012-12-14', oddDates)
[1] "2012-12-15"

miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100 )
closestDate(miscDatesLong, oddDates)
[1] "2012-12-15" "2012-12-17" "2012-12-19"
Warning message:
In unclass(time1) - unclass(time2) :
  longer object length is not a multiple of shorter object length

Can anyone help?

6 Answers

The findInterval function can do this quickly:

dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]

oddDates[ findInterval(as.Date('2012-12-14'), oddDates)+1 ]

miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100)

oddDates[ findInterval(as.Date(miscDatesLong), oddDates) + 1 ]

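To round down instead of up, drop the + 1. If you really want the closest date, rather than the date before or after, you can build a new vector of dates that are the midpoints of the intervals ( as.Date(rowMeans(embed(as.numeric(oddDates), 2)), origin = '1970-01-01') ) and use findInterval on those. See the arguments of findInterval (?findInterval) for other options.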
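
As a rough sketch of the midpoint idea, assuming the candidate dates are unique: the helper name nearestByMidpoint is made up for illustration and is not part of the answer above. It sorts the candidates, takes the midpoints between neighbours, and lets findInterval pick the bin. A search date that falls exactly on a midpoint goes to the later candidate, and dates outside the range snap to the first or last candidate instead of giving NA.

```r
# Candidate dates: every other day, as in the question.
dateSeq  <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by = "day")
oddDates <- dateSeq[seq_along(dateSeq) %% 2 == 1]

# Hypothetical helper (name and signature are assumptions for this sketch).
nearestByMidpoint <- function(searchDate, dateList) {
  dateList <- sort(as.Date(dateList))
  nums <- as.numeric(dateList)
  # Midpoints between consecutive candidates split the time axis into bins,
  # one bin per candidate date.
  mids <- (head(nums, -1) + tail(nums, -1)) / 2
  # findInterval() counts the midpoints <= x (0 if there are none),
  # so adding 1 gives the index of the nearest candidate.
  dateList[findInterval(as.numeric(as.Date(searchDate)), mids) + 1]
}

nearestByMidpoint(c("2012-12-15", "2012-12-16", "2010-06-01"), oddDates)
# [1] "2012-12-15" "2012-12-17" "2011-01-01"
```

By contrast, the plain findInterval(...) + 1 lookup above maps a search date that is already in oddDates to the next candidate rather than to itself, which may or may not be what you want.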

answered 2012-12-19T17:19:32.217

?Vectorize

> closestDateV = Vectorize(closestDate,"searchDate")
> closestDateV(c('2012-12-15','2012-12-14'), oddDates)
2012-12-15 2012-12-14 
     15689      15689 

The returned values have lost their Date class, so put it back:

> as.Date(closestDateV(c('2012-12-15','2012-12-14'), oddDates),origin="1970-01-01")
  2012-12-15   2012-12-14 
"2012-12-15" "2012-12-15" 

You'll probably want to wrap all of that up in a new function.
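
One way such a wrapper might look, as a sketch only: closestDateVec is a made-up name, closestDate is the single-date function from the question, and the code assumes dateList holds unique dates with at least one match for every search date (otherwise mapply cannot simplify the result to a plain vector).

```r
# Single-date lookup, copied from the question.
closestDate <- function(searchDate, dateList, roundDown = FALSE) {
  dist2date <- as.Date(dateList) - as.Date(searchDate)
  if (roundDown) {
    closest <- which(max(dist2date[dist2date <= 0]) == dist2date)
  } else {
    closest <- which(min(dist2date[dist2date >= 0]) == dist2date)
  }
  dateList[closest]
}

# Hypothetical wrapper: vectorise over searchDate, then restore the Date class.
closestDateVec <- function(searchDate, dateList, roundDown = FALSE) {
  v <- Vectorize(closestDate, "searchDate")
  out <- v(searchDate, dateList, roundDown)
  # The simplified result is a named numeric vector: drop the names and
  # convert the day counts back to Dates.
  as.Date(unname(out), origin = "1970-01-01")
}

dateSeq  <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by = "day")
oddDates <- dateSeq[seq_along(dateSeq) %% 2 == 1]

closestDateVec(c("2012-12-15", "2012-12-14"), oddDates)
# [1] "2012-12-15" "2012-12-15"
closestDateVec(c("2012-12-15", "2012-12-14"), oddDates, roundDown = TRUE)
# [1] "2012-12-15" "2012-12-13"
```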

Functional programming is fun!

answered 2012-12-19T10:33:13.780

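Now that there is an example: for each target date being examined, just work with the subset of dates that are on or before it in one case, or on or after it in the other.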

closestDt <- function(searchDate, dateList, roundDown=FALSE) 
     as.Date( sapply( searchDate , function (x) if( roundDown ){ 
                # latest date on or before x
                max( dateList[ dateList <= x ] ) } else {
                # earliest date on or after x
                min( dateList[ dateList >= x])  } 
           ), origin = "1970-01-01")
answered 2012-12-19T06:36:33.560

# initiate a tie-breaking function
tie.breaker <-
    function( x , y , la = FALSE ){

        # if look.after is TRUE, eliminate all values below x
        # otherwise, eliminate all values above x
        if ( la ) y[ y < x ] <- NA else y[ y > x ] <- NA

        # then among the remaining values, figure out the date the shortest distance away
        z <- which.min( abs( x - y ) )[1]
        # use [1] to just take the first result, in case y contains duplicate dates

        # return z
        return( z )
    }

# initiate your main function
closestDate <- 
    function( searchDate , dateList , look.after = FALSE ){

        # apply a which.min( abs( ) ) command to each of the dates given, 
        # across every date in the larger list
        dist2date <- 
            sapply( 

                # on every element of searchDate..
                as.Date( searchDate ) ,

                # ..run the tie.breaker() function
                tie.breaker , 

                # and each time, pass in the dateList
                as.Date( dateList ) ,

                # and also the look.after TRUE/FALSE flag
                look.after
            )

        # return the matching dates in the same order as passed in
        dateList[ dist2date ]
    }

# try with two input dates
searchDate <- c( '2012-12-14' , '2012-11-18' )

# create a few dates to test against..
someDates <- c( '2012-11-12' ,  '2012-11-17' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' , '2012-11-20' )

# return the two dates closest to the inputted dates

# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )

# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )

# reorder the test dates to show it still works
someDates <- c( '2012-11-12' , '2012-11-17' , '2012-12-13' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' )

# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )

# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )
answered 2012-12-19T07:00:37.983

You can use cut:

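nearestDate <- function(dates, datesToMatch)
{
  dtm <- sort(datesToMatch)
  # midpoints between consecutive candidate dates
  dtmMid <- dtm[-length(dtm)] + diff(dtm)/2
  # bin each date by those midpoints and label each bin with its candidate
  as.Date(cut(dates,
              breaks=c(as.Date("1970-01-01"), dtmMid, as.Date("2100-01-01")),
              labels=dtm))
}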

dates1 <- as.Date(c("2012-02-14","2012-06-23","2012-08-27","2012-12-01"))
dates2 <- as.Date(c("2012-04-01","2012-10-31","2012-12-25"))
nearestDate(dates1,dates2)
[1] "2012-04-01" "2012-04-01" "2012-10-31" "2012-12-25"

Note that I had to pick some magic dates for the endpoints in the cut call, because it won't accept +/-Inf. Adjust them to suit your use case.
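
If the fixed endpoints are a worry, one possible variation (a sketch, not something from the answer above) is to derive them from the data, padding the overall range by a day on each side. The name nearestDate2 is made up; unique() is added on the assumption that datesToMatch might contain repeats, since duplicate breaks make cut fail.

```r
# Hypothetical variant of nearestDate with data-driven endpoints.
nearestDate2 <- function(dates, datesToMatch) {
  dates <- as.Date(dates)
  dtm <- sort(unique(as.Date(datesToMatch)))
  # midpoints between consecutive candidate dates
  dtmMid <- dtm[-length(dtm)] + diff(dtm) / 2
  # endpoints one day outside everything seen, instead of magic dates
  lo <- min(dates, dtm) - 1
  hi <- max(dates, dtm) + 1
  as.Date(cut(dates, breaks = c(lo, dtmMid, hi), labels = dtm))
}

dates1 <- as.Date(c("2012-02-14", "2012-06-23", "2012-08-27", "2012-12-01"))
dates2 <- as.Date(c("2012-04-01", "2012-10-31", "2012-12-25"))
nearestDate2(dates1, dates2)
# [1] "2012-04-01" "2012-04-01" "2012-10-31" "2012-12-25"
```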

answered 2012-12-19T13:06:40.450

I think this is what you want:

closestDate <- function(searchDate, dateList, roundDown=FALSE) {
  as.Date(sapply(as.Date(searchDate), function(x){
    # absolute distance from x to every candidate date
    dist <- abs(x - as.Date(dateList))
    closest <- dateList[which(min(dist) == dist)]
    # roundDown only breaks ties between equally close dates
    return(ifelse(roundDown, min(closest), max(closest)))
  }), origin="1970-01-01")
}

sapply is your friend. You just need to make sure you return a Date rather than an integer.
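
To illustrate that last point with a small standalone sketch (the variable names are arbitrary): sapply simplifies its result to a plain numeric vector of day counts, so the Date class has to be restored by hand.

```r
d <- as.Date(c("2012-12-14", "2012-12-16"))

# Each call returns a Date, but sapply() drops the class when simplifying.
raw <- sapply(d, function(x) x + 1)
class(raw)
# [1] "numeric"

# Convert the day counts (days since 1970-01-01) back to Dates.
fixed <- as.Date(raw, origin = "1970-01-01")
fixed
# [1] "2012-12-15" "2012-12-17"
```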

answered 2012-12-19T14:54:22.140