I'm working with a data set and imputing NAs for times. In the simplified example below, I create a new column that holds the original data plus imputed values for the NAs (i.e., the mean time of day). The code works fine, but I'm weak with dates, so I was wondering whether there is an easier way to calculate the mean time of day from date/time values?

arrivals <- data.frame(
  ships=c("Glory","Discover","Intrepid","Enchantment","Summit"),
  times=c("8:00","10:00","11:42",NA,"9:20"), stringsAsFactors=FALSE)
sumtime <- sapply(strsplit(as.character(arrivals$times),":"),
  function(x) as.numeric(x[1])*60 + as.numeric(x[2]))
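# Mean arrival time in minutes since midnight
avgmin <- mean(sumtime, na.rm=TRUE)
# Zero-pad the minutes so that, e.g., 9 h 5 min prints as "9:05" rather than "9:5"
avgtime <- sprintf("%d:%02d", as.integer(avgmin %/% 60),
  as.integer(trunc(avgmin %% 60)))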
arrivals$times2 <- arrivals$times
arrivals$times2[is.na(arrivals$times)] <- avgtime

1 Answer

You can use the chron package to convert the times column into a numeric representation that can be averaged:

library(chron)
Arrivals <- arrivals[,c("ships","times")]
# Will give some warnings due to the missing value
Arrivals$times <- chron(times.=paste(Arrivals$times, ":00", sep=""))
# Replace the missing time with the mean of the observed times
Arrivals$times[is.na(Arrivals$times)] <- mean(Arrivals$times,na.rm=TRUE)
Arrivals
        ships    times
1       Glory 08:00:00
2    Discover 10:00:00
3    Intrepid 11:42:00
4 Enchantment 09:45:30
5      Summit 09:20:00
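
If adding a package is not an option, the same idea (turn each time into a number, average, convert back) can be sketched in base R. This is only an illustration, not part of the original answer: the helper names `to_minutes` and `to_hms` and the `times2` output format are assumptions made here, and the input is assumed to always be "H:MM" strings or NA.

```r
# Base-R sketch: impute a missing time of day with the mean of the others.
arrivals <- data.frame(
  ships = c("Glory", "Discover", "Intrepid", "Enchantment", "Summit"),
  times = c("8:00", "10:00", "11:42", NA, "9:20"),
  stringsAsFactors = FALSE)

# Hypothetical helper: "H:MM" -> minutes since midnight (NA stays NA)
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
         numeric(1))
}

# Hypothetical helper: minutes since midnight -> "HH:MM:SS"
to_hms <- function(m) {
  secs <- round(m * 60)
  sprintf("%02d:%02d:%02d",
          as.integer(secs %/% 3600),
          as.integer((secs %% 3600) %/% 60),
          as.integer(secs %% 60))
}

mins <- to_minutes(arrivals$times)
# Fill the missing entry with the mean of the observed times
mins[is.na(mins)] <- mean(mins, na.rm = TRUE)
arrivals$times2 <- to_hms(mins)
print(arrivals)
```

The imputed value for Enchantment comes out as 09:45:30, matching the chron result above. Note that a plain arithmetic mean like this only makes sense when the times do not wrap around midnight.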
answered 2012-06-12T22:29:50.377