
I have a large dataset that looks like this:

Time,Volume    
1996-02-05 00:34:00,0.01
1996-02-05 00:51:00,0.01
1996-02-05 00:52:00,0.01
1996-02-05 01:04:00,0.01
1996-02-05 01:19:00,0.01
1996-02-05 05:00:00,0.01
1996-02-05 05:07:00,0.01
1996-02-05 05:08:00,0.01
1996-02-05 05:14:00,0.01

I want to sum the Volume column over each 30-minute interval. This is what I have tried:

z <- read.zoo("precip.csv", header = TRUE, sep = ",", FUN = as.chron)
half_hour <- period.apply(z, endpoints(z, "minutes", 30), length)

Which returns:

Time,Volume
02/05/96 00:52:00,3
02/05/96 01:19:00,2
02/05/96 05:14:00,4

I am trying to get the output to look like this:

Time,Volume
02/05/96 00:29:00,0
02/05/96 00:59:00,3
02/05/96 01:29:00,2
02/05/96 01:59:00,0
02/05/96 02:29:00,0
02/05/96 02:59:00,0

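...and so on.

Alternatively, I think it would work if I could fill in the original dataset so that every minute is accounted for (with missing Volume values set to 0).

I found this post, but could not get it to work.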

> z_xts<- xts(precip[,c("Volume")],precip[,"Time"])
Error in xts(precip[, c("Volume")], precip[, "Time"]) : 
  order.by requires an appropriate time-based object
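
This error usually means the object passed as `order.by` is not a date-time class. A minimal sketch of the likely fix, assuming `precip` was loaded with `read.csv` so that `Time` is still character (the small `precip` data frame below is a hypothetical stand-in for the real file):

```r
library(xts)

# Hypothetical stand-in for the data frame read from precip.csv
precip <- data.frame(
  Time = c("1996-02-05 00:34:00", "1996-02-05 00:51:00", "1996-02-05 00:52:00"),
  Volume = c(0.01, 0.01, 0.01),
  stringsAsFactors = FALSE
)

# Convert the character timestamps to POSIXct before building the xts object
precip$Time <- as.POSIXct(precip$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
z_xts <- xts(precip$Volume, order.by = precip$Time)
```

The time zone `"UTC"` is an assumption; use whichever zone the timestamps were recorded in.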

2 Answers


This should do what you want:

library(xts)
x <- as.xts(read.zoo(text="Time,Volume    
1996-02-05 00:34:00,0.01
1996-02-05 00:51:00,0.01
1996-02-05 00:52:00,0.01
1996-02-05 01:04:00,0.01
1996-02-05 01:19:00,0.01
1996-02-05 05:00:00,0.01
1996-02-05 05:07:00,0.01
1996-02-05 05:08:00,0.01
1996-02-05 05:14:00,0.01",
sep=",", FUN=as.POSIXct, header=TRUE, drop=FALSE))

# 1) Create POSIXct sequence from midnight of the first day
#    until the last observation
midnightDay1 <- as.POSIXct(format(start(x),"%Y-%m-%d"))
timesteps <- seq(midnightDay1, end(x), by="30 min")
# 2) Make a copy of your object and set all values for Volume to 1
y <- x
y$Volume <- 1
# 3) Merge the copy with a zero-column xts object that has an index
#    with all the values you want.  Fill missing values with 0.
m <- merge(y, xts(,timesteps), fill=0)
# 4) Align all index values to 30-minute intervals
a <- align.time(m, 60*30)
# 5) Sum the values for Volume in each period
half_hour <- period.apply(a, endpoints(a, "minutes", 30), sum)
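
For comparison, the same count-per-interval idea can be sketched in base R without xts, using `cut()` on a regular grid of breaks so that empty intervals show up as 0. This is only an illustrative alternative, not the method above: the variable names, the `"UTC"` time zone, and the hard-coded number of breaks are assumptions, and the rows are labeled by interval start rather than interval end.

```r
# Event timestamps from the question (time zone assumed to be UTC)
times <- as.POSIXct(c(
  "1996-02-05 00:34:00", "1996-02-05 00:51:00", "1996-02-05 00:52:00",
  "1996-02-05 01:04:00", "1996-02-05 01:19:00", "1996-02-05 05:00:00",
  "1996-02-05 05:07:00", "1996-02-05 05:08:00", "1996-02-05 05:14:00"
), tz = "UTC")

# Regular 30-minute breaks from midnight: 00:00, 00:30, ..., 05:30
day_start <- as.POSIXct("1996-02-05 00:00:00", tz = "UTC")
breaks <- seq(day_start, by = "30 min", length.out = 12)

# Assign each event to a half-open interval [start, start + 30 min);
# table() keeps intervals with no events, giving them a count of 0
bins <- cut(times, breaks = breaks)
counts <- as.vector(table(bins))

result <- data.frame(interval_start = breaks[-length(breaks)], count = counts)
print(result)
```

To sum actual volumes instead of counting events, `tapply(volume, bins, sum)` over the same `bins` would be one option, with the resulting `NA` values for empty intervals replaced by 0.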
answered 2013-04-10T17:03:22.227


I was a bit confused by step 3) mentioned above, so this is what I did:

library("lubridate")
library("xts")
my_data <- read.csv("my_data.csv", stringsAsFactors=FALSE, sep=",", 
header=T) 
colnames(my_data) <- c("Time", "PAR", "NDVI", "LWS")
# It is easier if you subset your data
my_data_short <- subset(my_data, select = c("Time", "NDVI"))
my_data_short$Time <- ymd_hm(my_data_short$Time, tz="UTC")
beginning <- as.POSIXct("2016-05-12 00:00",format = "%Y-%m-%d %H:%M", 
tz="UTC")
end <- as.POSIXct("2016-06-05 00:00",format = "%Y-%m-%d %H:%M", tz="UTC")
timesteps <- seq(beginning, end, by="5 min")
volume <- rep_len(1, length.out=length(timesteps))
time_series <- data.frame(timesteps, volume)
# Left join onto the regular grid; named `merged` so it does not shadow merge()
merged <- merge(time_series, my_data_short, by.x="timesteps", by.y="Time",
all.x=TRUE, all.y=FALSE)

# This formats your data to run the package xts
my_data_brief.xts <- xts(x=merged$NDVI, order.by=merged$timesteps, frequency=1,
tzone="UTC")

# Align all index values to 30-minute intervals
a <- align.time(my_data_brief.xts, 60*30)
# Sum the NDVI values in each period, ignoring the NAs from unmatched timesteps
result <- period.apply(a, endpoints(a, "minutes", 30), sum, na.rm=TRUE)

saveRDS(result, file="result.rds")
answered 2017-04-28T16:04:58.730