
I have the following example:

Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"), 
             to = as.POSIXct("2010-10-10 22:00"), by = 3600)
Dat <- data.frame(DateTime = Date1,
                  t = rnorm(length(Date1)))

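I want to find the range of the values (i.e. maximum minus minimum) for each day.

First, I defined additional columns that identify each unique day, once as a calendar date and once as a day of year (doy).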

Dat$date <- format(Dat$DateTime, format = "%Y-%m-%d") # calendar date of each observation
Dat$doy <- as.numeric(format(Dat$DateTime, format="%j")) # day of year of each observation

Then, to find the range, I tried:

by(Dat$t, Dat$doy, function(x) range(x))

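But this returns the range as two values (the minimum and the maximum) rather than a single value. So my question is: how can I compute the range for each day and return the results in a data.frame like this: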

new_data <- data.frame(date = unique(Dat$date),
                       range = ...)

Can anyone suggest a way to do this?


2 Answers

# Use the data.table package
library(data.table)

# Set seed so data is reproducible 
set.seed(42)

# Create data.table
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00"), to = as.POSIXct("2010-10-10 22:00"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))

# Set key on data.table so that it is sorted by date
setkey(DT, "date")

# Make a new data.table with the required information (can be used as a data.frame)
new_data <- DT[, diff(range(t)), by = date]

#            date       V1
# 1:   2010-05-01 4.943101
# 2:   2010-05-02 4.309401
# 3:   2010-05-03 4.568818
# 4:   2010-05-04 2.707036
# 5:   2010-05-05 4.362990
# ---                    
# 159: 2010-10-06 2.659115
# 160: 2010-10-07 5.820803
# 161: 2010-10-08 4.516654
# 162: 2010-10-09 4.010017
# 163: 2010-10-10 3.311408
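
The result above carries the default column name `V1`. As a minimal sketch, assuming a reasonably recent data.table where the `.()` alias for `list()` is available, the column can be named directly so that the output matches the `date` / `range` layout asked for in the question. The explicit `tz = "UTC"` is an assumption added here only to make the day boundaries reproducible across machines.

```r
library(data.table)

set.seed(42)

# Hourly timestamps; tz = "UTC" is an assumption for reproducibility
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00", tz = "UTC"),
             to = as.POSIXct("2010-10-10 22:00", tz = "UTC"), by = 3600)
DT <- data.table(date = as.IDate(Date1), t = rnorm(length(Date1)))

# One row per day, with the max - min spread in a column named "range"
new_data <- DT[, .(range = diff(range(t))), by = date]
```

`diff(range(t))` and `max(t) - min(t)` give the same number here; the former simply reuses the two values that `range()` already returns.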
answered 2013-09-01T11:15:56.143

I tend to use tapply for this kind of thing. ave is sometimes useful too. Here:

> dr = tapply(Dat$t,Dat$doy,function(x){diff(range(x))})

Always check the tricky stuff:

> dr[1]
     121 
3.084317 
> diff(range(Dat$t[Dat$doy==121]))
[1] 3.084317

Use the names attribute to get the days alongside the values and build the data frame:

> new_data = data.frame(date=names(dr),range=dr)
> head(new_data)
    date    range
121  121 3.084317
122  122 4.204053
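
If calendar dates are preferred over day-of-year labels, the same tapply approach can be grouped on the `date` column from the question instead of `doy`. This is a sketch rather than part of the original answer; `tz = "UTC"` and `set.seed(1)` are assumptions added so the example is reproducible.

```r
set.seed(1)

# Rebuild the example data; tz = "UTC" is an assumption for reproducibility
Date1 <- seq(from = as.POSIXct("2010-05-01 02:00", tz = "UTC"),
             to = as.POSIXct("2010-10-10 22:00", tz = "UTC"), by = 3600)
Dat <- data.frame(DateTime = Date1, t = rnorm(length(Date1)))
Dat$date <- format(Dat$DateTime, format = "%Y-%m-%d")

# Group by the date string; each group collapses to max - min
dr <- tapply(Dat$t, Dat$date, function(x) diff(range(x)))

# names(dr) holds the date strings; as.vector() drops the array attributes
new_data <- data.frame(date = as.Date(names(dr)),
                       range = as.vector(dr),
                       row.names = NULL)
```

Because the strings are in `%Y-%m-%d` form, the groups come out in chronological order.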

Do you want to convert the numeric day of year back into a date object?
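
If so, one possible sketch: since day 1 of the year is January 1, a day of year can be turned into a `Date` by adding `doy - 1` days to January 1 of the relevant year. The year (2010) is taken from the example data; the variable names `doy`, `year` and `dates` are only illustrative.

```r
# Day-of-year values as produced by format(x, "%j") and as.numeric()
doy <- c(121, 122, 283)
year <- 2010

# Day 1 is January 1, so offset by doy - 1 days from the start of the year
dates <- as.Date(doy - 1, origin = paste0(year, "-01-01"))
dates
# [1] "2010-05-01" "2010-05-02" "2010-10-10"
```

This only works when the data cover a single known year; across several years the day of year alone is ambiguous, and grouping on the full date is safer.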

answered 2013-09-01T11:06:47.207