I am working with a dataset that has temperature readings once an hour, 24 hrs a day for 100+ years. I want to get an average temperature for each day to reduce the size of my dataset. The headings look like this:

    YR MO DA HR MN TEMP
  1943  6 19 10  0   73
  1943  6 19 11  0   72
  1943  6 19 12  0   76
  1943  6 19 13  0   78
  1943  6 19 14  0   81
  1943  6 19 15  0   85
  1943  6 19 16  0   85
  1943  6 19 17  0   86
  1943  6 19 18  0   86
  1943  6 19 19  0   87

etc., for 600,000+ data points.

How can I run a nested function to calculate the daily average temperature so that I preserve YR, MO, DA and TEMP? Once I have this, I want to be able to look at long-term averages and calculate, say, the average temperature for the month of January across 30 years. How do I do this?

3 Answers

You can do this in one step:

 meanTbl <- with(dat, tapply(TEMP, ISOdate(YR, MO, DA), mean) )

This gives you the values indexed by date-time. If you want just the date as the character index, without the trailing time:

meanTbl <- with(dat, tapply(TEMP, as.Date(ISOdate(YR, MO, DA)), mean) )

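Monthly means can then be computed from the daily means:

 # meanTbl is a named array, not a data frame, so group it by the month part of its date names
 monMeans <- tapply(meanTbl, format(as.Date(names(meanTbl)), "%m"), mean)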
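
As a rough, self-contained sketch of the tapply approach end to end: the data frame `dat` below is made up for illustration, and grouping by `format(..., "%m")` is just one assumed way to get from daily to monthly means.

```r
# Made-up sample: two readings on each of three days, spanning two months.
dat <- data.frame(
  YR   = 1943,
  MO   = c(6, 6, 6, 6, 7, 7),
  DA   = c(19, 19, 20, 20, 1, 1),
  TEMP = c(70, 74, 80, 84, 90, 94)
)

# Daily means, indexed by date strings such as "1943-06-19".
daily <- with(dat, tapply(TEMP, as.Date(ISOdate(YR, MO, DA)), mean))

# Monthly means of the daily means, grouped by the month part of the date.
month_of <- format(as.Date(names(daily)), "%m")
monthly  <- tapply(daily, month_of, mean)
```

Note that a mean of daily means weights every day equally, so it can differ from the mean of all hourly readings in a month when some days have missing hours.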
answered 2013-02-27T06:44:58.783

You can do this with aggregate:

# daily means
aggregate(TEMP ~ YR + MO + DA, FUN=mean, data=data) 

# monthly means 
aggregate(TEMP ~ YR + MO, FUN=mean, data=data)

# yearly means
aggregate(TEMP ~ YR, FUN=mean, data=data)

# monthly means independent of year
aggregate(TEMP ~ MO, FUN=mean, data=data)
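
For the 30-year January average asked about in the question, one possible sketch is to subset before aggregating. The sample data frame is made up, and the 1943 to 1972 window and the names `dat`, `jan`, `jan_by_year` and `jan_normal` are assumptions for illustration only.

```r
# Made-up sample: two readings each for January and February in three years.
dat <- data.frame(
  YR   = rep(c(1943, 1944, 1945), each = 4),
  MO   = rep(c(1, 1, 2, 2), times = 3),
  DA   = 1,
  HR   = rep(c(0, 12), times = 6),
  MN   = 0,
  TEMP = c(30, 34, 40, 44, 28, 32, 41, 43, 31, 37, 39, 45)
)

# Keep only January readings inside the chosen window (the window is an assumption).
jan <- subset(dat, MO == 1 & YR >= 1943 & YR <= 1972)

# One January mean per year, then the mean of those yearly means.
jan_by_year <- aggregate(TEMP ~ YR, FUN = mean, data = jan)
jan_normal  <- mean(jan_by_year$TEMP)
```

Averaging the yearly means gives each year equal weight; `mean(jan$TEMP)` would instead weight each reading equally, which matters if some years have gaps.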
answered 2013-02-27T06:54:45.260

Your first question can be solved using the plyr package:

library(plyr)
daily_mean = ddply(df, .(YR, MO, DA), summarise, mean_temp = mean(TEMP))

Similarly to the solution above, to get monthly means:

monthly_mean = ddply(df, .(YR, MO), summarise, mean_temp = mean(TEMP))

Or, to get monthly means over the whole dataset (30 years, i.e. the climate normals) rather than per year:

monthly_mean_normals = ddply(df, .(MO), summarise, mean_temp = mean(TEMP))
answered 2013-02-27T06:45:05.507