
I have a large dataset: about 60k records covering 63 years of data.

I need to produce a plot of the mean number of events for each calendar day over the 63-year period, resulting in a dataframe like:

Date Frequency
DDMM    5.2

First question: how can I convert the dates from DD/MM/YYYY to DD/MM to allow for grouping?

Second: what function is best for producing an average value for each day over the time span of the data set?

I have looked at aggregate and cumsum but failed miserably, as I have not been able to group by DDMM and take the mean value.

Update:

 esums <- with(TorData,
           tapply(Count, 
                  format( as.Date(Date, "%d/%m/%Y"), "%d/%m"), 
                  sum, na.rm=TRUE) )

 Data <- esums/63

The result looks like this:

    01/01     01/02     01/03     01/04     01/05     01/06     01/07     01/08 
    0.4444444 0.6190476 2.1428571 1.8095238 4.9365079 5.4920635 4.0000000 1.7301587 
    01/09     01/10     01/11     01/12     02/01     02/02     02/03     02/04 
    1.4444444 1.1904762 0.9206349 0.4126984 0.8412698 0.7936508 2.3015873 4.9206349 
    02/05     02/06     02/07     02/08     02/09     02/10     02/11     02/12 
    4.7936508 6.4920635 3.8888889 2.0317460 1.5714286 0.7936508 0.4603175 1.0634921 

Convert to a data frame:

    Data<-as.data.frame(Data)

The data was in an array and needed converting to a data frame. It now looks like this:

       Data
 01/01 0.4444444
 01/02 0.6190476
 01/03 2.1428571
 01/04 1.8095238
 01/05 4.9365079
 01/06 5.4920635
 01/07 4.0000000
 01/08 1.7301587
 01/09 1.4444444
 01/10 1.1904762
 01/11 0.9206349
 01/12 0.4126984

What I require for the line plot is two columns, one with Date and the other with Mean. Date appears to have lost its data type?


1 Answer


Try something like this (untested in the absence of a reproducible example):

 # Sum of the event counts for each day/month, ignoring missing counts
 esums <- with(my_dataframe,
            tapply(event_count, 
                   format( as.Date(my_dates, "%d/%m/%Y"), "%d/%m"), 
                   sum, na.rm=TRUE) )
 # Number of non-missing counts for each day/month
 enums <-  with(my_dataframe,
            tapply(!is.na(event_count), 
                   format( as.Date(my_dates, "%d/%m/%Y"), "%d/%m"), 
                   sum) )
 mean_by_day_of_year <- esums/enums
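
As a quick check of the grouping idea, here is a minimal sketch on made-up data. The toy data frame and its values are invented for illustration, and the column names simply reuse the placeholders above. It calls tapply with mean directly, which averages over the non-missing records for each day/month. If a date can appear more than once, or some years have no record for a day, dividing the sums by the number of years (as in the question) may be the better choice.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  my_dates    = c("01/01/1950", "01/01/1951",
                  "02/01/1950", "02/01/1951", "02/01/1952"),
  event_count = c(2, 4, 1, NA, 3),
  stringsAsFactors = FALSE
)

# Day/month key: drop the year so the same calendar day groups together
day_month <- format(as.Date(toy$my_dates, "%d/%m/%Y"), "%d/%m")

# Mean of the non-missing counts for each day/month
mean_by_day <- tapply(toy$event_count, day_month, mean, na.rm = TRUE)
print(mean_by_day)
```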

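The data frame you created does not hold real dates: without a year the d/m values are not Dates, there is no day-of-year data type, and as.data.frame converted the d/m labels to row names. You can still draw it as a line plot by using the sequence index as the x values, setting xaxt="n", and then drawing informative labels with axis(1, ...)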

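 dat <- read.table(text="Data
 01/01 0.4444444
 01/02 0.6190476
 01/03 2.1428571
 01/04 1.8095238
 01/05 4.9365079
 01/06 5.4920635
 01/07 4.0000000
 01/08 1.7301587
 01/09 1.4444444
 01/10 1.1904762
 01/11 0.9206349
 01/12 0.4126984", header=TRUE)
 # Plot against the row index as a line, with the default x axis suppressed
 plot(dat$Data, type="l", xaxt="n")
 # Label each position with its d/m row name
 axis(1, at=1:nrow(dat), labels=rownames(dat), las=2)
 # The same plot written to a PNG file (default file name, since none is given)
 png(); plot(dat$Data, type="l", xaxt="n")
        axis(1, at=1:nrow(dat), labels=rownames(dat), las=2)
 dev.off()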
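
Note that tapply orders the d/m labels as text, so "01/02" (1 February) comes straight after "01/01". If a chronological x axis and a two-column data frame are wanted, one possible approach is to attach a dummy year and convert to Date. The sketch below is not part of the code above: it uses a handful of values from the question's output, and both the year 2000 (picked here only because it is a leap year, so 29/02 would parse) and the column names DayMonth, Mean and Date are assumptions for illustration.

```r
# A few of the day/month means from the question's output
means <- c("01/01" = 0.4444444, "01/02" = 0.6190476, "01/03" = 2.1428571,
           "02/01" = 0.8412698, "02/02" = 0.7936508)

# Two columns: the d/m label and its mean
df <- data.frame(DayMonth = names(means), Mean = as.numeric(means),
                 stringsAsFactors = FALSE)

# Attach a dummy leap year (2000, an arbitrary choice) to get a real Date
df$Date <- as.Date(paste0(df$DayMonth, "/2000"), "%d/%m/%Y")

# Sort chronologically, then draw a line with d/m labels instead of full dates
df <- df[order(df$Date), ]
plot(df$Date, df$Mean, type = "l", xaxt = "n",
     xlab = "Day of year", ylab = "Mean events")
axis.Date(1, at = df$Date, format = "%d/%m", las = 2)
```

With the full 366-row result, labelling every date would be unreadable, so passing a thinner set of dates to at (for example the first of each month) is probably preferable.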


answered 2014-07-18T17:04:56.887