I have a data set with 41 rows and 21 columns. In df, each row holds the energy data for one 15-minute interval of the day (from 10am to 8pm), and each column represents one selected day within a month.

I need to calculate the load variability (standard deviation / mean) between rows within each column, using the following equation:

http://i.stack.imgur.com/inOKV.jpg

That is, over the 1st and 2nd elements of each column; then the 1st through 3rd; the 1st through 4th; the 1st through 5th; and so on.

I keep getting NA values in lv and wonder why. The end result, lv, should be a data frame of 41x21, the same shape as df but holding the load variability.

Also, how do I get the 2.5 and 97.5 percentiles within the loop, in addition to the load variability?

x <- df[1:41,1:21]

#calculate load variability 
count = 0
i=1{
for (i in 1:41){
     count = count+1  
     mean = sum (x[1:l,])/count
     diff = ((x-mean)^2)
     lv= sqrt((diff/(count+1)-1)/mean)
         i = i+1
  }
}
lv

lv ends up with null values (NA).

2 Answers

If you want to calculate sd/mean for each row, try:

apply(x, 1, sd)/rowMeans(x)

If you want the 2.5% and 97.5% percentiles for each row, try:

apply(x, 1, quantile, c(.025, 0.975))
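
For the cumulative case described in the question (1st to 2nd element, 1st to 3rd, and so on down each column), the same quantile call can be applied to growing prefixes of a column. The following is only a minimal sketch, assuming x is a numeric matrix or data frame; cumul_quantiles is a hypothetical helper name that does not appear in the answer above, and the toy data is made up for illustration.

```r
# Hypothetical helper: 2.5% and 97.5% percentiles over the first k values
# of a vector, for k = 1, 2, ..., length(v).
cumul_quantiles <- function(v, probs = c(0.025, 0.975)) {
  # sapply returns a matrix with one column per prefix length k
  res <- sapply(seq_along(v), function(k) {
    quantile(v[1:k], probs = probs, na.rm = TRUE, names = FALSE)
  })
  # transpose so that row k holds the percentiles of v[1:k]
  res <- t(res)
  colnames(res) <- paste0("p", probs * 100)
  res
}

# Toy data standing in for df (illustrative values only)
set.seed(1)
x <- matrix(runif(12, 10, 20), nrow = 4, ncol = 3)

# One 4 x 2 matrix of cumulative percentiles per column of x
q_list <- lapply(seq_len(ncol(x)), function(j) cumul_quantiles(x[, j]))
q_list[[1]]
```

Row k of each result holds the two percentiles of the first k values in that column, so the last row matches what quantile gives for the whole column. Recomputing the quantiles for every prefix is quadratic in the column length, which should be harmless for 41 rows.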
answered 2012-08-14T13:18:43.680

OK, after a few attempts (and some help from this question), I finally have:

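cumul_loading <- function(x, leave.nan=FALSE){
  # ind_na is TRUE where x is NOT missing; nn counts non-missing values so far
  ind_na <- !is.na(x)
  nn <- cumsum(ind_na)
  # zero out NAs so they do not contribute to the cumulative sums
  x[!ind_na] <- 0

  cumul_mean <- cumsum(x) / nn
  # running sample sd: sqrt((sum(x^2) - sum(x)^2 / n) / (n - 1))
  cumul_sd <- sqrt(cumsum(x^2) / (nn-1) - (cumsum(x))^2/(nn-1)/nn)

  # the first element is NaN (n - 1 = 0); drop it unless leave.nan is TRUE
  if(leave.nan) return(cumul_sd / cumul_mean) else 
    return((cumul_sd / cumul_mean)[-1])
}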

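It probably still has some bugs (e.g. in how NAs are handled), but for now it should work with the apply function. The leave.nan argument optionally keeps the NaN that is produced when n_len - 1 = 0, i.e. for the first element.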

apply(x, 2, cumul_loading)
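
One way to gain some confidence in the running-sum formula is to compare it with a direct sd()/mean() over each prefix of a column. The sketch below assumes NA-free data; naive_loading is a hypothetical reference helper introduced only for this check, the random matrix is a stand-in for df, and cumul_loading is repeated so that the snippet runs on its own.

```r
# Copy of cumul_loading from above, so this snippet is self-contained
cumul_loading <- function(x, leave.nan = FALSE) {
  ind_na <- !is.na(x)
  nn <- cumsum(ind_na)
  x[!ind_na] <- 0
  cumul_mean <- cumsum(x) / nn
  cumul_sd <- sqrt(cumsum(x^2) / (nn - 1) - (cumsum(x))^2 / (nn - 1) / nn)
  if (leave.nan) cumul_sd / cumul_mean else (cumul_sd / cumul_mean)[-1]
}

# Hypothetical reference: sd / mean of the first k values, for k = 2..n
naive_loading <- function(v) {
  sapply(2:length(v), function(k) sd(v[1:k]) / mean(v[1:k]))
}

# Toy data with the same shape as df (41 x 21, no NAs, made-up values)
set.seed(42)
x <- matrix(runif(41 * 21, 50, 100), nrow = 41, ncol = 21)

lv <- apply(x, 2, cumul_loading)      # running-sum version
lv_ref <- apply(x, 2, naive_loading)  # direct per-prefix version

dim(lv)                 # one row fewer than x, since the first row is dropped
all.equal(lv, lv_ref)   # compares the two results up to floating-point tolerance
```

If a 41x21 result is needed, as in the question, apply(x, 2, cumul_loading, leave.nan = TRUE) keeps the first row, which is NaN because a standard deviation is undefined for a single value.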
answered 2012-08-14T14:10:13.323