
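I want to write a function that computes a moving average over a variable number of preceding observations, for different variables. Use this as mock data: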

df = expand.grid(site = factor(seq(10)),
                 year = 2000:2004,
                 day = 1:50)
df$temp = rpois(dim(df)[1], 5) 

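Doing this for one variable and a fixed number of preceding observations works. For example, this computes the average temperature over the previous 5 days: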

library(dplyr)
library(zoo)

df <- df %>%
  group_by(site, year) %>%
  arrange(site, year, day) %>%
  mutate(almost_avg = rollmean(x = temp, 5, align = "right", fill = NA)) %>%
  mutate(avg = lag(almost_avg, 1))
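
To make the intent of the two mutate() steps concrete, here is a minimal base-R sketch on a made-up vector. The values and the names temp_demo, trailing and avg_prev are illustrative assumptions, not part of the data above. It assumes that rollmean(..., align = "right", fill = NA) yields a trailing mean that includes the current day, and that the lag then shifts it so only earlier days contribute.

```r
# Illustrative values only; not taken from the mock data above.
temp_demo <- c(2, 4, 6, 8, 10, 12, 14)
k <- 5

# Trailing mean over the current day and the k - 1 days before it
# (NA until k observations are available).
trailing <- sapply(seq_along(temp_demo), function(i) {
  if (i < k) NA else mean(temp_demo[(i - k + 1):i])
})

# Shift by one position so each day only sees the k days before it.
avg_prev <- c(NA, head(trailing, -1))

print(trailing)  # NA NA NA NA  6  8 10
print(avg_prev)  # NA NA NA NA NA  6  8
```

So the value for day 6 is the mean of days 1 to 5, and the first 5 days have no value.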

So far, so good. But my attempt to wrap this in a function fails.

avg_last_x <- function(dataframe, column, last_x) {

  dataframe <- dataframe %>%
    group_by(site, year) %>%
    arrange(site, year, day) %>%
    mutate(almost_avg = rollmean(x = column, k = last_x, align = "right", fill = NA)) %>%
    mutate(avg = lag(almost_avg, 1))

  return(dataframe)
}

avg_last_x(dataframe = df, column = "temp", last_x = 10)

I get this error:

Error in mutate_impl(.data, dots) : k <= n is not TRUE 

I know this probably has to do with the evaluation mechanism in dplyr, but I could not work out a fix.

Thanks in advance for your help.

1 Answer

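This should fix it. Inside mutate(), column is just the string "temp", a character vector of length 1, not the temp column. rollmean() therefore sees a single element, and with k = 10 its check k <= n fails. Building the expression with interp() and passing it to mutate_() substitutes the column name as a symbol instead: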

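library(dplyr)
library(zoo)
library(lazyeval)

avg_last_x <- function(dataframe, column, last_x) {
  dataframe %>% 
    group_by(site, year) %>% 
    arrange(site, year, day) %>% 
    # interp() replaces the placeholder c with the symbol named in `column`,
    # so rollmean() sees the actual column instead of a string
    mutate_(almost_avg = interp(~rollmean(x = c, k = last_x, align = "right", 
                                          fill = NA), c = as.name(column)),
            # lag by one row so the current day is excluded from its own average
            avg = ~lag(almost_avg, 1))
}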
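
In more recent dplyr releases the underscored verbs such as mutate_() are deprecated in favour of tidy evaluation. As a hedged alternative, the sketch below does the same thing with the .data pronoun. It assumes a dplyr version that supports .data[[ ]] inside mutate(). The function name avg_last_x_tidy and the reduced mock data are hypothetical and chosen only for illustration.

```r
library(dplyr)
library(zoo)

# Hypothetical variant of avg_last_x(); `column` is still passed as a string.
avg_last_x_tidy <- function(dataframe, column, last_x) {
  dataframe %>%
    group_by(site, year) %>%
    arrange(site, year, day) %>%
    mutate(
      # .data[[column]] looks up the column whose name is stored in `column`
      almost_avg = rollmean(.data[[column]], k = last_x,
                            align = "right", fill = NA),
      # shift by one row so the current day is excluded from its own average
      avg = dplyr::lag(almost_avg, 1)
    )
}

# Smaller mock data in the same shape as the question's (an assumption for brevity).
set.seed(1)
demo <- expand.grid(site = factor(seq(2)), year = 2000:2001, day = 1:20)
demo$temp <- rpois(nrow(demo), 5)

result <- avg_last_x_tidy(dataframe = demo, column = "temp", last_x = 10)
head(result, 12)
```

Calling dplyr::lag() explicitly avoids picking up stats::lag() if the packages are attached in a different order.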
Answered 2017-01-07T13:18:21.330