This is a list:
1 2 3 4 5 6 7 8 9 10
I want to compute the mean of every three consecutive elements. For example, the output would be NA NA 2 3 4 5 6 7 8 9.
How can I do this?
Regards
You can try ?embed and ?rowMeans:
v <- 1:10
m <- embed(v, 3)
m
# [,1] [,2] [,3]
#[1,] 3 2 1
#[2,] 4 3 2
#[3,] 5 4 3
#[4,] 6 5 4
#[5,] 7 6 5
#[6,] 8 7 6
#[7,] 9 8 7
#[8,] 10 9 8
rowMeans(m)
# [1] 2 3 4 5 6 7 8 9
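Note that the embed result has only 8 values, while the question asks for an output padded with NA at the front. A minimal sketch of one way to add the padding, assuming a hypothetical helper name `roll_mean_right` (it is not part of base R or of the answer above):

```r
# Hypothetical helper: trailing (right-aligned) rolling mean built on embed().
# `roll_mean_right` is an illustrative name, not an existing R function.
roll_mean_right <- function(v, k = 3) {
  stopifnot(k >= 1, length(v) >= k)
  # embed() yields one row per complete window of k consecutive values
  window_means <- rowMeans(embed(v, k))
  # the first k - 1 positions have no complete window, so pad them with NA
  c(rep(NA_real_, k - 1), window_means)
}

v <- 1:10
roll_mean_right(v, 3)
# [1] NA NA  2  3  4  5  6  7  8  9
```

The window size k is a parameter here only for illustration; with k = 3 it reproduces the output requested in the question.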
Edit: another solution is ?filter:
filter(x=v, filter=rep(1/3, 3), sides=1)
# Time Series:
# Start = 1
# End = 10
# Frequency = 1
# [1] NA NA 2 3 4 5 6 7 8 9
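If another attached package masks `filter` (dplyr is a common case), the call above may not reach the stats version. A sketch that spells out the namespace and drops the time-series attributes, assuming a plain numeric vector is what you want back:

```r
v <- 1:10
# stats::filter avoids accidentally calling a filter() from another package
ma <- stats::filter(v, filter = rep(1/3, 3), sides = 1)
# as.numeric() strips the "ts" class so the result is a plain vector
res <- as.numeric(ma)
res
```

Because the weights are 1/3, the values may differ from the exact means by floating-point rounding, so compare them with all.equal rather than ==.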
You can even use rollapply or rollmean from the zoo package:
> library(zoo)
> v <- 1:10
> rollapply(v, width=3, align="right", FUN=mean, fill=NA )
[1] NA NA 2 3 4 5 6 7 8 9
> rollmean(v, k=3, align="right", fill=NA )
[1] NA NA 2 3 4 5 6 7 8 9
A fourth way uses the lag function:
v <- 1:10
rowMeans(do.call(cbind, lapply(0:2, lag, x=as.ts(v))))
# [1] NA NA 2 3 4 5 6 7 8 9 NA NA
You can wrap this in na.omit to remove the NAs.
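A sketch of that na.omit wrapping, under two assumptions of mine: `stats::lag` is written out in case another package masks `lag`, and `as.numeric` is added to strip the attributes that na.omit attaches to its result:

```r
v <- 1:10
# shift the series by 0, 1 and 2 steps; stats::lag is explicit to avoid masking
shifted <- lapply(0:2, function(k) stats::lag(as.ts(v), k))
# cbind() on ts objects aligns them on the union of their time indices
means <- rowMeans(do.call(cbind, shifted))
# na.omit() drops the incomplete windows; as.numeric() strips its attributes
as.numeric(na.omit(means))
```

This leaves only the 8 complete-window means, so it matches the embed result rather than the NA-padded output in the question.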
Benchmarks
library(microbenchmark)
library(zoo)
v <- 1:10000
f.embed <- function() rowMeans(embed(v, 3))
f.filter <- function() filter(x=v, filter=rep(1/3, 3), sides=1)
f.lag <- function() rowMeans(do.call(cbind, lapply(0:2, lag, x=as.ts(v))))
f.rollmean <- function() rollapply(v, width=3, align="right", FUN=mean, fill=NA)  # note: this times rollapply with FUN=mean, not rollmean
microbenchmark(f.embed(), f.filter(), f.lag(), f.rollmean())
# Unit: microseconds
# expr min lq median uq max neval
# f.embed() 486.7 499.8 505.6 517.1 1633.1 100
# f.filter() 285.3 300.7 307.2 316.6 912.5 100
# f.lag() 1601.6 1640.9 1677.0 2188.3 2838.7 100
# f.rollmean() 4265.4 4853.5 4902.0 5364.8 52098.9 100
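The benchmarked functions do not return identically shaped results (the embed version has no NA padding, while the lag version has NAs at both ends), so it may be worth confirming that they agree on the complete windows before comparing speed. A rough sketch using only the base-R approaches; the zoo calls are left out so that it runs without extra packages, and the variable names are my own:

```r
v <- 1:10000
# complete-window means from each base-R approach, with NAs removed
r.embed  <- rowMeans(embed(v, 3))
r.filter <- as.numeric(na.omit(stats::filter(v, rep(1/3, 3), sides = 1)))
r.lag    <- as.numeric(na.omit(rowMeans(do.call(
  cbind, lapply(0:2, function(k) stats::lag(as.ts(v), k))))))
# all.equal() tolerates floating-point rounding in the filter result
agree <- isTRUE(all.equal(r.embed, r.filter)) && isTRUE(all.equal(r.embed, r.lag))
agree
```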