r - Aggregate an entire data frame using a weighted mean

Question

I am trying to aggregate a data frame using the weighted.mean function and keep getting an error. My data looks like this:

dat <- data.frame(date, nWords, v1, v2, v3, v4 ...)

I tried something like:

aggregate(dat, by = list(dat$date), weighted.mean, w = dat$nWords)

but got

 Error in weighted.mean.default(X[[1L]], ...) : 
  'x' and 'w' must have the same length

There is another thread that answers this question using plyr, but only for a single variable. I want to aggregate all of my variables this way.

score 1 · Accepted Answer

You can do this with data.table:

 library(data.table)

 #set up your data

 dat <- data.frame(date = c("2012-01-01","2012-01-01","2012-01-01","2013-01-01",
 "2013-01-01","2013-01-01","2014-01-01","2014-01-01","2014-01-01"), 
 nwords = 1:9, v1 = rnorm(9), v2 = rnorm(9), v3 = rnorm(9))

 #make it into a data.table

 dat = data.table(dat, key = "date")

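 # grab the column names we want, generalized for v1:vwhatever
 # (named cols rather than c, so base::c is not masked)

 cols = colnames(dat)[-c(1,2)]

 # get the weighted mean by date for each column;
 # (n) := assigns to the column whose name is stored in n
 # (this replaces the deprecated with = FALSE form)

 for(n in cols){
 dat[,
     (n) := weighted.mean(get(n), nwords),
     by = date]
 }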

 #keep only the unique dates and weighted means

 wms = unique(dat[,nwords:=NULL])
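The loop above can also be written as a single grouped call. The following is a minimal sketch, not part of the original answer, assuming the weight column is named nwords and that every column other than date and nwords should be averaged. The example values and the name value_cols are made up for illustration.

```r
library(data.table)

# Hypothetical example data with fixed values so the result is easy to check
dat <- data.table(
  date   = rep(c("2012-01-01", "2013-01-01"), each = 3),
  nwords = c(1, 2, 3, 1, 1, 2),
  v1     = c(10, 20, 30, 4, 4, 8),
  v2     = c(1, 1, 1, 0, 2, 4)
)

# Every column except the grouping column and the weight column
value_cols <- setdiff(names(dat), c("date", "nwords"))

# One row per date; each value column is reduced to its weighted mean,
# with nwords subset to the same group as the values
wms <- dat[, lapply(.SD, weighted.mean, w = nwords),
           by = date, .SDcols = value_cols]
```

Unlike the loop, this version returns a new table and leaves dat unchanged, so no unique() step is needed afterwards.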

score 0 · Accepted Answer

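Try using by:

# your numeric data
x <- 111:120

# the weights
ww <- 10:1

# the group variable (in your case it is 'date')
y <- c(rep("A", 7), rep("B", 3))

# keep values, weights and groups in one data frame so that
# each group's values and weights are subset together
dd <- data.frame(x, ww, y)

# weighted mean of x within each group, using that group's weights
by(dd, dd$y, function(d) weighted.mean(d$x, d$ww))

If you want the result in a data frame, I suggest using the plyr package:

plyr::ddply(dd, "y", plyr::summarise, wm = weighted.mean(x, ww))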
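The aggregate call in the question fails because aggregate passes weighted.mean one group's slice of a column at a time, while w = dat$nWords is still the full-length vector, so the lengths differ. A minimal base R sketch that covers every value column might look like the following. It is not taken from either answer; the example values and the names value_cols, pieces and wm_rows are made up for illustration.

```r
# Hypothetical example data with fixed values so the result is easy to check
dat <- data.frame(
  date   = rep(c("2012-01-01", "2013-01-01"), each = 3),
  nWords = c(1, 2, 3, 1, 1, 2),
  v1     = c(10, 20, 30, 4, 4, 8),
  v2     = c(1, 1, 1, 0, 2, 4)
)

# Every column except the grouping column and the weight column
value_cols <- setdiff(names(dat), c("date", "nWords"))

# Split rows by date so that values and weights are subset together
pieces <- split(dat, dat$date)

# For each group, take the weighted mean of every value column
wm_rows <- lapply(pieces, function(d) {
  vapply(d[value_cols], weighted.mean, numeric(1), w = d$nWords)
})

# Bind the per-group results into one data frame, one row per date
wms <- data.frame(date = names(wm_rows), do.call(rbind, wm_rows))
rownames(wms) <- NULL
```

The key point is that the weights are taken from d, the per-group subset, rather than from the full data frame.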
