This looks like a good fit for data.table, using lapply(.SD, FUN) together with the .SDcols argument.
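.SD is a data.table containing the subset of x's data for each group, excluding the grouping columns (here x means the data.table being queried, not a column named x).
.SDcols is a vector containing the names of the columns you want to apply the function (FUN) to.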
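To make the roles of .SD and .SDcols concrete, here is a minimal sketch on a made-up table. The names toy, g, a, b and c are hypothetical and only serve the illustration; they are not from the question.

```r
library(data.table)

# Hypothetical toy data: g is the grouping column, a/b/c are value columns.
toy <- data.table(g = c("p", "p", "q"), a = 1:3, b = 4:6, c = 7:9)

# Without .SDcols, .SD holds every column except the grouping column g.
sd_cols <- toy[, .(cols_in_SD = paste(names(.SD), collapse = ",")), by = g]
print(sd_cols)

# With .SDcols, .SD is restricted to the named columns before FUN is applied.
res <- toy[, lapply(.SD, sum), by = g, .SDcols = c("a", "c")]
print(res)
```

The result has one row per group and one column per entry in .SDcols, plus the grouping column.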
An example
Setting up the data.table
library(data.table)
DT <- as.data.table(df)
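The df used above comes from the question and is not reproduced here. To run the examples yourself, a stand-in with the same column names could be built roughly like this. The seed, the row count and the distributions are assumptions, so the numbers will not match the output shown in this answer.

```r
library(data.table)

# Hypothetical stand-in for the question's df (NOT the original data):
# a grouping column f with values 1..10 and three numeric columns x, y, z.
set.seed(1)
df <- data.frame(
  f = sample(1:10, 100, replace = TRUE),
  x = rnorm(100),
  y = rnorm(100),
  z = rnorm(100)
)

# Convert to a data.table, as in the setup step above.
DT <- as.data.table(df)
str(DT)
```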
Sum of the x, y and z columns by f
DT[, lapply(.SD, sum), by = f, .SDcols = c("x", "y", "z")]
## f x y z
## 1: 4 4.8041 3.9788 1.2519
## 2: 2 1.1255 -0.8147 2.9053
## 3: 3 0.9699 -0.1550 -8.5876
## 4: 9 2.2685 -1.2734 1.0506
## 5: 5 -0.1282 -2.5512 5.0668
## 6: 10 -2.7397 0.5290 -0.3638
## 7: 1 2.9544 -3.1139 -1.3884
## 8: 8 -4.3488 0.6894 1.4195
## 9: 7 2.3152 0.6474 2.7183
## 10: 6 -0.1569 1.0142 0.9156
Sum of the x and z columns by f
DT[, lapply(.SD, sum), by = f, .SDcols = c("x", "z")]
## f x z
## 1: 4 4.8041 1.2519
## 2: 2 1.1255 2.9053
## 3: 3 0.9699 -8.5876
## 4: 9 2.2685 1.0506
## 5: 5 -0.1282 5.0668
## 6: 10 -2.7397 -0.3638
## 7: 1 2.9544 -1.3884
## 8: 8 -4.3488 1.4195
## 9: 7 2.3152 2.7183
## 10: 6 -0.1569 0.9156
An example computing the mean
DT[, lapply(.SD, mean), by = f, .SDcols = c("x", "y", "z")]
## f x y z
## 1: 4 0.36955 0.30606 0.09630
## 2: 2 0.10232 -0.07407 0.26412
## 3: 3 0.07461 -0.01193 -0.66059
## 4: 9 0.15123 -0.08489 0.07004
## 5: 5 -0.01425 -0.28346 0.56298
## 6: 10 -0.21075 0.04069 -0.02799
## 7: 1 0.29544 -0.31139 -0.13884
## 8: 8 -0.54360 0.08617 0.17744
## 9: 7 0.38586 0.10790 0.45305
## 10: 6 -0.07844 0.50710 0.45782
DT[, lapply(.SD, mean), by = f, .SDcols = c("x", "z")]
## f x z
## 1: 4 0.36955 0.09630
## 2: 2 0.10232 0.26412
## 3: 3 0.07461 -0.66059
## 4: 9 0.15123 0.07004
## 5: 5 -0.01425 0.56298
## 6: 10 -0.21075 -0.02799
## 7: 1 0.29544 -0.13884
## 8: 8 -0.54360 0.17744
## 9: 7 0.38586 0.45305
## 10: 6 -0.07844 0.45782
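Since .SDcols takes an ordinary character vector, the column selection can also be kept in a variable and reused across calls. A small sketch, assuming a made-up table DT2 and a hypothetical variable cols (neither is part of the original answer):

```r
library(data.table)

# Hypothetical toy data, not the df from the question.
DT2 <- data.table(
  f = c(1, 1, 2),
  x = c(1, 2, 3),
  y = c(10, 20, 30),
  z = c(5, 5, 5)
)

# Columns to summarise, held in a variable so both calls stay in sync.
cols <- c("x", "z")

sums  <- DT2[, lapply(.SD, sum),  by = f, .SDcols = cols]
means <- DT2[, lapply(.SD, mean), by = f, .SDcols = cols]

print(sums)
print(means)
```

Changing cols in one place then changes which columns both summaries cover.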