我有一个包含 50 多列的数据框,我需要向其中添加几列,每一列都将包含一些原始列的平均值。
列组具有不同的长度,列名不包含任何标识它们所属组的元素。
对于如何有效地做到这一点的任何建议,我将不胜感激。
请在此简化示例中进行演示:数据框 tb 中的列组将是 (a,b)(c,d,e)(f,g),定义为:
tb<-data.frame(a=runif(10)*100,b=runif(10)*100,c=runif(10)*100,
d=runif(10)*100,e=runif(10)*100,f=runif(10)*100,g=runif(10)*100)
tb
a b c d e f g
1 26.550866 20.59746 93.470523 48.20801 82.09463 47.761962 91.287592
2 37.212390 17.65568 21.214252 59.95658 64.70602 86.120948 29.360337
3 57.285336 68.70228 65.167377 49.35413 78.29328 43.809711 45.906573
4 90.820779 38.41037 12.555510 18.62176 55.30363 24.479728 33.239467
5 20.168193 76.98414 26.722067 82.73733 52.97196 7.067905 65.087047
6 89.838968 49.76992 38.611409 66.84667 78.93562 9.946616 25.801678
7 94.467527 71.76185 1.339033 79.42399 2.33312 31.627171 47.854525
8 66.079779 99.19061 38.238796 10.79436 47.72301 51.863426 76.631067
9 62.911404 38.00352 86.969085 72.37109 73.23137 66.200508 8.424691
10 6.178627 77.74452 34.034900 41.12744 69.27316 40.683019 87.532133
操作的结果可能如下所示:
a b c d e f g mean_ab mean_cde mean_fg
1 26.551 20.597 93.471 48.208 82.095 47.762 91.288 23.574 74.591 69.525
2 37.212 17.656 21.214 59.957 64.706 86.121 29.360 27.434 48.626 57.741
3 57.285 68.702 65.167 49.354 78.293 43.810 45.907 62.994 64.272 44.858
4 90.821 38.410 12.556 18.622 55.304 24.480 33.239 64.616 28.827 28.860
5 20.168 76.984 26.722 82.737 52.972 7.068 65.087 48.576 54.144 36.077
6 89.839 49.770 38.611 66.847 78.936 9.947 25.802 69.804 61.465 17.874
7 94.468 71.762 1.339 79.424 2.333 31.627 47.855 83.115 27.699 39.741
8 66.080 99.191 38.239 10.794 47.723 51.863 76.631 82.635 32.252 64.247
9 62.911 38.004 86.969 72.371 73.231 66.201 8.425 50.457 77.524 37.313
10 6.179 77.745 34.035 41.127 69.273 40.683 87.532 41.962 48.145 64.108
提前感谢任何建议。