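I have a data frame with five variables. Two of them are metric measurements, and the other three hold groups stored as factors. I am trying to subset this data frame three times in a loop, once per grouping variable, and compute the mean of each metric measurement for every group. The results can be stored as new data frames in a new list. Right now I am using subset and ldply from the plyr package. A single subset works fine, but when I try to store the results of the loop in a vector, I get a warning saying that the number of items to replace is not a multiple of replacement length. Example code is below. Any help would be much appreciated!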
library(plyr)
df<-data.frame(a=c(1:5),b=c(21:25),group1=c("a","b","a","a","b"),group2=c("b","a","c","b","c"),group3=c("a","b","c","d","c"))
# single subset
llply(subset(df,group1=="a")[1:2],mean)
# subset for all groups
# create grouplist
grouplist<-colnames(df[3:5])
# create vector to store results
output.vector<-vector()
# create loop
for (i in grouplist)output.vector[i]<-ldply(subset(df,grouplist=="a")[1:2],mean)
output.vector
Warning messages:
1: In output.vector[i] <- ldply(subset(df, grouplist == "a")[1:2], :
number of items to replace is not a multiple of replacement length
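As far as I can tell, the warning does not depend on plyr at all. The sketch below reproduces it in base R by assigning a two-column data frame into a single slot of a plain vector with `[ ]`. The names `v`, `res`, `w`, and `out` are made up for illustration.

```r
# Minimal reproduction; v, res, w and out are hypothetical names.
v <- vector()                       # empty vector, like output.vector
res <- data.frame(x = 1, y = 2)     # a data frame is a list with 2 elements

# Single-bracket assignment tries to fit 2 elements into 1 slot,
# which triggers the warning. tryCatch captures its text in w.
w <- tryCatch({
  v["k"] <- res
  NULL
}, warning = function(cond) conditionMessage(cond))
print(w)

# A list slot assigned with [[ ]] holds the whole data frame.
out <- list()
out[["k"]] <- res
str(out)
```

So storing each result in a list with `[[ ]]` instead of a vector with `[ ]` should avoid the warning.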
The output for each item in the list should look like this:
output.vector$group1
|   | a    | b    |
| a | 2.67 | 3.5  |
| b | 22.7 | 23.5 |
output.vector$group2
|   | a  | b    | c  |
| a | 2  | 2.5  | 4  |
| b | 22 | 22.5 | 24 |
output.vector$group3
|   | a  | b  | c  | d  |
| a | 1  | 2  | 4  | 4  |
| b | 21 | 22 | 24 | 24 |
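One way to get tables of this shape is sketched below. It uses base R (`split`, `sapply`, `colMeans`) rather than the subset/ldply combination from the code above, so treat it as an alternative under that assumption, not as a fix of the original loop. The names `measure.cols` and `output.list` are hypothetical.

```r
df <- data.frame(
  a = 1:5,
  b = 21:25,
  group1 = c("a", "b", "a", "a", "b"),
  group2 = c("b", "a", "c", "b", "c"),
  group3 = c("a", "b", "c", "d", "c")
)

measure.cols <- c("a", "b")                    # the two metric columns
grouplist <- c("group1", "group2", "group3")   # the grouping columns

# One list element per grouping column. Each element is a data frame
# with the measures in rows and the group levels in columns.
output.list <- lapply(grouplist, function(g) {
  # split the measures by the levels of g, then average each piece
  means <- sapply(split(df[measure.cols], df[[g]]), colMeans)
  as.data.frame(means)
})
names(output.list) <- grouplist

output.list$group1
```

Looping over the column names and indexing with `df[[g]]` means each grouping column is actually used, whereas `grouplist=="a"` in the original loop compares the column names themselves with "a".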