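I found an interesting feature of the ddply function. It seems you can't use the same variable name in the summary output data frame as in the input data frame: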
set.seed(1)
ex <- data.frame(Type = c(rep("a", 10), rep("b", 10)), time = rnorm(20, 6,3))
ddply(ex, .(Type), summarize, time = mean(time), n = length(time))
Type time n
1 a 6.396608 1
2 b 6.746535 1
The length comes out as 1 instead of 10. However, if you change the output variable name (time) to something else:
ddply(ex, .(Type), summarize, tim = mean(time), n = length(time))
Type tim n
1 a 6.396608 10
2 b 6.746535 10
Reordering the output columns, so that the length is computed first, also helps:
ddply(ex, .(Type), summarize, n = length(time), time = mean(time))
Type n time
1 a 10 6.396608
2 b 10 6.746535
Or renaming the input variable:
set.seed(1)
ex <- data.frame(Type = c(rep("a", 10), rep("b", 10)), tim = rnorm(20, 6,3))
ddply(ex, .(Type), summarize, time = mean(tim), n = length(tim))
Type time n
1 a 6.396608 10
2 b 6.746535 10
But the problem comes back as soon as the output name matches the input name again:
ddply(ex, .(Type), summarize, tim = mean(tim), n = length(tim))
Type tim n
1 a 6.396608 1
2 b 6.746535 1
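One possible reading of the results above, as a minimal base R sketch: assuming summarize evaluates its arguments one after another and stores each result under its output name, a later expression would see the already-summarised value instead of the original column. The helper sequential_summary below is hypothetical and written only for illustration; it is not plyr's code.

```r
# Hypothetical helper (an assumption for illustration, not plyr's code):
# evaluate each named expression in order and store the result back
# under its output name, so later expressions can see earlier results.
sequential_summary <- function(data, ...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated named expressions
  env <- as.list(data)                         # columns visible to the expressions
  for (name in names(exprs)) {
    env[[name]] <- eval(exprs[[name]], env, parent.frame())
  }
  as.data.frame(env[names(exprs)])
}

set.seed(1)
a <- data.frame(time = rnorm(10, 6, 3))

# Output name equals input name: `time` is overwritten by its mean
# before length(time) is evaluated, so n is 1.
clash <- sequential_summary(a, time = mean(time), n = length(time))

# Length computed first: `time` is still the full column, so n is 10.
no_clash <- sequential_summary(a, n = length(time), time = mean(time))

print(clash)
print(no_clash)
```

If that assumption holds, it would account for all five cases: the count only goes wrong when an earlier output column reuses the name of the input column that a later expression reads.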
I am using:
R version 3.0.0 (2013-04-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
plyr_1.8
Is this a known feature of plyr after R 3.0.0, or is something else going on?