I have a data frame like this:

data=data.frame(ID=c("0001","0002","0003","0004","0004","0004","0001","0001","0002","0003"),Saldo=c(10,10,10,15,20,50,100,80,10,10),place=c("grocery","market","market","cars","market","market","cars","grocery","cars","cars"))

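I am trying to compute the total Saldo for each person in the ID variable. I tried cumsum and apply, but neither gave me the result I wanted. I would like something like this: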

  ID      Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85 

2 Answers

You can use aggregate:

> aggregate(Saldo ~ ID, data, sum) ## max(cumsum(x)) gives the same result only when all values are non-negative
    ID Saldo
1 0001   190
2 0002    20
3 0003    20
4 0004    85
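
If the result column should be called Saldo.Total as in the question, one possible sketch is to rename the aggregated column with setNames. The data frame is the one from the question; the variable name totals is only illustrative.

```r
# Data frame from the question
data <- data.frame(
  ID = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10),
  place = c("grocery", "market", "market", "cars", "market", "market", "cars", "grocery", "cars", "cars")
)

# Sum Saldo within each ID, then rename the columns of the result
totals <- aggregate(Saldo ~ ID, data, sum)
totals <- setNames(totals, c("ID", "Saldo.Total"))
print(totals)
```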

If you really are interested in the cumulative sum within each ID, try the following:

within(data, {
  Saldo.Total <- ave(Saldo, ID, FUN = cumsum)
})
#     ID Saldo   place Saldo.Total
# 1  0001    10 grocery          10
# 2  0002    10  market          10
# 3  0003    10  market          10
# 4  0004    15    cars          15
# 5  0004    20  market          35
# 6  0004    50  market          85
# 7  0001   100    cars         110
# 8  0001    80 grocery         190
# 9  0002    10    cars          20
# 10 0003    10    cars          20
answered 2013-03-14T02:31:51.413

I think you may be confused, because what you want is not a cumulative sum; it is just a sum:

library(plyr)
ddply(
  data,
  .(ID),
  summarize,
  Saldo.Total=sum(Saldo)
  )

Output:

    ID Saldo.Total
1 0001         190
2 0002          20
3 0003          20
4 0004          85
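
If you would rather not load plyr, the same per-ID sum can be sketched in base R with tapply. This is an alternative technique, not the one used above; the names totals and result are illustrative, and the data frame is the one from the question.

```r
# Data frame from the question
data <- data.frame(
  ID = c("0001", "0002", "0003", "0004", "0004", "0004", "0001", "0001", "0002", "0003"),
  Saldo = c(10, 10, 10, 15, 20, 50, 100, 80, 10, 10),
  place = c("grocery", "market", "market", "cars", "market", "market", "cars", "grocery", "cars", "cars")
)

# tapply applies sum to Saldo within each ID and returns a named array
totals <- tapply(data$Saldo, data$ID, sum)

# Turn the named array into a two-column data frame
result <- data.frame(ID = names(totals), Saldo.Total = as.vector(totals))
print(result)
```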

A cumulative sum is a "running total" as you move along a vector, for example:

> x = c(1, 2, 3, 4, 5)
> cumsum(x)
[1]  1  3  6 10 15
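
To make the relationship between the two concrete, here is a small sketch showing that the last element of the running total equals the plain sum. The names running and last_value are illustrative.

```r
x <- c(1, 2, 3, 4, 5)

# Running total at each position of x
running <- cumsum(x)

# The final running total is the overall sum
last_value <- running[length(running)]
print(last_value == sum(x))
```

This is why a grouped sum is enough for the question: the total per ID is simply where the cumulative sum for that ID ends up.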
answered 2013-03-14T02:31:51.030