I want to summarize this dataset by grouping first by Period and then by Payer ID, so that the result shows a monthly subtotal for any given payer, like so:

Data frame:

Payer   Amount  Period
1   10  1-1015
2   15  2-1015
3   14  3-1015
1   1   1-1015
3   5   1-1015
1   7   4-1015
3   8   4-1015
1   4   5-1015

The result should look like this:

Payer   Amount  Period
1   11  1-1015
3   5   1-1015
2   15  2-1015
3   14  3-1015
1   7   4-1015
3   8   4-1015
1   4   5-1015

What is the best way to do this? Thanks!

2 Answers

You can use aggregate, assuming the data frame has three columns (Payer, Amount, Period).

 aggregate(Amount~., df1, FUN=sum)
 #    Payer Period Amount
 #1     1 1-1015     11
 #2     3 1-1015      5
 #3     2 2-1015     15
 #4     3 3-1015     14
 #5     1 4-1015      7
 #6     3 4-1015      8
 #7     1 5-1015      4
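
In the formula above, the `.` on the right-hand side stands for every column other than Amount. As a minimal sketch, assuming the same three columns, the grouping columns can also be named explicitly, which keeps the call stable if more columns are added later. The data frame is rebuilt here only so the snippet runs on its own; the name `res` is introduced for illustration.

```r
# Same values as df1 in this answer, rebuilt so the snippet is self-contained
df1 <- data.frame(
  Payer  = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L),
  Amount = c(10L, 15L, 14L, 1L, 5L, 7L, 8L, 4L),
  Period = c("1-1015", "2-1015", "3-1015", "1-1015",
             "1-1015", "4-1015", "4-1015", "5-1015"),
  stringsAsFactors = FALSE
)

# Name the grouping columns explicitly instead of relying on `.`
res <- aggregate(Amount ~ Payer + Period, data = df1, FUN = sum)
res
```

This should give the same seven rows as the call with `.`, one per Payer/Period combination.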

Or, with data.table:

 library(data.table)#v1.9.5+
 setDT(df1)[, list(Amount=sum(Amount)), .(Period, Payer)]
 #    Period Payer Amount
 #1: 1-1015     1     11
 #2: 2-1015     2     15
 #3: 3-1015     3     14
 #4: 1-1015     3      5
 #5: 4-1015     1      7
 #6: 4-1015     3      8
 #7: 5-1015     1      4

With the rows in a different order (df2 is a shuffled copy of df1):

 aggregate(Amount~., df2, FUN=sum)
 #  Payer Period Amount
 #1     1 1-1015     11
 #2     3 1-1015      5
 #3     2 2-1015     15
 #4     3 3-1015     14
 #5     1 4-1015      7
 #6     3 4-1015      8
 #7     1 5-1015      4
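
The two aggregate outputs match because the result is ordered by the grouping values rather than by the input row order. A quick way to check this, sketched under the assumption that any reordering of the rows will do (the rows are simply reversed here instead of using sample, and `res1` and `res2` are names introduced for illustration):

```r
# Same values as df1 in this answer, rebuilt so the snippet is self-contained
df1 <- data.frame(
  Payer  = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L),
  Amount = c(10L, 15L, 14L, 1L, 5L, 7L, 8L, 4L),
  Period = c("1-1015", "2-1015", "3-1015", "1-1015",
             "1-1015", "4-1015", "4-1015", "5-1015"),
  stringsAsFactors = FALSE
)

# Reverse the rows (an arbitrary reordering chosen for this sketch)
df2 <- df1[rev(seq_len(nrow(df1))), ]

res1 <- aggregate(Amount ~ ., df1, FUN = sum)
res2 <- aggregate(Amount ~ ., df2, FUN = sum)

# Compare the two summaries
isTRUE(all.equal(res1, res2))
```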

Data

 df1 <- structure(list(Payer = c(1L, 2L, 3L, 1L, 3L, 1L, 3L, 1L), 
 Amount = c(10L, 
 15L, 14L, 1L, 5L, 7L, 8L, 4L), Period = c("1-1015", "2-1015", 
 "3-1015", "1-1015", "1-1015", "4-1015", "4-1015", "5-1015")),
 .Names = c("Payer", 
  "Amount", "Period"), class = "data.frame", row.names = c(NA, -8L))

 set.seed(24)
 df2 <- df1[sample(nrow(df1)),]

answered 2015-07-12T19:03:38.580

library(dplyr)
# df is the data frame from the question
df %>%
    group_by(Period, Payer) %>%
    summarize(Amount = sum(Amount)) %>%  # summarize() drops the last grouping level, i.e. Payer
    ungroup()                            # removes whatever grouping is left (Period)

# if the rows are not in the order you want, add an explicit %>% arrange(Period, Payer)
answered 2015-07-12T19:05:27.453