I have 3 columns. C1 and C2 are grouped by C0. For each C0 group, I want to extract the value of C2 on the row where C1 is at its maximum.

df = data.frame(C0 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
            C1 = c(0,2,3,6,2,0,0,4,9,7,1,2,7,4,2),
            C2 = c("A","B", "C", "D", "E","A","B", "C", "D", "E","A","B", "C", "D", "E"))

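I want to add a new column C4 that holds the C2 value from the row where C1 reaches its maximum within each C0 group. So far I can only extract the maximum C1 value itself, like this: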

df %>% group_by(C0) %>% mutate (C4 = max(C1))

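But with this code, C4 is the maximum value of C1 in each C0 group. I don't know how to extract the corresponding C2 value. Also, I don't want to keep only the rows with the maximum; I want to add a new column to every row. Since I'm not allowed to attach an image, here is code that shows the result I'm after: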

df = data.frame(C0 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
            C1 = c(0,2,3,6,2,0,0,4,9,7,1,2,7,4,2),
            C2 = c("A","B", "C", "D", "E","A","B", "C", "D", "E","A","B", "C", "D", "E"),
            C4 = c("D","D","D","D","D","D","D","D","D","D","C","C","C","C","C"))

Thank you very much for your help!

1 Answer

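After grouping by 'C0', we can use which.max to get the row index of the largest 'C1' within each group, and use that index to subset the values of 'C2':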

library(dplyr)
df %>%
    group_by(C0) %>%
    mutate(C4 = C2[which.max(C1)])
# A tibble: 15 x 4
# Groups:   C0 [3]
#      C0    C1 C2    C4   
#   <dbl> <dbl> <fct> <fct>
# 1     1     0 A     D    
# 2     1     2 B     D    
# 3     1     3 C     D    
# 4     1     6 D     D    
# 5     1     2 E     D    
# 6     2     0 A     D    
# 7     2     0 B     D    
# 8     2     4 C     D    
# 9     2     9 D     D    
#10     2     7 E     D    
#11     3     1 A     C    
#12     3     2 B     C    
#13     3     7 C     C    
#14     3     4 D     C    
#15     3     2 E     C    
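
To illustrate why the indexing works, here is a minimal sketch that first applies which.max to a single group and then reproduces the same C4 column in base R with ave. The base R variant is an alternative offered for comparison, not the method used above; the names g1, idx and max_row are made up for this example, and stringsAsFactors = FALSE is set only so that C2 is a plain character vector.

```r
# Data from the question
df <- data.frame(
  C0 = c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3),
  C1 = c(0,2,3,6,2,0,0,4,9,7,1,2,7,4,2),
  C2 = c("A","B","C","D","E","A","B","C","D","E","A","B","C","D","E"),
  stringsAsFactors = FALSE
)

# which.max() returns the position of the first maximum in a vector.
# For the group C0 == 1, C1 is 0, 2, 3, 6, 2, so the position is 4.
g1 <- df[df$C0 == 1, ]
idx <- which.max(g1$C1)
g1$C2[idx]  # the C2 value on that row: "D"

# Base R alternative to the grouped mutate:
# for each C0 group, find the row number of df holding the largest C1 ...
max_row <- ave(seq_len(nrow(df)), df$C0,
               FUN = function(i) i[which.max(df$C1[i])])
# ... then look up C2 at that row for every row of the group
df$C4 <- df$C2[max_row]
```

Because which.max returns only the first maximum, a group with tied maximum C1 values gets the C2 value from the first tied row in both versions.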
answered 2020-04-27T18:39:41.550