Old question (the new question is further down)

I have a data.frame:

df <- data.frame("name"  = c("A","A","B","C"),
                 "class" = c("ab","ab","cd","ef"),
                 "type"  = c("alpha","beta","gamma","delta"))

> df
  name class  type
1    A    ab alpha
2    A    ab  beta
3    B    cd gamma
4    C    ef delta

So name A has two types, alpha and beta, and therefore appears in two rows.

I would like my data frame to look like this (the type column may end up holding a long comma-separated string):

> df
  name class  type
1    A    ab alpha, beta
2    B    cd gamma
3    C    ef delta

What did not work is dcast(df, name~type).

Any suggestions?

New question

I want name to be the deciding selector. So A has class ab with type alpha, and class cd with types alpha and beta.

df<-data.frame("name"  = c("A","A","A","B","C"), 
               "class" = c("ab","cd","cd","cd","ef"),
               "type"  = c("alpha","alpha","beta","gamma","delta"))

> df
  name class  type
1    A    ab alpha
2    A    cd alpha
3    A    cd  beta
4    B    cd gamma
5    C    ef delta

Grouping by name and calling dplyr::summarise(var = paste(type, collapse = ", ")) (see the solution below) returns

> df
  name var
1    A alpha, alpha, beta
2    B gamma
3    C delta

This puts alpha into the first row twice. I am looking for a way to remove that duplicate. Goal:

> df
  name var
1    A alpha, beta
2    B gamma
3    C delta

Edit:

Solved by Gregor, see the comments.

1 Answer

Try this. We group by name and class, then collapse the type values with a comma:

library(dplyr)

df %>%
  group_by(name, class) %>%
  summarise(type = paste(type, collapse = ","))
#> # A tibble: 3 x 3
#> # Groups:   name [?]
#>   name  class type      
#>   <fct> <fct> <chr>     
#> 1 A     ab    alpha,beta
#> 2 B     cd    gamma     
#> 3 C     ef    delta

Created on 2018-09-25 by the reprex package (v0.2.0).
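
For the new question (one row per name, without repeated types), a minimal sketch follows. Gregor's comment is not reproduced here, so this is an assumption about one way to do it, not necessarily what was suggested there: group by name only and wrap type in unique() before pasting. The result name is chosen purely for illustration.

```r
library(dplyr)

# Data from the new question
df <- data.frame(
  name  = c("A", "A", "A", "B", "C"),
  class = c("ab", "cd", "cd", "cd", "ef"),
  type  = c("alpha", "alpha", "beta", "gamma", "delta")
)

# Group by name only, drop repeated types within each group,
# then collapse what is left into one comma-separated string
result <- df %>%
  group_by(name) %>%
  summarise(var = paste(unique(type), collapse = ", "))

result
```

unique() keeps the first occurrence of each value, so the types stay in the order in which they appear in the data. The class column is dropped here because it is neither a grouping variable nor summarised.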

Answered 2018-09-25T13:11:26.877