
I have a column of strings in a data frame. Suppose it looks like this:

x <- unique(df[,1])
x
"A" "A" "B" "B" "B" "C"

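I want to get all possible combinations of the unique strings as sets of 2, without regard to order, so that A, B is the same as B, A. I also do not want pairs of identical values such as A, A. So far I have this: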

comb <- expand.grid(x, x)
comb <- comb[which(comb[,1] != comb[,2]),]

But this still leaves rows that contain the same combination of strings in a different order. How do I get rid of these?


2 Answers


There is a combn function in the utils package:

t(combn(LETTERS[1:3],2))
#      [,1] [,2]
# [1,] "A"  "B" 
# [2,] "A"  "C" 
# [3,] "B"  "C"

I am a little confused as to why your x has duplicated values.
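
To tie this back to the data frame in the question, here is a minimal sketch that applies combn to the unique values of a column and stores one pair per row. The data frame contents, the column name `id`, and the output names `first` and `second` are assumptions made for illustration only.

```r
# Hypothetical data frame standing in for the asker's df; the column name is assumed.
df <- data.frame(id = c("A", "A", "B", "B", "B", "C"), stringsAsFactors = FALSE)

# Drop duplicates first so that each string takes part in the pairing only once.
u <- unique(df[, 1])

# combn() returns one combination per column; t() turns that into one pair per row.
pairs <- t(combn(u, 2))

# Optional: keep the pairs in a data frame with (assumed) descriptive column names.
pairs_df <- data.frame(first = pairs[, 1], second = pairs[, 2],
                       stringsAsFactors = FALSE)
pairs_df
```

Because combn never pairs an element with itself and never emits the same pair in both orders, no further filtering should be needed.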

answered 2012-09-03T09:35:30.553

I think you are looking for combn:

x <- c("A", "A", "B", "B", "B", "C")
combn(x,2)

which gives:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] "A"  "A"  "A"  "A"  "A"  "A"  "A"  "A"  "A"  "B"   "B"   "B"   "B"   "B"   "B"  
[2,] "A"  "B"  "B"  "B"  "C"  "B"  "B"  "B"  "C"  "B"   "B"   "C"   "B"   "C"   "C"  

If you only want the unique values (I don't know why x has duplicated values in the first place if it is the result of a call to unique()):

> combn(unique(x),2)
     [,1] [,2] [,3]
[1,] "A"  "A"  "B" 
[2,] "B"  "C"  "C" 
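
If you would rather keep the expand.grid approach from the question, one possible fix is to keep only the rows whose first value sorts strictly before the second. This is a sketch rather than the method of either answer. The column names `first` and `second` are assumptions, and the `<` comparison on strings follows the locale's collation order, which is unambiguous for plain letters like these.

```r
x <- c("A", "A", "B", "B", "B", "C")
u <- unique(x)

# stringsAsFactors = FALSE keeps the values as character vectors,
# so they can be compared with `<` (unordered factors cannot).
comb <- expand.grid(first = u, second = u, stringsAsFactors = FALSE)

# A strict inequality removes both the A, A style rows and the mirrored duplicates.
comb <- comb[comb$first < comb$second, ]
comb
```

This builds all n^2 rows before filtering, so combn is likely the better choice when the number of unique values is large.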
answered 2012-09-03T09:34:56.917