r - A variation of the combn function

Question

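I have been using the wonderful combn function from the utils package. It creates all possible combinations without repetition (see the combn description). I want to describe a use case I need that combn does not cover, using a small example. The real goal is more complex and involves much more data.

We want to play a game that only 3 people can play, but there are 4 of us. We want to know every possible group of players. The participants are Alex, David, John and Zoe. The possible combinations are: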

names <- c("Alex","David","John","Zoe")
people.per.group <- 3
combn(names,people.per.group)

#Output
     [,1]    [,2]    [,3]   [,4]   
[1,] "Alex"  "Alex"  "Alex" "David"
[2,] "David" "David" "John" "John" 
[3,] "John"  "Zoe"   "Zoe"  "Zoe"

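However, there is a problem: Alex and John do not get along, so we do not want both of them in the same group. To express this, we can assign people to groups from which only one member may be chosen.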

# Alex   --> Group 1
# David  --> Group 2
# John   --> Group 1
# Zoe    --> Group 3
only.one.per.group <- c(1,2,1,3)

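I am looking for a function that does the same as combn but restricts the combinations using the variable only.one.per.group. If the function I am hoping for were called var.combn, the code would be: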

names <- c("Alex","David","John","Zoe")
only.one.per.group <- c(1,2,1,3)
people.per.group <- 3
var.combn(names,people.per.group,only.one.per.group)

#Output
       [,1]     [,2]      
[1,]  "Alex"   "David"
[2,]  "David"  "John" 
[3,]  "Zoe"    "Zoe"

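I have been looking for a function like this for a long time. If you can point me to one, or suggest any other way to do this, it would be very useful.

I hope you like the example, and I would love to know how to do it.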

score 0 · Accepted Answer

Here is a different way of doing it

Find all possible group combinations:

names <- c("Alex","David","John","Zoe")
x <- expand.grid(names, names, names)

All possible groups:

    Var1  Var2  Var3
1   Alex  Alex  Alex
2  David  Alex  Alex
3   John  Alex  Alex
4    Zoe  Alex  Alex
5   Alex David  Alex
6  David David  Alex
7   John David  Alex
8    Zoe David  Alex
9   Alex  John  Alex
10 David  John  Alex
11  John  John  Alex
12   Zoe  John  Alex
13  Alex   Zoe  Alex
14 David   Zoe  Alex
15  John   Zoe  Alex
16   Zoe   Zoe  Alex
17  Alex  Alex David
18 David  Alex David
19  John  Alex David
20   Zoe  Alex David
21  Alex David David
22 David David David
23  John David David
24   Zoe David David
25  Alex  John David
26 David  John David
27  John  John David
28   Zoe  John David
29  Alex   Zoe David
30 David   Zoe David
31  John   Zoe David
32   Zoe   Zoe David
33  Alex  Alex  John
34 David  Alex  John
35  John  Alex  John
36   Zoe  Alex  John
37  Alex David  John
38 David David  John
39  John David  John
40   Zoe David  John
41  Alex  John  John
42 David  John  John
43  John  John  John
44   Zoe  John  John
45  Alex   Zoe  John
46 David   Zoe  John
47  John   Zoe  John
48   Zoe   Zoe  John
49  Alex  Alex   Zoe
50 David  Alex   Zoe
51  John  Alex   Zoe
52   Zoe  Alex   Zoe
53  Alex David   Zoe
54 David David   Zoe
55  John David   Zoe
56   Zoe David   Zoe
57  Alex  John   Zoe
58 David  John   Zoe
59  John  John   Zoe
60   Zoe  John   Zoe
61  Alex   Zoe   Zoe
62 David   Zoe   Zoe
63  John   Zoe   Zoe
64   Zoe   Zoe   Zoe

Find groups that satisfy conditions:

# Keep only rows whose three members are all distinct
x <- x[x[,1] != x[,2], ]
x <- x[x[,1] != x[,3], ]
x <- x[x[,2] != x[,3], ]
# Drop rows that contain both Alex and John, in any pair of positions.
# Logical negation is used rather than -which(), which would drop
# every row if a condition matched nothing.
x <- x[!(x[,1] == "Alex" & x[,2] == "John"), ]
x <- x[!(x[,1] == "Alex" & x[,3] == "John"), ]
x <- x[!(x[,2] == "Alex" & x[,3] == "John"), ]
x <- x[!(x[,2] == "Alex" & x[,1] == "John"), ]
x <- x[!(x[,3] == "Alex" & x[,1] == "John"), ]
x <- x[!(x[,3] == "Alex" & x[,2] == "John"), ]

Result:

    Var1  Var2  Var3
8    Zoe David  Alex
14 David   Zoe  Alex
20   Zoe  Alex David
28   Zoe  John David
29  Alex   Zoe David
31  John   Zoe David
40   Zoe David  John
46 David   Zoe  John
50 David  Alex   Zoe
53  Alex David   Zoe
55  John David   Zoe
58 David  John   Zoe
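
Note that the 12 rows above are orderings of just two distinct groups: {Alex, David, Zoe} and {David, John, Zoe}. If order does not matter, one possible alternative is to start from combn() and discard every column that uses the same group label twice. The following is only a sketch, assuming the only.one.per.group vector from the question; the names idx, keep and result are illustrative.

```r
names <- c("Alex", "David", "John", "Zoe")
only.one.per.group <- c(1, 2, 1, 3)

# All unordered selections of 3 positions out of 4, one per column
idx <- combn(seq_along(names), 3)

# A column is valid when no group label appears more than once in it
keep <- apply(idx, 2, function(col) !anyDuplicated(only.one.per.group[col]))

# Map the surviving positions back to names, one group per column
result <- matrix(names[idx[, keep, drop = FALSE]], nrow = 3)
result
```

This enumerates every combination before filtering, so it is probably best suited to small inputs like the one in the question.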

score 0 · Accepted Answer

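Here is a fairly compact solution, based on the assumption that exactly one person should be picked from each group (an assumption that may not always hold; the OP is a little unclear on this). Because one person is picked from each group, we can split the names by group and pass them to expand.grid(), which gives the solution right away.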

DF <- data.frame(names=c("Alex","David","John","Zoe"),
                 groups=c(1, 2, 1, 3),
                 stringsAsFactors =FALSE)
expand.grid(lapply(unique(DF$groups), function(i) {DF$names[which(DF$groups==i)]}))

This yields

  Var1  Var2 Var3
1 Alex David  Zoe
2 John David  Zoe

which are the two combinations you are after. A slightly more compact solution (still in base R) would be

expand.grid(by(DF, DF$groups, function(x) x$names))

which may be easier to read.

It also works for more complex groupings:

DF <- data.frame(names=c("Alex","David","John","Zoe", "Bob", "Charles"),
                 groups=c(1, 2, 1, 3, 2, 3),
                 stringsAsFactors =FALSE)

expand.grid(by(DF, DF$groups, function(x) x$names))

which yields

  Var1  Var2    Var3
1 Alex David     Zoe
2 John David     Zoe
3 Alex   Bob     Zoe
4 John   Bob     Zoe
5 Alex David Charles
6 John David Charles
7 Alex   Bob Charles
8 John   Bob Charles

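Now, if you want to pick from fewer groups than there are (so that some groups contribute nobody), the code above should be wrapped and applied to each subset of groups produced by combn().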
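
One way that wrapping might look is sketched below. The function name var.combn is taken from the question, but the implementation is only an assumption about the intended behaviour: choose m of the groups with combn(), then take one member from each chosen group with expand.grid(). The internal names (idx.by.group, group.sets, cols) are illustrative.

```r
var.combn <- function(x, m, groups) {
  # Positions of x, split by group (at most one pick per group)
  idx.by.group <- unname(split(seq_along(x), groups))
  stopifnot(m >= 1, m <= length(idx.by.group))

  # Every way of choosing m of the groups
  group.sets <- combn(length(idx.by.group), m, simplify = FALSE)

  cols <- lapply(group.sets, function(g) {
    # One member from each chosen group; each row is one selection
    picks <- as.matrix(expand.grid(idx.by.group[g]))
    # Sort positions within each selection so members keep the order of x
    matrix(apply(picks, 1, sort), nrow = m)
  })

  # One column per valid combination, as combn() returns
  matrix(x[do.call(cbind, cols)], nrow = m)
}

names <- c("Alex", "David", "John", "Zoe")
only.one.per.group <- c(1, 2, 1, 3)
res <- var.combn(names, 3, only.one.per.group)
res
```

For the question's data this returns the two columns the OP expects. In general the columns are grouped by the chosen set of groups, so their order may differ from what combn() followed by filtering would give.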
