12

我希望能够获取一系列数字并返回一个包含三元组且不重复的列表。x 的每个元素应该在三元组的每个位置出现一次。目标是获得类似以下内容:

get_combinations_without_duplicates(3) = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]

对于 range(3) 这只是一个列表轮换,但对于更高的范围,有更多可能的组合。我希望能够随机生成满足这些约束的三元组列表。

假设我们首先为 n=4 的情况指定每个三元组的第一个元素:

[(0,), (1,), (2,), (3,)]

第一个三元组的第二个元素可以是 0 以外的任何值。一旦选择其中一个,就会限制下一个三元组的选项,依此类推。目标是有一个函数,它接受一个数字并以这种方式创建三元组,但并不总是创建相同的三元组。也就是说,最终结果可能是轮换:

[(0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1),]

或者

[(0, 2, 3), (1, 3, 0), (2, 0, 1), (3, 1, 2)]

下面是这个函数的一个实现:

def get_combinations_without_duplicates(n):
    output = []
    second = range(n)
     third = range(n)
for i in range(n):
    triple = [i]
    #Get the second value of the triple, but make sure that it isn't a 
    #duplicate of the first value
    #in the triple or any value that has appeared in the second position of any triple
    choices_for_second = [number for number in second if number not in triple]
    #Randomly select a number from the allowed possibilities
    n_second = random.choice(choices_for_second) 
    #Append it to the triple
    triple.append(n_second)
    #Remove that value from second so that it won't be chosen for other triples
    second = [number for number in second if number != n_second]
    #Do the same for the third value
    choices_for_third = [number for number in third if number not in triple]
    n_third = random.choice(choices_for_third)
    triple.append(n_third)
    third = [number for number in third if number != n_third]
    output.append(tuple(triple))
return output

如下所述,此过程有时会随机选择不起作用的组合。如果您执行以下操作,则可以处理:

def keep_trying(n):
    try:
        return get_combinations_without_duplicates(n)
    except IndexError:
        return keep_trying(n)

但是,我想知道是否有更好的方法来做到这一点。

4

6 回答 6

5

Let's try this again.

A few observations.

  1. The first value will always be zero in a sorted array of your tuples.
  2. The length of the array will always be as long as the number of tuples that exist in your array.
  3. You want these to be randomly generated.
  4. The tuples are produced in 'sorted' order.

Based on these specifications, we can come up with a procedural method;

  1. Generate 2 lists of serial integers, one to pick from, the other to seed from.
  2. For each number in the seed list, [0, 1, 2, 3], randomly append and remove a number that's not already in the element. [01, 13, 20, 32]
  3. Generate another list of serial integers, and repeat. [012, 130, 203, 321]

But, this doesn't work. For some iterations, it will back itself into a corner and not be able to generate a number. For instance, [01, 13, 20, 32].. appending [3, 0, 1... crap, I'm stuck.

The only way to fix this, is to do a true shuffling over the entire row, and reshuffle until one fits. This may take quite a bit of time, and will only get more painful as the sets get longer.

So, procedurally speaking:

Solution 1: Random generation

  1. Populate a list with your range. [0, 1, 2, 3]
  2. Create another list. [0, 1, 2, 3]
  3. Shuffle the list. [1, 0, 2, 3]
  4. Shuffle until you find one that fits... [1, 2, 3, 0]
  5. Repeat with the third element.

With this procedure, while a computer can verify solutions very quickly, it cannot generate solutions very quickly. However, it is merely one of two ways to generate a truly random answer.

Therefore, the fastest guaranteed method would use make use of a verification procedure, rather than a generating procedure. First things first, generate all the possible permutations.

from itertools import permutations

n = 4
candidates = [i for i in permutations(xrange(n),3)]

Then.

Solution 2: Random verification

  1. Pick a triplet that starts with 0.
  2. Pop, at random, a triplet that does not start with 0.
  3. Verify if the randomly picked triplet is an intermediate solution.
  4. If not, pop another triplet.
  5. If yes, append the triplet, and REPOPULATE THE TRIPLET QUEUE.
  6. Repeat n times. # or until you exhaust the queue, at which point repeat n times naturally becomes TRUE

A solution for the next triplet is mathematically guaranteed to be in the solution set, so if you just let it exhaust itself, a randomized solution should appear. The problem with this approach is that there's no guarantee that every possible outcome has an equal probability.

Solution 3: Iterative verification

For equal probability results, get rid of the randomization, and generate every possible 3-tuple combination, n-lists long-- and verify each of those solution candidates.

Write a function to verify over the list of candidate solutions to produce every solution, and then randomly pop a solution from that list.

from itertools import combinations

results = [verify(i) for i in combinations(candidates, n)]
# this is 10626 calls to verify for n=4, 5 million for n=5 
# this is not an acceptable solution.  

Neither Solution 1 or 3 is very fast, O(n**2), but given your criteria, it's possible this is as fast as it'll get if you want a truly random solution. Solution 2 will guaranteed be the fastest of these three, often times significantly beating 1 or 3, Solution 3 has the most stable results. Which of these approaches you choose will depend on what you want to do with the output.

Afterward:

Ultimately, the speed of the code will be contingent on exactly how random you want your code to be. An algorithm to spit out the VERY first (and only the very first) instance of a tuple series that satisfies your requirement can run supremely quickly, as it just attacks the permutations in order, once, and it will run in O(n) time. However, it will not do anything randomly...

Also, here's some quick code for verify(i). It's based on the observation that two tuples may not have the same number in the same index.

def verify(t):
    """ Verifies that a set of tuples satisfies the combinations without duplicates condition. """
    zipt = zip(*t)
    return all([len(i) == len(set(i)) for i in zipt])

n = 4 Full Solution Set

((0, 1, 2), (1, 0, 3), (2, 3, 0), (3, 2, 1))
((0, 1, 2), (1, 0, 3), (2, 3, 1), (3, 2, 0))
((0, 1, 2), (1, 2, 3), (2, 3, 0), (3, 0, 1))
((0, 1, 2), (1, 3, 0), (2, 0, 3), (3, 2, 1))
((0, 1, 3), (1, 0, 2), (2, 3, 0), (3, 2, 1))
((0, 1, 3), (1, 0, 2), (2, 3, 1), (3, 2, 0))
((0, 1, 3), (1, 2, 0), (2, 3, 1), (3, 0, 2))
((0, 1, 3), (1, 3, 2), (2, 0, 1), (3, 2, 0))
((0, 2, 1), (1, 0, 3), (2, 3, 0), (3, 1, 2))
((0, 2, 1), (1, 3, 0), (2, 0, 3), (3, 1, 2))
((0, 2, 1), (1, 3, 0), (2, 1, 3), (3, 0, 2))
((0, 2, 1), (1, 3, 2), (2, 0, 3), (3, 1, 0))
((0, 2, 3), (1, 0, 2), (2, 3, 1), (3, 1, 0))
((0, 2, 3), (1, 3, 0), (2, 0, 1), (3, 1, 2))
((0, 2, 3), (1, 3, 2), (2, 0, 1), (3, 1, 0))
((0, 2, 3), (1, 3, 2), (2, 1, 0), (3, 0, 1))
((0, 3, 1), (1, 0, 2), (2, 1, 3), (3, 2, 0))
((0, 3, 1), (1, 2, 0), (2, 0, 3), (3, 1, 2))
((0, 3, 1), (1, 2, 0), (2, 1, 3), (3, 0, 2))
((0, 3, 1), (1, 2, 3), (2, 1, 0), (3, 0, 2))
((0, 3, 2), (1, 0, 3), (2, 1, 0), (3, 2, 1))
((0, 3, 2), (1, 2, 0), (2, 1, 3), (3, 0, 1))
((0, 3, 2), (1, 2, 3), (2, 0, 1), (3, 1, 0))
((0, 3, 2), (1, 2, 3), (2, 1, 0), (3, 0, 1))

n = 5 has 552 unique solutions. Here are the first 20.

((0, 1, 2), (1, 0, 3), (2, 3, 4), (3, 4, 0), (4, 2, 1))
((0, 1, 2), (1, 0, 3), (2, 3, 4), (3, 4, 1), (4, 2, 0))
((0, 1, 2), (1, 0, 3), (2, 4, 0), (3, 2, 4), (4, 3, 1))
((0, 1, 2), (1, 0, 3), (2, 4, 1), (3, 2, 4), (4, 3, 0))
((0, 1, 2), (1, 0, 4), (2, 3, 0), (3, 4, 1), (4, 2, 3))
((0, 1, 2), (1, 0, 4), (2, 3, 1), (3, 4, 0), (4, 2, 3))
((0, 1, 2), (1, 0, 4), (2, 4, 3), (3, 2, 0), (4, 3, 1))
((0, 1, 2), (1, 0, 4), (2, 4, 3), (3, 2, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 0), (2, 3, 4), (3, 4, 1), (4, 0, 3))
((0, 1, 2), (1, 2, 0), (2, 4, 3), (3, 0, 4), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 0, 4), (3, 4, 0), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 0, 4), (3, 4, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 0), (4, 0, 1))
((0, 1, 2), (1, 2, 3), (2, 4, 0), (3, 0, 4), (4, 3, 1))
((0, 1, 2), (1, 2, 3), (2, 4, 1), (3, 0, 4), (4, 3, 0))
((0, 1, 2), (1, 2, 4), (2, 0, 3), (3, 4, 0), (4, 3, 1))
((0, 1, 2), (1, 2, 4), (2, 0, 3), (3, 4, 1), (4, 3, 0))
((0, 1, 2), (1, 2, 4), (2, 3, 0), (3, 4, 1), (4, 0, 3))
((0, 1, 2), (1, 2, 4), (2, 3, 1), (3, 4, 0), (4, 0, 3))
((0, 1, 2), (1, 2, 4), (2, 4, 3), (3, 0, 1), (4, 3, 0))

So, you can generate solutions like this, but it takes time. If you were going to utilize this, I would cache the solutions generated as is, and then randomly pull from them when you need them for whatever number n. Incidentally, n=5 took a little under a minute to complete, brute force. Since the solution is O(n**2), I expect n=6 to take over an hour, n=7 over a day. The only way you can get return a true randomized solution is by doing it this way.

Edited: Random Solution without equal distribution:

The following is code I wrote in attempting to solve this problem, an implementation of Solution 2. I figured I would post it, since it is a partial, non-equal distribution solution, and generates every possible solution, guaranteed, given enough time.

def seeder(n):
    """ Randomly generates the first element in a solution. """
    seed = [0]
    a = range(1, n)
    for i in range(1, 3):
        seed.append(a.pop(random.randint(0,len(a)-1)))
    return [seed]

def extend_seed(seed, n):
    """ Semi-Randomly generates the next element in a solution. """
    next_seed = [seed[-1][0] + 1]
    candidates = range(0, n)
    for i in range(1, 3):
        c = candidates[:]
        for s in next_seed:
            c.remove(s)
        for s in seed:
            try:
                c.remove(s[i])
            except ValueError:
                pass
        next_seed.append(c.pop(random.randint(0,len(c)-1)))
    seed.append(next_seed)
    return seed

def combo(n):
    """ Extends seed until exhausted. 
    Some random generations return results shorter than n. """
    seed = seeder(n)
    while True:
        try:
            extend_seed(seed, n)
        except (ValueError, IndexError):
            return seed

def combos(n):
    """ Ensures that the final return is of length n. """
    while True:
        result = combo(n)
        if len(result) == n:
            return result
于 2012-10-23T17:53:58.187 回答
2

您本质上想要一个拉丁方格,anxn 数字网格,其中每行和每列仅包含每个数字一次,除非您只关心每行中的前三个数字(拉丁矩形)。

更新:

我已经删除了无效的代码示例,因为生成具有相等概率的随机拉丁方格并非易事,正如math.stackexchange.com 上的问题中所讨论的那样。

SAGE 项目实现了该问题中提到的算法,因此您可以查看代码以获得灵感。

或者,如果您真的想了解详细信息,请查看本文以了解生成随机拉丁矩形的具体情况。

于 2012-10-23T23:24:41.310 回答
1

实际上 itertools 已经为你解决了这个问题。

import itertools

allp = [x for x in itertools.permutations(range(3))]
print allp

mylist = ['A','B','C']
allp2 = [x for x in itertools.permutations(mylist)]
print allp2

输出

[(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]
[('A', 'B', 'C'), ('A', 'C', 'B'), ('B', 'A', 'C'), ('B', 'C', 'A'), ('C', 'A', 'B'), ('C', 'B', 'A')]
于 2012-10-23T18:11:57.737 回答
0

只是对你的问题有不同的看法。看看这是否适合你

>>> from itertools import chain,combinations
>>> def get_combinations_without_duplicates(iterable):
        return (tuple(chain(*(set(iterable) - set(e) , e))) for e in combinations(iterable,2))

>>> list(get_combinations_without_duplicates(range(3)))
[(2, 0, 1), (1, 0, 2), (0, 1, 2)]
于 2012-10-23T17:57:29.187 回答
0

一个简单的列表轮换为所有 n >= 3 提供了一个正确的解决方案:

考虑 n = 5 的旋转解:

[
    (0, 1, 2),
    (1, 2, 3),
    (2, 3, 4),
    (3, 4, 0),
    (4, 0, 1)
]

每个数字仅在每个位置出现一次,并且对于每个位置,所有数字都存在。


一般来说,len(get_combinations_without_duplicates(n)) == n对于 n >= 3

于 2012-10-23T19:04:28.993 回答
0

这是一种利用 deque.rotate 的方法

>>> datax = []
>>> from collections import deque
>>> data_length = 10
>>> subset_length = 3
>>> for i in xrange(0, subset_length):
...     datax.append(deque(xrange(data_length)))
...
>>> for i in xrange(0, subset_length):
...     datax[i].rotate(i)
...
>>> print zip(*datax)
[(0, 9, 8), (1, 0, 9), (2, 1, 0), (3, 2, 1), (4, 3, 2), (5, 4, 3), (6, 5, 4), (7, 6, 5), (8, 7, 6), (9, 8, 7)]
于 2012-10-23T22:46:24.303 回答