python - 为特定的精确覆盖问题定义约束

Question

我有一个特定的问题，我想使用Knuth's Algorithm X来解决。但是，我很难将我的问题转化为合适的约束，这些约束构成了算法 X 运行的关联矩阵。

对于我的体育俱乐部夏季锦标赛，我想制定一个时间表，将四名球员组合在一起，而无需在随后的比赛中重新组合任何一对球员。

我认为这可以很好地转化为这样的精确覆盖问题：

所有玩家都表示为矩阵的一列
每组四名玩家（不管他们的顺序）是矩阵中的一行。
因此，当玩家是该组的成员时，将 1 写入矩阵单元格。

在为 20 个玩家设置后，我得到了一个包含 20 列和 4845 行（每个玩家/列 969 行）的关联矩阵。

算法 X 会很好地找到一个解决方案，但这只会涵盖一个（第一）轮。让算法继续下去会为同一轮吐出更多替代解决方案，这对我来说并不感兴趣。所以我围绕算法构建了一个迭代器，它将获取解决方案并根据玩家重叠从关联矩阵中删除行：每当解决方案中的一个组与矩阵的任何行有至少 2 的交集时，该行就会被删除. 在算法第一次运行后，矩阵被剔除到 1280 行。运行算法 X 将找到下一个解决方案，等等，直到它不再存在。

长话短说，这种方法不是精确覆盖问题的正确应用——我必须反复寻找部分解决方案。正确的精确覆盖问题应该包括以某种方式进行的回合顺序。为什么？因为现在我没有探索所有可能的解决方案！20 名球员就是最好的例子。算法 X 将只找到 3 个连续轮次的解决方案。Yet, I do know that there are at least 5, when different intermediate solutions are chosen. 这正是我希望算法 X 能够为我解决的工作。使用上述方法，游戏轮次之间没有回溯。

尽管这个问题足够抽象以至于不需要代码，但这是我在 Python 中实现的 Knuth 的 DLX（带有Dancing Links的算法 X）：

from itertools import combinations
def dancing_links (players):
    """
    Implement the dancing links algorithm as described by Donald Knuth, which
    attemts to solve an exact cover problem exhaustively in an efficient way.
    Adapted for my use case, I define the incidence matrix as follows:
        * Columns are players.
        * Rows are groups of players.
        * The intersection of groups and players mean that that player is a
          member of that group.
        * One group contains exactly four players, i.e. each row has
          exactly four 1s.
        * Repeatedly solve the exact cover problem for a reduced set of groups,
          where each round the total set of groups is filtered for illegal
          groups. An illegal group features at least two players that
          have already played together in a round.
    """

    class FoundSolution (Exception):
        "Use the exception to abort recursive stacks"
        pass

    # Dancing links is based on "doubly linked lists" intersecting
    # with each other. Python doesn't have this kind of data structure
    # and implementing it is quite expensive. E.g. each field of the incidence
    # matrix could be a Node which has pointers in all four directions,
    # The Node class with 6 attributes (four pointers, a name and arbitrary
    # data) needs to undergo countless name lookups, which is a slow process
    # in Python. So instead, I represent each node without a class definition
    # as a dict.
    # 
    # Since we're walking over so many doubly linked lists, starting from
    # any of its nodes, we need to remember where we started and iterate
    # through all of them. That clutters our code later on a lot without
    # this iterator function.
    def iter_dll (start, direction='right'):
        next = start[direction]
        # Need to explicitly compare object ids. Otherwise Python
        # would try to do a deep comparison of two dicts. which is impossible
        # due to the circular referencing.
        while id(start) != id(next):
            yield next
            next = next[direction]

    def cover (column):
        """
        Cover a column by removing its head node from the control row and
        removing each of its rows from other columns that intersect.
        """
        column['left']['right'] = column['right']
        column['right']['left'] = column['left']
        for r in iter_dll(column, 'down'):
            for c in iter_dll(r):
                c['up']['down'] = c['down']
                c['down']['up'] = c['up']

    def uncover (column):
        # Undo the changes caused by a call to cover(dll) by injecting the
        # linked nodes with the remembered predecessor and successor.
        for r in iter_dll(column, 'up'):
            for c in iter_dll(r, 'left'):
                c['up']['down'] = c['down']['up'] = c
        else:
            column['left']['right'] = column['right']['left'] = column

    def search (i, root):
        if id(root['right']) == id(root):
            # The only way to exit the complete recursion stack is an exception.
            raise FoundSolution
        for c in iter_dll(root):
            cover(c)
            for r in iter_dll(c, 'down'):
                lineup.append(r)
                for j in iter_dll(r):
                    cover(j['data'])
                search(i+1, root)
                lineup.pop()
                for j in iter_dll(r, 'left'):
                    uncover(j['data'])
            else:
                uncover(c)

    def generate_incidence_matrix (groups):
        # The gateway to our web of doubly linked lists.
        root = {'name': 'root', 'data': None}
        # Close the circle in left and right dirctions, so we can keep the
        # circle closed while injecting new nodes.
        root['right'] = root['left'] = root
        # The control row of column headers is created by attaching each new
        # Header with the previous one.
        for player in players:
            n = {
                'name': 'Headnode {}'.format(player),
                'data': player,
                'right': root,
                'left': root['left'],
            }
            n['up'] = n['down'] = n
            root['left']['right'] = root['left'] = n

        # Now append nodes to each column header node in our control row -
        # one for each player of a distinct group of four players.
        rnmbr = 0
        # Seed for new row nodes
        seed = {'name': 'seed', 'data': None}
        for g in groups:
            rnmbr += 1
            seed['right'] = seed['left'] = seed
            # Iterate through the control nodes for each of the four players.
            for header in (m for m in iter_dll(root) for p in g if m['data'] == p):
                n = {
                    # Chose a name that identifies the row and colum for this
                    # new node properly.
                    'name': 'R-{},C-{}'.format(rnmbr, header['data']),
                    'data': header,
                    'up': header['up'],
                    'down': header,
                    'left': seed,
                    'right': seed['right']
                }
                header['up']['down'] = header['up'] = n
                seed['right']['left'] = seed['right'] = n
            else:
                # Extract the seed from this row
                seed['right']['left'] = seed['left']
                seed['left']['right'] = seed['right']

        return root

    groups = tuple(combinations(players, 4))    
    groups_per_round = len(players)/4
    lineups = []

    while len(groups) >= groups_per_round:
        root = generate_incidence_matrix(groups)
        lineup = []
        try:
            search(0, root)
        except FoundSolution:
            lineup = reduce(list.__add__, ([r['data']['data']] + [n['data']['data'] for n in iter_dll(r)] for r in lineup))
            lineup = tuple(tuple(sorted(lineup[i:i + 4])) for i in xrange(0, len(lineup), 4))
            lineups.append(lineup)
            groups = tuple(group for group in groups if all(len(g.intersection(set(group))) < 2 for g in (set(s) for s in lineup))) 
        else:
            break

    return lineups

给定玩家列表，此功能将打印中间解决方案到屏幕，直到选项用尽。可悲的是，它没有我希望的那么快，但对我来说这是一个很好的编程练习。:-)

调用dancing_links()上面定义的函数将产生以下输出...

>>> pprint.pprint(dancing_links(range(1,21)))
[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
 ((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
 ((1, 6, 11, 14), (2, 5, 12, 18), (3, 8, 13, 19), (4, 9, 16, 17), (7, 10, 15, 20))]

我所期待的更像是……

[((1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16), (17, 18, 19, 20)),
 ((1, 5, 9, 13), (2, 6, 10, 17), (3, 7, 14, 18), (4, 11, 15, 19), (8, 12, 16, 20)),
 ((1, 12, 15, 18), (2, 5, 16, 19), (3, 6, 9, 20), (4, 7, 10, 13), (8, 11, 14, 17)),
 ((1, 7, 11, 20), (2, 8, 13, 18), (3, 5, 10, 15), (4, 9, 16, 17), (6, 12, 14, 19)),
 ((1, 8, 10, 19), (2, 7, 9, 15), (3, 12, 13, 17), (4, 5, 14, 20), (6, 11, 16, 18))]

请注意，它不一定是这个确切的解决方案。这只是我在尝试最终为任意数量的玩家生成时间表时发现的一个示例解决方案。

score 0 · Accepted Answer

（完全重写了答案。）

尽管有可能将此问题转换为精确覆盖问题并（原则上）使用算法 X 解决它，但在实践中证明是不切实际的并且有更好的方法。

首先回答你的问题：回想一下确切的覆盖问题：给定一堆“项目”（被覆盖）和一堆“选项”（每个选项都覆盖一组项目），问题是找到一个一组选项，使得每个项目都只被覆盖一次。

在您的情况下，如果您选择 (20) 名玩家作为项目并选择四人组作为选项，那么，正如您所发现的，任何针对精确掩护问题的算法都会找到安排一轮锦标赛的方法。

但实际上您根本不需要，因为（在没有进一步限制的情况下）您可以明确写下所有解决方案：有 (20 选择 4)=4845 种选择第一组 4 的方法，然后(16 选择 4)=1820 种方式从剩下的选择另外一个 4 组，以此类推，最后你不关心你找到的 5 个 4 组之间的顺序，所以方式的数量您可以将一组 20 人分成五组，每组 4 人是

(选择(20, 4) * 选择(16, 4) * 选择(12, 4) * 选择(8, 4) * 选择(4, 4)) / (5 * 4 * 3 * 2 * 1) = 2546168625 .

（等价于：(20!)/((4!)^5 * 5!) = 2546168625，因为我们可以写下 20 名玩家的列表，然后在每组 4 人中重新排序并重新排序。）

如果您想用程序生成所有这些，您可以按规范顺序编写它们中的每一个：假设您将 20 个玩家称为“A”到“T”，那么您可以按字典顺序编写每个 4 组顺序（因此，组 {F, J, B, Q} 将写作“BFJQ”），最后你可以按字典顺序写下五个组本身（因此第一组将以“A”开头，第二组最早的字母不在第一组的组，等等）。

接下来，如果你想覆盖多轮，再次将其框架为一个精确覆盖问题，你需要有上述数量（≈25 亿）的“选项”（行）。目前尚不清楚这些项目将是什么，但这显然是不切实际的，因此不值得追求这种思路。

相反，事实证明您的问题已经得到充分研究，最初以柯克曼的女学生问题的名义（安排，从 15 人开始，尽可能多地 3 人一组 - 结果是 7）（请参阅此处的 Ray Toal 页面），以及最近的“社交高尔夫球手问题”（请参阅此处的 Markus Triska 页面）：

社交高尔夫球手问题 (SGP) 是一个组合优化问题。任务是将g × p的高尔夫球手安排在p组的g组中，为期w周，这样没有两个高尔夫球手在同一组中打球超过一次。SGP 的一个实例由三元组gpw表示。原始问题要求最大 w 以便可以解决实例8-4-w 。

您的问题在这里要求最大w以便可以解决实例5-4-w 。答案原来是 w=5，问题上的MathWorld 页面正好给出了这个 5-4-5 实例：

Mon ABCD    EFGH    IJKL    MNOP    QRST
Tue AEIM    BJOQ    CHNT    DGLS    FKPR
Wed AGKO    BIPT    CFMS    DHJR    ELNQ
Thu AHLP    BKNS    CEOR    DFIQ    GJMT
Fri AFJN    BLMR    CGPQ    DEKT    HIOS

另请参阅Warwick Harvey 的页面，该页面具有（拥有）许多不同参数的最知名的解决方案。它记录了 5 是这个问题的上限和下限，即知道如何安排 5 轮（你自己找到了一个！）并且知道不可能安排超过 5 轮。

您可以在文献中搜索“社交高尔夫球手问题”，以找到更多通过计算机解决该问题的方法。更一般地说，此类问题在称为组合设计的广泛组合学领域中进行了研究。我在编写较早的答案时发现的一个很好的参考是《组合设计手册》一书，其中包含 VI.3 平衡比赛设计和 VI.51 安排比赛等章节。

python - 为特定的精确覆盖问题定义约束

1 回答 1

Related

Reference