python - 查找具有最少元素数的列表的所有无序分区

Question

给出了一个元素列表。我希望有可能将此列表划分为任意数量的分区，以便每个分区至少有 x 个元素。列表中分区的顺序和分区中元素的顺序无关紧要。例如： List = [1,2,3,4] get_partitions(list,2) 应该返回：

[[1,2,3,4],
 [[1,2],[3,4]],
 [[1,3],[2,4]],
 [[1,4],[2,3]]]

List = [1,2,3,4] get_partitions(list,1) 应该返回：

[[1,2,3,4],
 [[1,2],[3,4]],
 [[1,3],[2,4]],
 [[1,4],[2,3]],
 [[1],[2],[3,4],
  ...]

我已经开始在 Python 中递归地实现这个，但是我创建了太多的冗余案例。例如，出于运行时的原因，我想提前减少这些情况，而不是在之后使用 freezesets 删除它们。

from itertools import combinations
import numpy as np
def get_partitions(liste,min,max=None):

    if max is None:
        # Default setting
        max = len(liste)
    if len(liste) == min :
        # Termination Criterium
        yield [liste]
    else:
        for r in range(np.min([len(liste),max]),min-1,-1):
            # max for avoiding cases like: [[1,2,3,4],[2,6]] and [[2,6],[1,2,3,4]]
            for perm in combinations(liste,r):
                rest = [i for i in liste if i not in perm]
                if len(rest) >= min:
                    for recurse in get_partitions(rest,min,r):
                        yield [list(perm)] + list(recurse)
                if len(rest) == 0:
                    # r == len(liste)
                    yield [list(perm)]

这将导致：

[[[1, 2, 3, 4]],
 [[1, 2], [3, 4]],
 [[1, 3], [2, 4]], 
 [[1, 4], [2, 3]],
 [[2, 3], [1, 4]],
 [[2, 4], [1, 3]],
 [[3, 4], [1, 2]]]

提前感谢您的帮助。

尝试使用@mozway 的答案并将其扩展到递归版本导致我：

def get_partitions(iterable, minl=2):
    s = set(iterable)

    for r in range(minl, len(s)//2+1):
        if len(s)//2 != r:
            for c in combinations(s, r):
                for recurse in get_partitions(list(s.difference(c)), minl):
                    yield [list(c),*recurse]
        else:
            for c in islice(combinations(s, r), comb(len(s),r)//2):
                for recurse in get_partitions(list(s.difference(c)), minl):
                    yield [list(c),*recurse]

    yield [list(s)]

对于示例 list = [1,2,3,4], x=1 它正在将可能性的数量从 47（我最初的尝试）减少到 19。仍然有很多多余的情况。

[[[1], [2], [3], [4]], <----
 [[1], [2], [3, 4]],
 [[1], [2, 3, 4]],
 [[2], [1], [3], [4]], <----
 [[2], [1], [3, 4]],
 [[2], [1, 3, 4]],
 [[3], [1], [2], [4]], <----
 [[3], [1], [2, 4]],
 [[3], [1, 2, 4]],
 [[4], [1], [2], [3]], <----
 [[4], [1], [2, 3]],
 [[4], [1, 2, 3]],
 [[1, 2], [3], [4]],
 [[1, 2], [3, 4]],
 [[1, 3], [2], [4]],
 [[1, 3], [2, 4]],
 [[1, 4], [2], [3]],
 [[1, 4], [2, 3]],
 [[1, 2, 3, 4]]]

score 1 · Accepted Answer

这是一个长期的解决方案。在生成分区时不使用拒绝，因此从这个意义上说，这可能有点有效。不过，还有很多东西需要优化。

例子：

list(get_partitions(range(3), 1))
# [[[0, 1, 2]], [[0], [1, 2]], [[1], [0, 2]], [[2], [0, 1]], [[0], [1], [2]]]

以下是其工作原理的概述：

split首先，定义一个辅助函数很有用，它接受一个列表lst和一个整数n，并返回将列表分成两组的所有方法，一组是 size n，另一组是 size len(lst) - n。
其次，我们需要解决这个问题的一个更简单的版本：如何将一个列表lst分成大小为n每个大小的组k。当然，这只有在时才有可能len(lst) = n * k。这是在get_partitions_same_size函数中实现的。这个想法是始终lst在第一组中包含第一个元素并递归。
第三，我们需要找到的所有整数分区len(lst)。我从这个线程复制了代码。
第四，对于的每一个整数分区p，len(lst)我们需要根据找到所有的分区lst 方式 p。
- 例如，说len(lst) == 7和p = 3 + 2 + 2。在这种情况下，我们可以为第一组选择任何三个元素，为第二组选择任何剩余的两个，而对于最后的第三组则没有选择。
- 这可能会引入冗余，因为我们可以获得两个分区，它们仅在最后两组的顺序上有所不同。
- 为了解决这个问题，我们可以通过计算每个大小的组数来表示一个分区。在本例中，p对应于p_scheme = [(3, 1), (2, 2)]。该函数get_partitions_helper接受一个列表lst和一个“分区方案” p_scheme，并返回所有对应的分区而不重复计算。这是get_partitions_same_size使用第二步的地方。
最后，一切都汇集在一起get_partitions：我们遍历整数分区len(lst)并返回与每个可能的整数分区对应的所有可能的列表分区。

这是一个有趣的问题，非常欢迎对错误和优化发表评论。

from itertools import combinations
from collections import Counter

# from this thread:
# https://stackoverflow.com/questions/10035752/elegant-python-code-for-integer-partitioning
def partitions(n, I=1):
    yield (n,)
    for i in range(I, n//2 + 1):
        for p in partitions(n-i, i):
            yield (i,) + p


def split(lst, n):
  '''
  return all ways to split lst into two groups, 
  with n and len(lst) - n elements respectively
  '''
  assert len(lst) >= n
  # handle special case of len(lst) == 2 * n
  if len(lst) == 2 * n:
    for first, second in split(lst[1:], n-1):
      yield [lst[0], *first], second
  else:
    for comb in combinations(range(len(lst)), n):
      comb = set(comb)
      first = [x for i, x in enumerate(lst) if i in comb]
      second = [x for i, x in enumerate(lst) if i not in comb]
      yield first, second


def get_partitions_same_size(lst, n, k):
  # print(lst, n, k)
  'return all ways to partition lst into n parts each of size k up to order'
  if len(lst) != n * k:
    print(lst, n, k)
  assert len(lst) == n * k
  if n == 0 and len(lst) == 0:
    yield []
  # case when group size is one
  elif k == 1:
    yield [[x] for x in lst]
  # otherwise, without loss, the first element of lst goes into the first group
  else:
    for first, rest in split(lst[1:], k-1):
      for rec_call in get_partitions_same_size(rest, n-1, k):
        yield [[lst[0], *first], *rec_call]


def get_partitions_helper(lst, p_scheme):
  """
  return all ways to partition lst into groups according to a partition scheme p_scheme

  p_scheme describes an integer partition of len(lst)
  for example, if len(lst) == 5, then possible integer partitions are:
  [(5,), (1, 4), (1, 1, 3), (1, 1, 1, 2), (1, 1, 1, 1, 1), (1, 2, 2), (2, 3)]
  for each, we count the number of groups of a given size
  the corresponding partition schemes are:
  [[(5, 1)],
  [(1, 1), (4, 1)],
  [(1, 2), (3, 1)],
  [(1, 3), (2, 1)],
  [(1, 5)],
  [(1, 1), (2, 2)],
  [(2, 1), (3, 1)]]
  """
  if not lst and not p_scheme:
    yield []
    return 
  assert len(lst) == sum(a * b for a, b in p_scheme)
  group_size, group_count = p_scheme[0]
  num_elts = group_size * group_count
  for first, second in split(lst, num_elts):
    for firsts in get_partitions_same_size(first, group_count, group_size):
      for seconds in get_partitions_helper(second, p_scheme[1:]):
        yield [*firsts, *seconds]


def get_partitions(lst, min_):
  """
  get all partitions of lst into groups s.t. each group has at least min_ elements
  up to order (of groups and elements within a group)
  """
  for partition in partitions(len(lst), min_):
    p_scheme = list(Counter(partition).items())
    yield from get_partitions_helper(lst, p_scheme)

score 0 · Accepted Answer

这看起来像一个带有补码的部分幂集。

您不需要定义最大值，一旦设置了最小值，它就固定了。

此外，包括完整集是一种特殊情况（从技术上讲，完整集是集合 + 一个空集，因此它违反了最小条件）

要限制组合的数量，只需停止总长度的前半部分。如果您有一个偶数输入并选择半个分区，则只计算一半的组合（使用itertools.islice）：

你可以使用：

from itertools import combinations
from math import comb

def get_partitions(iterable, minl=2):
    s = set(iterable)
    return [list(s)]+\
           [[list(c), list(s.difference(c))]
            for r in range(minl, len(s)//2+1)
            for c in ( combinations(s, r) if len(s)//2 != r else
             islice(combinations(s, r), comb(len(s),r)//2))
           ]

out = get_partitions([1,2,3,4])

输出：

[[1, 2, 3, 4],
 [[1, 2], [3, 4]],
 [[1, 3], [2, 4]],
 [[1, 4], [2, 3]]]

另一个例子：

>>> get_partitions([1,2,3,4,5,6], 1)

[[1, 2, 3, 4, 5, 6],
 [[1], [2, 3, 4, 5, 6]],
 [[2], [1, 3, 4, 5, 6]],
 [[3], [1, 2, 4, 5, 6]],
 [[4], [1, 2, 3, 5, 6]],
 [[5], [1, 2, 3, 4, 6]],
 [[6], [1, 2, 3, 4, 5]],
 [[1, 2], [3, 4, 5, 6]],
 [[1, 3], [2, 4, 5, 6]],
 [[1, 4], [2, 3, 5, 6]],
 [[1, 5], [2, 3, 4, 6]],
 [[1, 6], [2, 3, 4, 5]],
 [[2, 3], [1, 4, 5, 6]],
 [[2, 4], [1, 3, 5, 6]],
 [[2, 5], [1, 3, 4, 6]],
 [[2, 6], [1, 3, 4, 5]],
 [[3, 4], [1, 2, 5, 6]],
 [[3, 5], [1, 2, 4, 6]],
 [[3, 6], [1, 2, 4, 5]],
 [[4, 5], [1, 2, 3, 6]],
 [[4, 6], [1, 2, 3, 5]],
 [[5, 6], [1, 2, 3, 4]],
 [[1, 2, 3], [4, 5, 6]],
 [[1, 2, 4], [3, 5, 6]],
 [[1, 2, 5], [3, 4, 6]],
 [[1, 2, 6], [3, 4, 5]],
 [[1, 3, 4], [2, 5, 6]],
 [[1, 3, 5], [2, 4, 6]],
 [[1, 3, 6], [2, 4, 5]],
 [[1, 4, 5], [2, 3, 6]],
 [[1, 4, 6], [2, 3, 5]],
 [[1, 5, 6], [2, 3, 4]]]

python - 查找具有最少元素数的列表的所有无序分区

2 回答 2

Related

Reference