3

itertools.combinations()用来迭代整数元组。

我对满足某些条件的最低和的元组感兴趣:

def findLowestNiceTuple:
    for tup in itertools.combinations(range(1, 6), 2):
        if niceTuple(tup):
            return tup

生成器的默认顺序不是元素总和的顺序。例如:

>>> itertools.combinations(range(1, 6), 2)

给出一个生成器,它将产生以下元素:

[(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

如您所见,(1, 5) 的总和大于 (2,3) 的总和。对于提前终止,我需要 order 中的元组..., (1, 4), (2, 3), (1, 5), ...

对于适度数量的组合,您可以使用以下方法解决此问题sorted()

>>> sorted(itertools.combinations(range(1, 6), 2), key=sum)
[(1, 2), (1, 3), (1, 4), (2, 3), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

但是,sorted()将生成器转换为完全保存在内存中的列表。这意味着它不再可以很好地扩展。类似的东西itertools.combinations(range(1, 600), 400)将不可避免地产生一个MemoryError.

有没有更方便记忆的方法来达到预期的效果?

PS:我确实意识到完全迭代我提到的最后一个序列需要很长时间,但我正在寻找的元组应该非常接近开始。如果我可以依靠订单,我可以在第一个片段中提前终止。

4

2 回答 2

3

这是我解决它的方法,使用递归函数查找总和为给定值的所有组合:

def ordered_combinations(pop, n):
    pop = sorted(pop)

    for s in range(sum(pop[:n]), sum(pop[-n:])+1):
        yield from get_sums(pop, s, n)

def get_sums(pop, s, n):
    if n == 1:
        if s in pop:
            yield [s]
        return

    for i, v in enumerate(pop):
        if sum(pop[i:i+n]) > s:
            return
        for rest in get_sums(pop[i+1:], s-v, n-1):
            rest.append(v)
            yield rest

这是它的输出示例:

>>> for c in ordered_combinations(range(1, 8), 4):
    print(c, sum(c))


[4, 3, 2, 1] 10
[5, 3, 2, 1] 11
[6, 3, 2, 1] 12
[5, 4, 2, 1] 12
[7, 3, 2, 1] 13
[6, 4, 2, 1] 13
[5, 4, 3, 1] 13
[7, 4, 2, 1] 14
[6, 5, 2, 1] 14
[6, 4, 3, 1] 14
[5, 4, 3, 2] 14
[7, 5, 2, 1] 15
[7, 4, 3, 1] 15
[6, 5, 3, 1] 15
[6, 4, 3, 2] 15
[7, 6, 2, 1] 16
[7, 5, 3, 1] 16
[6, 5, 4, 1] 16
[7, 4, 3, 2] 16
[6, 5, 3, 2] 16
[7, 6, 3, 1] 17
[7, 5, 4, 1] 17
[7, 5, 3, 2] 17
[6, 5, 4, 2] 17
[7, 6, 4, 1] 18
[7, 6, 3, 2] 18
[7, 5, 4, 2] 18
[6, 5, 4, 3] 18
[7, 6, 5, 1] 19
[7, 6, 4, 2] 19
[7, 5, 4, 3] 19
[7, 6, 5, 2] 20
[7, 6, 4, 3] 20
[7, 6, 5, 3] 21
[7, 6, 5, 4] 22

组合总是首先产生最大的值,作为我如何将它们构建为列表的工件(通过在末尾附加小值,而不是通过连接到前面)。如果您希望它们从小到大排序,您可以将这些rest.append(v); yield rest行更改为yield [v]+rest.

该代码使用yield fromPython 3.3 引入的语法。如果您使用的是不支持该功能的早期版本,则可以使用以下等效代码:

for v in get_sums(pop, s, n):
    yield v

该代码甚至可以处理您描述的从 800 个成员范围中提取的 400 个组合的极端情况。这是该计算的前 20 个结果(仅显示最大的 10 个值,因为其余的都是相同的 390 到 1),以及它们的总和:

>>> for i, v in enumerate(ordered_combinations(range(1, 800), 400)):
    if i >= 20:
        break
    print(v[:10], sum(v))


[400, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80200
[401, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80201
[402, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80202
[401, 400, 398, 397, 396, 395, 394, 393, 392, 391] 80202
[403, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80203
[402, 400, 398, 397, 396, 395, 394, 393, 392, 391] 80203
[401, 400, 399, 397, 396, 395, 394, 393, 392, 391] 80203
[404, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80204
[403, 400, 398, 397, 396, 395, 394, 393, 392, 391] 80204
[402, 401, 398, 397, 396, 395, 394, 393, 392, 391] 80204
[402, 400, 399, 397, 396, 395, 394, 393, 392, 391] 80204
[401, 400, 399, 398, 396, 395, 394, 393, 392, 391] 80204
[405, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80205
[404, 400, 398, 397, 396, 395, 394, 393, 392, 391] 80205
[403, 401, 398, 397, 396, 395, 394, 393, 392, 391] 80205
[403, 400, 399, 397, 396, 395, 394, 393, 392, 391] 80205
[402, 401, 399, 397, 396, 395, 394, 393, 392, 391] 80205
[402, 400, 399, 398, 396, 395, 394, 393, 392, 391] 80205
[401, 400, 399, 398, 397, 395, 394, 393, 392, 391] 80205
[406, 399, 398, 397, 396, 395, 394, 393, 392, 391] 80206

因为它是递归的,如果您请求 1000 个组合,此代码可能会失败(这是由于 Python 的默认递归限制)。sys.setrecursionlimit如有必要,您可以修改它的限制。

如果你对一个非常大的人口进行非常深入的研究,它也可能存在内存问题,因为get_sums在递归步骤中对人口进行切片(并因此复制)。如果您对该代码的使用仅使用ranges,您可能可以通过从 中删除该pop = sorted(pop)行来解决内存问题ordered_combinations,因为 Python 3 的range对象可以有效地切片(即range(1,100)[10:]is range(11,100))。

于 2013-02-14T02:13:01.107 回答
1

您可以通过遍历可能和的范围而不是生成组合来获得 O(1) 空间中的纯迭代器。从总和值中,产生产生它的数字对的子范围:

def sumPairs(minVal,maxVal):
    for total in range(minVal*2,maxVal*2+1):
        for a in range(max(total-maxVal,minVal),min(total-minVal,maxVal)+1):
            if a >= total-a: continue # to skip permutations
            yield (a,total-a)

输出:

for pair in sumPairs(1,6):
    print(pair,sum(pair))

(1, 2) 3
(1, 3) 4
(1, 4) 5
(2, 3) 5
(1, 5) 6
(2, 4) 6
(1, 6) 7
(2, 5) 7
(3, 4) 7
(2, 6) 8
(3, 5) 8
(3, 6) 9
(4, 5) 9
(4, 6) 10
(5, 6) 11

[编辑] n 元组的泛化

在保持 O(1) 空间复杂度的同时超越对有点棘手。我设法获得了 O(S) 的空间复杂度,其中 S 是元组的大小。所以解决方案的内存空间消耗与数字的范围无关。

相同的策略用于根据预期总和遍历组合,但是对于每个总和值,有多个组合产生相同的总和。这些组合可以通过从以最小数字开始并以将达到该总数的最大数字结束的基本组合开始来生成。这是给定总和可能的最广泛的值分布。所有其他值的组合都是通过逐渐拉低最后一个值来构建的,同时向上偏移较小的值以补偿由于减少最后一个值而导致的减少。

# generate offsets in increasing order (>=)
# to produce a total value
def getOffsets(size,total,maxValue):
    #print(size,total,maxValue)
    if not total: yield [0]*size; return
    if size == 1 and total==maxValue: yield [maxValue]; return
    while total>=0 and size*maxValue>=total:
        for prefix in getOffsets(size-1,total-maxValue,maxValue):
            yield prefix + [maxValue]
        maxValue -= 1

# generate all combinations of a range of values
# that produce a given total
def comboOfSum(total,size,minValue,maxValue):
    if size == 1: yield (total,); return
    base        = list(range(minValue,minValue+size)) # start with smallest(s)
    base[-1]    = min(total-sum(base[:-1]),maxValue)  # end with largest
    maxOffset   = base[-1]-base[-2]-1 # freedom of moving smaller values
    totalOffset = total-sum(base)     # compensate decreasing last
    minLast     = (total + size*(size-1)//2)//size # minimum to reach total
    while base[-1]>base[-2] and base[-1] >= minLast:
        for offsets in getOffsets(size-1,totalOffset,maxOffset):
            yield tuple(b+o for b,o in zip(base,offsets+[0])) # apply offsets
        base[-1]    -= 1 # decrease last value
        totalOffset += 1 # increase total to compensate for decrease
        maxOffset   -= 1 # decrease small values' freedom of movement

# generate combinations in order of target sum  
def comboBySum(size,minValue,maxValue):
    minTotal = minValue*size + size*(size-1)//2
    maxTotal = maxValue*size - size*(size-1)//2
    for total in range(minTotal,maxTotal+1):
        yield from comboOfSum(total,size,minValue,maxValue)

验证:(与排序组合相比)

size   = 4
minVal = 10
maxVal = 80

from itertools import combinations

A = list(comboBySum(size,minVal,maxVal))
B = list(sorted(combinations(range(minVal,maxVal+1),size),key=sum))

print("same content:",set(A)==set(B))               # True
print("order by sum:",[*map(sum,A)]==[*map(sum,B)]) # True

输出(小规模):

for combo in comboBySum(2,1,6):print(combo,sum(combo))

(1, 2) 3
(1, 3) 4
(1, 4) 5
(2, 3) 5
(1, 5) 6
(2, 4) 6
(1, 6) 7
(2, 5) 7
(3, 4) 7
(2, 6) 8
(3, 5) 8
(3, 6) 9
(4, 5) 9
(4, 6) 10
(5, 6) 11

输出(大规模):

for i,combo in enumerate(comboBySum(400,1,800)):
    print(*combo[:5],"...",*combo[-5:],"sum =",sum(combo))
    if i>20: break

1 2 3 4 5 ... 396 397 398 399 400 sum = 80200
1 2 3 4 5 ... 396 397 398 399 401 sum = 80201
1 2 3 4 5 ... 396 397 398 399 402 sum = 80202
1 2 3 4 5 ... 396 397 398 400 401 sum = 80202
1 2 3 4 5 ... 396 397 398 399 403 sum = 80203
1 2 3 4 5 ... 396 397 398 400 402 sum = 80203
1 2 3 4 5 ... 396 397 399 400 401 sum = 80203
1 2 3 4 5 ... 396 397 398 399 404 sum = 80204
1 2 3 4 5 ... 396 397 398 400 403 sum = 80204
1 2 3 4 5 ... 396 397 398 401 402 sum = 80204
1 2 3 4 5 ... 396 397 399 400 402 sum = 80204
1 2 3 4 5 ... 396 398 399 400 401 sum = 80204
1 2 3 4 5 ... 396 397 398 399 405 sum = 80205
1 2 3 4 5 ... 396 397 398 400 404 sum = 80205
1 2 3 4 5 ... 396 397 398 401 403 sum = 80205
1 2 3 4 5 ... 396 397 399 400 403 sum = 80205
1 2 3 4 5 ... 396 397 399 401 402 sum = 80205
1 2 3 4 5 ... 396 398 399 400 402 sum = 80205
1 2 3 4 5 ... 397 398 399 400 401 sum = 80205
1 2 3 4 5 ... 396 397 398 399 406 sum = 80206
1 2 3 4 5 ... 396 397 398 400 405 sum = 80206
1 2 3 4 5 ... 396 397 398 401 404 sum = 80206

输出(大数字范围):

for i,combo in enumerate(comboBySum(20,12345,1000000)):
    print(*combo[:5],"...",*combo[-5:],"sum =",sum(combo))
    if i>20: break

12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12364 sum = 247090
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12365 sum = 247091
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12366 sum = 247092
12345 12346 12347 12348 12349 ... 12360 12361 12362 12364 12365 sum = 247092
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12367 sum = 247093
12345 12346 12347 12348 12349 ... 12360 12361 12362 12364 12366 sum = 247093
12345 12346 12347 12348 12349 ... 12360 12361 12363 12364 12365 sum = 247093
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12368 sum = 247094
12345 12346 12347 12348 12349 ... 12360 12361 12362 12364 12367 sum = 247094
12345 12346 12347 12348 12349 ... 12360 12361 12362 12365 12366 sum = 247094
12345 12346 12347 12348 12349 ... 12360 12361 12363 12364 12366 sum = 247094
12345 12346 12347 12348 12349 ... 12360 12362 12363 12364 12365 sum = 247094
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12369 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12361 12362 12364 12368 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12361 12362 12365 12367 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12361 12363 12364 12367 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12361 12363 12365 12366 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12362 12363 12364 12366 sum = 247095
12345 12346 12347 12348 12349 ... 12361 12362 12363 12364 12365 sum = 247095
12345 12346 12347 12348 12349 ... 12360 12361 12362 12363 12370 sum = 247096
12345 12346 12347 12348 12349 ... 12360 12361 12362 12364 12369 sum = 247096
12345 12346 12347 12348 12349 ... 12360 12361 12362 12365 12368 sum = 247096
于 2021-08-31T12:53:38.430 回答