1

我有一个行李清单。
如果允许 N 个选择,我如何选择可以最大化我的集合的 N 个袋子?
例如

choices = [ [A,A,Z], [B,A,E], [Z,Z,B,W], [Q], ...]

如果 N = 2 我想知道...

choices[1, 2]

最大化我的集合...

set = [B, A, E, Z, W]


我正在尝试将其强制拟合为梯度下降格式,但我无法为此创建成本函数。这是一个正确/合理的方法吗?

解决这个问题的最佳方法是什么?


注意:
假设选择列表足够大,以至于不可能计算出所有可能的选择组合。
假设局部最优解是可以接受的。



缩小问题范围...
语言:Python
问题规模:~200 万个选择;1-100袋大小。

@sascha - 布景小贴士非常有帮助!

我在编写自己的脚本时采取了捷径,并从以下位置修改了一个:https ://cs.stackexchange.com/questions/16142/how-to-implement-greedy-set-cover-in-a-way-that-它在线性时间运行

from collections import defaultdict
import random
import string
random.seed(123)  # For reproducibility

size = 100
bag_size = 5
F = []
for i in range(size):
    bag = [string.ascii_uppercase[random.randint(0, 25)] for j in range(bag_size)]
    F.append(set(bag))

print('First 5 sets... of 100')
for item in F[0:5]:
    print(item)

set_freq = {}
for bag in F:
    for item in bag:
        set_freq[item] = set_freq.get(item, 0) + 1
print('Unique items:', len(set_freq))

f_copy = F.copy()  # Because F gets modified

# First prepare a list of all sets where each element appears
D = defaultdict(list)
for y,S in enumerate(F):
    for a in S:
        D[a].append(y)

L=defaultdict(set)        
# Now place sets into an array that tells us which sets have each size
for x,S in enumerate(F):
    L[len(S)].add(x)

E=[] # Keep track of selected sets
# Now loop over each set size
for sz in range(max(len(S) for S in F),0,-1):
    if sz in L:
        P = L[sz] # set of all sets with size = sz
        while len(P):
            x = P.pop()
            E.append(x)
            for a in F[x]:
                for y in D[a]:
                    if y!=x:
                        S2 = F[y]
                        L[len(S2)].remove(y)
                        S2.remove(a)
                        L[len(S2)].add(y)
print('Results...\n')
print('Indices:', E)
captured = {}
for index in E:
    for item in f_copy[index]:
        captured[item] = captured.get(item, 0) + 1
print('Unique items captured:', len(captured))


印刷...

First 5 sets... of 100
{'B', 'Y', 'N', 'I', 'C'}
{'D', 'M', 'B', 'R', 'I'}
{'K', 'F', 'B', 'R'}
{'K', 'E', 'W', 'R'}
{'H', 'A', 'F', 'N', 'Y'}
Unique items: 26
Results...

Indices: [0, 10, 48, 32, 69, 7, 2, 5]
Unique items captured: 26


我缺少的部分是......例如,如果我只能选择 3 个,我如何最大化集合覆盖率?

4

0 回答 0