1

我有一个来自表查找的 int 迭代器,需要检查它们的 multiset 是否包含在给定的固定 "multiset"ms中。目前,我ms在开头排序,然后在我的迭代器中对 int 进行排序,并检查(排序列表的)多重集包含,如下所示:

vals = sorted(my_vals)
for it in ... :
    test_vals = sorted( i for i in it )
    if is_sublist(test_vals,vals):
        # do stuff

在哪里

def is_sublist(L1,L2):
    m = len(L1)
    n = len(L2)
    i = j = 0
    while j <= n:
        if i == m:
            return True
        elif j == n:
            return False
        a,b = L1[i], L2[j]
        if a == b:
            i += 1
            j+= 1
        elif a > b:
            j += 1
        else:
            return False
  • 通常,我的列表很短(1--20 个元素)
  • 我尝试使用Counter,但其初始化的时间劣势比包含测试的时间优势更糟糕。
  • 我这样做〜10 ^ 6次,所以我应该这样做cython

任何想法或指示都会很好 - 谢谢!(很抱歉先过早点击发帖按钮...)

4

1 回答 1

0
# edit: second attempt in response to Bakuriu's comment
#
from collections import Counter
from itertools import groupby
multiset = Counter(sorted(vals)) # only create one Counter object
for it in ...:
    grouped = groupby(sorted(it))
    if all(len(list(g)) <= multiset[k] for k, g in grouped):
        # do stuff



from operator import eq
# if you are using Python 2
from itertools import izip as zip
from itertools import starmap

vals = sorted(my_vals)
for it in ...:
    test_vals = sorted(it)
    zipped = zip(test_vals, vals)
    # check if test_vals' multiset is contained 
    # in vals' multiset but bale out as soon as
    # non-matching values are found.
    if all(starmap(eq, zipped)):
        # do stuff
于 2014-07-14T12:26:02.623 回答