18

Suppose I have the following list:

[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]

I want to find all possible sublists of a given length that do not contain a specific number, without changing the order of the numbers.

For example, all possible sublists of length 6 that do not contain 12 are:

[1,2,3,4,5,6]
[2,3,4,5,6,7]
[3,4,5,6,7,8]
[4,5,6,7,8,9]
[5,6,7,8,9,10]
[6,7,8,9,10,11]
[13,14,15,16,17,18]

The problem is that I want to do this on a very large list, and I want the fastest way to do it.

Update with my approach:

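oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
newlist = []
length = 6
exclude = 12
# Collect every contiguous window of the requested length.
# Iterate over start indices, not over the values stored in the list.
for i in range(len(oldlist) - length + 1):
    newlist.append(oldlist[i:i + length])
# Keep only the windows without the excluded number. A new list is built
# because removing items from a list while iterating over it skips elements.
newlist = [sub for sub in newlist if exclude not in sub]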

I know this is not the best way, which is why I need a better one.

4

6 Answers

8

A simple, non-optimized solution is:

result = [sublist for sublist in 
        (lst[x:x+size] for x in range(len(lst) - size + 1))
        if item not in sublist
    ]

Optimized version:

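result = []
start = 0
while start < len(lst):
    try:
        # Search from start itself so an excluded item sitting at the
        # beginning of a segment is not skipped.
        end = lst.index(item, start)
    except ValueError:
        end = len(lst)
    # Slide the window only inside the item-free segment lst[start:end].
    result.extend(lst[x+start:x+start+size] for x in range(end - start - size + 1))
    start = end + 1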
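As a quick sanity check, the two versions above can be wrapped in functions and compared on the list from the question. This is only a minimal sketch: the function names `windows_simple` and `windows_segmented` are made up for illustration, and the segmented version searches from `start` itself, so an excluded value at the very beginning of a segment is not skipped.

```python
def windows_simple(lst, size, item):
    """Check every window of `size` elements and drop those containing `item`."""
    return [lst[x:x + size]
            for x in range(len(lst) - size + 1)
            if item not in lst[x:x + size]]


def windows_segmented(lst, size, item):
    """Split the list at each occurrence of `item`, then slide inside each segment."""
    result = []
    start = 0
    while start < len(lst):
        try:
            end = lst.index(item, start)  # next excluded value at or after start
        except ValueError:
            end = len(lst)                # no more excluded values
        # range() is empty when the segment is shorter than `size`
        result.extend(lst[x:x + size] for x in range(start, end - size + 1))
        start = end + 1
    return result


data = list(range(1, 19))
print(windows_segmented(data, 6, 12))
print(windows_simple(data, 6, 12) == windows_segmented(data, 6, 12))
```

The segmented version never builds a window that crosses an excluded value, so it avoids the repeated `item not in sublist` scans of the simple version.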
Answered 2013-06-12T09:19:36.700
8

Using itertools.combinations:

import itertools
mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
def contains_sublist(lst, sublst):
    # True if sublst occurs in lst as a contiguous run
    n = len(sublst)
    return any((sublst == lst[i:i+n]) for i in range(len(lst)-n+1))
print([i for i in itertools.combinations(mylist,6) if 12 not in i and contains_sublist(mylist, list(i))])

Prints:

[(1, 2, 3, 4, 5, 6), (2, 3, 4, 5, 6, 7), (3, 4, 5, 6, 7, 8), (4, 5, 6, 7, 8, 9), (5, 6, 7, 8, 9, 10), (6, 7, 8, 9, 10, 11), (13, 14, 15, 16, 17, 18)]
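It may help to see why this gets slow on large lists. The sketch below only counts candidates; it assumes Python 3.8+ for `math.comb`, and the helper name `candidate_counts` is hypothetical. `itertools.combinations` enumerates every way of choosing `size` elements, whereas there are only `n - size + 1` contiguous windows to inspect.

```python
import math


def candidate_counts(n, size):
    """Return (number of combinations, number of contiguous windows)."""
    return math.comb(n, size), max(n - size + 1, 0)


# For the 18-element list in the question: 18564 combinations vs. 13 windows.
print(candidate_counts(18, 6))
print(candidate_counts(100, 6))
```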
Answered 2013-06-12T09:13:25.730
2

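The simplest way I can think of is to remove the excluded number from the list and then use itertools.combinations() to generate the desired sublists. This has the added advantage that the sublists are produced lazily, one at a time.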

from itertools import combinations

def combos_with_exclusion(lst, exclude, length):
    # Filter out the excluded value first, then yield each combination lazily.
    for combo in combinations((e for e in lst if e != exclude), length):
        yield list(combo)

mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]

for sublist in combos_with_exclusion(mylist, 12, 6):
    print(sublist)

Output:

[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 7]
[1, 2, 3, 4, 5, 8]
[1, 2, 3, 4, 5, 9]
[1, 2, 3, 4, 5, 10]
[1, 2, 3, 4, 5, 11]
[1, 2, 3, 4, 5, 13]
        ...
[11, 14, 15, 16, 17, 18]
[13, 14, 15, 16, 17, 18]
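Note that the output above includes non-contiguous selections such as [1, 2, 3, 4, 5, 7]. If only contiguous runs are wanted, as in the question's expected output, the same lazy style can be kept with a sliding window. The following is a sketch under that assumption; `lazy_windows` is an illustrative name, not part of itertools.

```python
from collections import deque


def lazy_windows(iterable, length, exclude):
    """Lazily yield contiguous runs of `length` items that do not contain `exclude`."""
    window = deque(maxlen=length)
    for element in iterable:
        if element == exclude:
            window.clear()        # no window may span the excluded value
            continue
        window.append(element)    # the oldest item drops out automatically
        if len(window) == length:
            yield list(window)


for sublist in lazy_windows(range(1, 19), 6, 12):
    print(sublist)
```

Because it consumes any iterable one element at a time, this variant never needs the whole input in memory, which may matter for very large inputs.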
Answered 2013-06-12T12:09:58.150
2

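I like to build solutions out of small, composable parts. A few years of writing Haskell does that to you. So I would do it like this...

First, this returns an iterator over all sublists, in ascending order of length, starting with the empty list: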

from itertools import chain, combinations

def all_sublists(l):
    return chain(*(combinations(l, i) for i in range(len(l) + 1)))

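Generally, single-letter variable names are discouraged, but I think they are perfectly reasonable in short stretches of highly abstract code.

(By the way, to omit the empty list, use range(1, len(l) + 1) instead.)

Then we can solve your problem by adding your criteria: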

def filtered_sublists(input_list, length, exclude):
    return (
        l for l in all_sublists(input_list)
        if len(l) == length and exclude not in l
    )

For example:

oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
length = 6
exclude = 12
newlist = filtered_sublists(oldlist, length, exclude)
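Since `all_sublists` generates sublists of every length only for most of them to be filtered out, one possible refinement is to ask `combinations` for the required length directly. This is sketched here as an assumption, not as the answer's own code, and the name `filtered_sublists_direct` is hypothetical.

```python
from itertools import combinations


def filtered_sublists_direct(input_list, length, exclude):
    """Yield only the combinations of the requested length that avoid `exclude`."""
    return (
        combo for combo in combinations(input_list, length)
        if exclude not in combo
    )


oldlist = list(range(1, 19))
newlist = filtered_sublists_direct(oldlist, 6, 12)
print(next(newlist))  # the first result, produced lazily
```

Like the original, this yields tuples and includes non-contiguous selections; it only skips the work of generating the other lengths.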
Answered 2017-11-03T18:00:43.960
1

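I tried to create all possible lists of lists recursively. The depth parameter is simply the maximum number of items to remove from each list. This is not a sliding window.

Code: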

def sublists(input, depth):
    output= []
    if depth > 0:
        for i in range(0, len(input)):
            sub= input[0:i] + input[i+1:]
            output += [sub]
            output.extend(sublists(sub, depth-1))
    return output

Examples (entered interactively in python3):

sublists([1,2,3,4],1)

[[2, 3, 4], [1, 3, 4], [1, 2, 4], [1, 2, 3]]

sublists([1,2,3,4],2)

[[2, 3, 4], [3, 4], [2, 4], [2, 3], [1, 3, 4], [3, 4], [1, 4], [1, 3], [1, 2, 4], [2, 4], [1, 4], [1, 2], [1, 2, 3], [2, 3], [1, 3], [1, 2]]

sublists([1,2,3,4],3)

[[2, 3, 4], [3, 4], [4], [3], [2, 4], [4], [2], [2, 3], [3], [2], [1, 3, 4], [3, 4], [4], [3], [1, 4], [4], [1], [1, 3], [3], [1], [1, 2, 4], [2, 4], [4], [2], [1, 4], [4], [1], [1, 2], [2], [1], [1, 2, 3], [2, 3], [3], [2], [1, 3], [3], [1], [1, 2], [2], [1]]

Some edge cases:

sublists([1,2,3,4],100)

[[2, 3, 4], [3, 4], [4], [], [3], [], [2, 4], [4], [], [2], [], [2, 3], [3], [], [2], [], [1, 3, 4], [3, 4], [4], [], [3], [], [1, 4], [4], [], [1], [], [1, 3], [3], [], [1], [], [1, 2, 4], [2, 4], [4], [], [2], [], [1, 4], [4], [], [1], [], [1, 2], [2], [], [1], [], [1, 2, 3], [2, 3], [3], [], [2], [], [1, 3], [3], [], [1], [], [1, 2], [2], [], [1], []]

sublists([], 1)

[]

Note: the resulting list of lists includes duplicates.
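If the duplicates are unwanted, one option is to drop them while preserving first-seen order. This is a small sketch; `unique_sublists` is an illustrative helper, and it converts each list to a tuple because lists are not hashable.

```python
def unique_sublists(list_of_lists):
    """Remove repeated sublists, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for sub in list_of_lists:
        key = tuple(sub)          # tuples can be stored in a set, lists cannot
        if key not in seen:
            seen.add(key)
            unique.append(sub)
    return unique


duplicated = [[2, 3, 4], [3, 4], [2, 4], [2, 3], [1, 3, 4], [3, 4]]
print(unique_sublists(duplicated))
```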

Answered 2014-04-24T18:20:38.443
0

I have an answer, but I don't think it is the best one:

oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
result = []
def sub_list(lst):
    # Record the current list, then recurse on every list with one item removed.
    if len(lst) <= 1:
        result.append(tuple(lst))
        return
    else:
        result.append(tuple(lst))
    for i in lst:
        new_lst = lst[:]
        new_lst.remove(i)
        sub_list(new_lst)
sub_list(oldlist)
newlist = set(result)    # the recursion produces the same sublist many, many times,
                         # so a set is used to drop the duplicates; this is also
                         # why tuples (which are hashable) are appended above
print(newlist)

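It does produce the result, but because it generates so many identical sublists, it needs a lot of memory and a lot of time. I don't think this is a good approach.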
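To get a feel for the cost, the number of calls the recursion makes can be counted without building any lists. This sketch mirrors the structure of `sub_list` above (one recursive call per removed element, stopping at length 1); the function name `count_calls` is made up for illustration.

```python
def count_calls(n):
    """Number of sub_list calls triggered for a list of n elements."""
    if n <= 1:
        return 1
    # One call for the list itself, plus a full recursion per removed element.
    return 1 + n * count_calls(n - 1)


# Compare the call count with the upper bound on distinct non-empty sublists.
for n in (4, 8, 12):
    print(n, count_calls(n), 2 ** n - 1)
```

The call count grows factorially, while the number of distinct non-empty sublists is at most 2**n - 1, so the set ends up discarding almost everything that was generated.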

Answered 2016-06-08T08:43:12.827