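I'm iterating through a for loop, looking for keyword matches in a list, and compiling the match indices into a third list. I can compile the indices as a list of lists, but I'd like to further group the sublists by the item they matched in.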

import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']

indices=[]
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        for m in re.finditer(pat, i):
            a =list((m.start(),m.end()))
            indices.append(a)
print(indices)

This returns:

[[0, 2], [0, 2], [1, 3]] 

I'm trying to get:

[[0, 2], [[0, 2], [1, 3]]]

so that it's obvious that:

[[0, 2], [1, 3]]

are the index matches on 'cde' in the example above.

2 Answers

Make indices a dictionary:

import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']

indices = {}
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        indices.setdefault(i, [])
        for m in re.finditer(pat, i):
            a = list((m.start(),m.end()))
            indices[i].append(a)
print(indices)

This gives:

{'cde': [[0, 2], [1, 3]], 'ab': [[0, 2]]}

Is this what you're looking for?
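As a variation on the dictionary approach, here is a minimal sketch that swaps `setdefault` for `collections.defaultdict`. The name `s` for the current string is my own choice, and note one behavioural difference: a string with no matches never gets a key here, whereas the `setdefault` version gives it an empty list.

```python
import re
from collections import defaultdict

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
pats = [re.compile(k) for k in keywords]

# Map each string to the [start, end] spans of every keyword match in it.
# defaultdict(list) creates the empty list on first access to a key.
indices = defaultdict(list)
for pat in pats:
    for s in my_list:
        for m in pat.finditer(s):
            indices[s].append([m.start(), m.end()])

print(dict(indices))  # {'ab': [[0, 2]], 'cde': [[0, 2], [1, 3]]}
```

Because the strings themselves are the keys, duplicate entries in `my_list` would share one list of spans; keying by position (for example with `enumerate`) would avoid that if it matters.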

I played with this code for a while, and since you imported itertools, you might as well use it to get rid of those ugly nested fors ;) Like this:

import re
from itertools import product

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']

indices = {}
pats = [re.compile(i) for i in keywords]

for i, pat in product(my_list, pats):
    indices.setdefault(i, [])
    for m in re.finditer(pat, i):
        indices[i].append((m.start(), m.end()))

print(indices)

Unfortunately, I couldn't get Bakuriu's list comprehension idea to work properly, so for now this seems like the best solution to me.
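For what it's worth, a comprehension can build the same dictionary in one expression. This is only a sketch of one way to do it, not necessarily what Bakuriu had in mind; the variable name `s` is my own.

```python
import re

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
pats = [re.compile(k) for k in keywords]

# Dict comprehension: one key per string; the inner list comprehension
# runs every pattern against that string and collects the (start, end) spans.
indices = {
    s: [(m.start(), m.end()) for pat in pats for m in pat.finditer(s)]
    for s in my_list
}

print(indices)  # {'ab': [(0, 2)], 'cde': [(0, 2), (1, 3)]}
```

Unlike the loop versions, this iterates over the strings on the outside, so each key's spans are ordered by keyword and every string gets a key even when nothing matches.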

answered 2013-03-27T09:27:45.807

Create a list for each pattern/string pair, accumulate that pair's matches in it, and append it to the result at the end:

import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']

indices=[]
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        sublist = []
        for m in re.finditer(pat, i):
            a =list((m.start(),m.end()))
            sublist.append(a)
        indices.append(sublist)
print(indices)

Or you can use a list comprehension:

import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']

indices=[]
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        sublist = [(m.start(), m.end()) for m in re.finditer(pat, i)]
        indices.append(sublist)
print(indices)
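
Note that with the pattern loop on the outside, the code above appends one sublist per pattern/string pair, including empty ones, which gives `[[(0, 2)], [], [], [(0, 2)], [], [(1, 3)]]`. If the goal is one sublist per string in my_list, a possible adjustment (a sketch, assuming a uniformly nested result is acceptable) is to swap the loops and build the sublist once per string:

```python
import re

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
pats = [re.compile(k) for k in keywords]

indices = []
for s in my_list:          # outer loop: one sublist per string
    sublist = []
    for pat in pats:       # inner loop: every keyword against that string
        sublist.extend([m.start(), m.end()] for m in pat.finditer(s))
    indices.append(sublist)

print(indices)  # [[[0, 2]], [[0, 2], [1, 3]]]
```

This differs slightly from the output requested in the question, `[[0, 2], [[0, 2], [1, 3]]]`, in that the single match on 'ab' is also wrapped in its own sublist, which keeps the nesting depth the same for every string.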
answered 2013-03-27T09:33:11.667