python - Compare and filter list elements in Python

Question

I am looking to filter the elements of a list.

Say I have these lists:

listA = ['banana', 'apple', 'appleRed', 'melon_01', 'appleGreen', 'Orange', 'melon_03']
listB = ['apple', 'melon']

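Now I need to compare the lists and build a list containing only the elements of listA whose names start with one of the items in listB.

The result should be: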

listResult = ['apple', 'appleRed', 'melon_01', 'appleGreen', 'melon_03']

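I can do this with two nested for loops and an if comparison, like this:

listResult = []
for item in listA:
    for fruit in listB:
        if item.startswith(fruit):
            listResult.append(item)
            break  # stop at the first matching prefix so the item is added only once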

However, I would like to know whether there is a shortcut for this, since the nested loops may take a long time on large lists.

score 6 · Accepted Answer

Use a list comprehension with an any() generator expression:

[item for item in listA if any(item.startswith(fruit) for fruit in listB)]

Or, as @DSM rightly suggested, pass a tuple of prefixes to startswith:

[item for item in listA if item.startswith(tuple(listB))]
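As a minimal runnable sketch of the two comprehensions above, using the lists from the question. The names prefixes, with_any and with_tuple are introduced here only for illustration. Note that str.startswith accepts a tuple of prefixes but not a list, which is why the conversion is needed:

```python
listA = ['banana', 'apple', 'appleRed', 'melon_01', 'appleGreen', 'Orange', 'melon_03']
listB = ['apple', 'melon']

# str.startswith takes a single string or a tuple of strings, so convert once
# up front instead of rebuilding the tuple for every item.
prefixes = tuple(listB)

# Variant 1: test each prefix in turn; any() stops at the first match.
with_any = [item for item in listA if any(item.startswith(fruit) for fruit in listB)]

# Variant 2: let startswith check all prefixes in a single call.
with_tuple = [item for item in listA if item.startswith(prefixes)]

print(with_tuple)  # ['apple', 'appleRed', 'melon_01', 'appleGreen', 'melon_03']
```

Both variants keep the original order of listA and are case-sensitive, so 'Orange' would not match a prefix 'orange'.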

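This is considerably faster than the first solution, and almost as fast as the regex solution proposed by @Iguananaut (while being more compact and readable):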

In [1]: %timeit [item for item in listA if any(item.startswith(fruit) for fruit in listB)]
100000 loops, best of 3: 4.31 us per loop

In [2]: %timeit [item for item in listA if item.startswith(tuple(listB))]
1000000 loops, best of 3: 1.56 us per loop

In [3]: %timeit filter(regex.match, listA)
1000000 loops, best of 3: 1.39 us per loop

score 2 · Accepted Answer

If listB has relatively few items, you can convert it into a regular expression fairly efficiently:

import re
regex = re.compile(r'^(?:%s)' % '|'.join(listB))
filter(regex.match, listA)

This was the first thing that came to mind, but I expect others will have other ideas.
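The snippet above appears to have been written for Python 2, where filter returns a list. A hedged sketch of the same idea for Python 3 follows. Escaping the prefixes with re.escape is an addition not in the original answer; it only matters if a prefix contains regex metacharacters such as '.' or '+':

```python
import re

listA = ['banana', 'apple', 'appleRed', 'melon_01', 'appleGreen', 'Orange', 'melon_03']
listB = ['apple', 'melon']

# Escape each prefix so it is matched literally, then join them as alternatives.
pattern = '|'.join(re.escape(prefix) for prefix in listB)
regex = re.compile(pattern)

# regex.match only matches at the start of the string, so no '^' anchor is needed.
# In Python 3, filter() returns a lazy iterator, so wrap it in list().
listResult = list(filter(regex.match, listA))

print(listResult)  # ['apple', 'appleRed', 'melon_01', 'appleGreen', 'melon_03']
```

One difference to be aware of: if listB is empty, the pattern is the empty string and matches every item, whereas startswith with an empty tuple matches nothing.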

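Note that the other answers using list comprehensions are of course perfectly good and reasonable. I assumed you wanted to know whether there was a way to make it a bit faster. Again, in the general case this solution may not always be faster, but in this case it is slightly faster: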

In [9]: %timeit [item for item in listA if any(item.startswith(fruit) for fruit in listB)]
100000 loops, best of 3: 8.17 us per loop

In [10]: %timeit filter(regex.match, listA)
100000 loops, best of 3: 2.62 us per loop
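Timings like these depend on the machine, the Python version and the size of the lists, so it is worth measuring on your own data. A rough sketch using the standard timeit module instead of IPython's %timeit is shown below. The function names and the repeat counts are arbitrary choices made for illustration:

```python
import re
import timeit

listA = ['banana', 'apple', 'appleRed', 'melon_01', 'appleGreen', 'Orange', 'melon_03']
listB = ['apple', 'melon']

prefixes = tuple(listB)
regex = re.compile('|'.join(re.escape(prefix) for prefix in listB))

def with_any():
    return [item for item in listA if any(item.startswith(fruit) for fruit in listB)]

def with_tuple():
    return [item for item in listA if item.startswith(prefixes)]

def with_regex():
    # list() forces evaluation; in Python 3 filter() alone would do no work.
    return list(filter(regex.match, listA))

for func in (with_any, with_tuple, with_regex):
    # Take the best of 3 runs of 10000 calls each, reported per call in microseconds.
    best = min(timeit.repeat(func, number=10000, repeat=3))
    print('%-10s %.2f us per loop' % (func.__name__, best / 10000 * 1e6))
```

The absolute numbers will differ from those quoted in the answers, and the ranking may change for much longer lists or many more prefixes.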

score 1 · Accepted Answer


listResult = [i for i in listA if any(i.startswith(j) for j in listB)]

answered 2012-11-27T15:05:26.683
