python - How to find all occurrences of an element in a list

Question

index() returns the first occurrence of an item in a list. Is there a neat trick that returns all indices of an element in a list?

score 758 · Accepted Answer

You can use a list comprehension:

indices = [i for i, x in enumerate(my_list) if x == "whatever"]

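The iterator enumerate(my_list) yields pairs (index, item) for each item in the list. Using i, x as the loop variable target unpacks these pairs into the index i and the list item x. We keep only the x that match our criterion and select the indices i of those elements.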
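As a minimal sketch, the same idea can be wrapped in a reusable function. The name find_indices and the sample data are illustrative and not part of the answer:

```python
def find_indices(items, target):
    """Return the indices of every element of items equal to target."""
    # enumerate yields (index, item) pairs; keep the index when the item matches
    return [i for i, x in enumerate(items) if x == target]


my_list = ["a", "whatever", "b", "whatever"]
print(find_indices(my_list, "whatever"))  # [1, 3]
print(find_indices(my_list, "missing"))   # [] (no match gives an empty list)
```

Unlike index(), which raises ValueError when the item is absent, this returns an empty list.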

score 153 · Accepted Answer

While not a solution for lists directly, numpy really shines for this sort of thing:

import numpy as np
values = np.array([1,2,3,1,2,4,5,6,3,2,1])
searchval = 3
ii = np.where(values == searchval)[0]

Returns:

ii ==>array([2, 8])

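For arrays with a large number of elements, this can be much faster than some of the other solutions, provided the data is already in a numpy array.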

score 37 · Accepted Answer

A solution using list.index:

def indices(lst, element):
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset+1)
        except ValueError:
            return result
        result.append(offset)

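For large lists, it is much faster than the list comprehension with enumerate. It is also much slower than the numpy solution if you already have the array; otherwise the cost of converting outweighs the speed gain (tested on integer lists with 100, 1000 and 10000 elements).

NOTE: A caveat based on Chris_Rands' comment: this solution is faster than the list comprehension if the results are sufficiently sparse, but if the list contains many instances of the element being searched for (more than ~15% of the list, in a test with a list of 1000 integers), the list comprehension is faster.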
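To check the sparse-versus-dense trade-off on your own machine, here is a rough timing sketch using the standard timeit module. The test data, the function names and the repeat count are assumptions chosen for illustration; absolute numbers will vary by interpreter and hardware, so no timings are quoted here.

```python
import timeit


def indices_by_index(lst, element):
    """Collect matches by calling list.index repeatedly, as in the answer above."""
    result = []
    offset = -1
    while True:
        try:
            offset = lst.index(element, offset + 1)
        except ValueError:
            return result
        result.append(offset)


def indices_by_comprehension(lst, element):
    """Collect matches with enumerate and a list comprehension."""
    return [i for i, x in enumerate(lst) if x == element]


# Hypothetical test data: 1000 integers, with the target either rare or common.
sparse = [1 if i % 100 == 0 else 0 for i in range(1000)]  # 1% matches
dense = [1 if i % 2 == 0 else 0 for i in range(1000)]     # 50% matches

for name, data in (("sparse", sparse), ("dense", dense)):
    # Both approaches must agree before their timings are worth comparing.
    assert indices_by_index(data, 1) == indices_by_comprehension(data, 1)
    t_index = timeit.timeit(lambda: indices_by_index(data, 1), number=200)
    t_comp = timeit.timeit(lambda: indices_by_comprehension(data, 1), number=200)
    print(f"{name}: list.index {t_index:.4f}s, comprehension {t_comp:.4f}s")
```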

score 24 · Accepted Answer

How about:

In [1]: l=[1,2,3,4,3,2,5,6,7]

In [2]: [i for i,val in enumerate(l) if val==3]
Out[2]: [2, 4]

score 13 · Accepted Answer

more_itertools.locate finds the indices of all items that satisfy a condition.

from more_itertools import locate


list(locate([0, 1, 1, 0, 1, 0, 0]))
# [1, 2, 4]

list(locate(['a', 'b', 'c', 'b'], lambda x: x == 'b'))
# [1, 3]

more_itertools is a third-party library: > pip install more_itertools.
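If installing a third-party package is not an option, the basic behaviour shown in the two examples can be approximated with the standard library alone. This is a rough sketch that covers only the iterable-plus-predicate form used above; locate_sketch is a hypothetical name, and it does not reproduce any other options the real function may offer.

```python
def locate_sketch(iterable, pred=bool):
    """Yield the index of each item for which pred(item) is true."""
    for i, item in enumerate(iterable):
        if pred(item):
            yield i


print(list(locate_sketch([0, 1, 1, 0, 1, 0, 0])))                    # [1, 2, 4]
print(list(locate_sketch(["a", "b", "c", "b"], lambda x: x == "b")))  # [1, 3]
```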

score 10 · Accepted Answer

occurrences = lambda s, lst: (i for i,e in enumerate(lst) if e == s)
list(occurrences(1, [1,2,3,1])) # = [0, 3]
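Because occurrences returns a generator, the indices are produced lazily. A small sketch of what that allows, using itertools.islice and next from the standard library (the sample data is made up for illustration):

```python
from itertools import islice

occurrences = lambda s, lst: (i for i, e in enumerate(lst) if e == s)

data = [1, 2, 3, 1, 2, 1, 1]

# Take only the first two matching indices; the rest of the list is never scanned.
first_two = list(islice(occurrences(1, data), 2))
print(first_two)  # [0, 3]

# next() with a default gives the first match, or None when there is none.
print(next(occurrences(2, data), None))  # 1
print(next(occurrences(9, data), None))  # None
```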

score 6 · Accepted Answer

Or use range (python 3):

l=[i for i in range(len(lst)) if lst[i]=='something...']

For (python 2):

l=[i for i in xrange(len(lst)) if lst[i]=='something...']

And then (both cases):

print(l)

The output is as expected.

score 5 · Accepted Answer

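- There is an answer that uses np.where to find the indices of a single value, which is not faster than a list comprehension if the time to convert the list to an array is included.
- The overhead of importing numpy and converting a list to a numpy.array probably makes numpy the less efficient option in most circumstances. A careful timing analysis would be necessary.
  - However, if multiple functions/operations need to be performed on the list, converting the list to an array and then using numpy functions will likely be the faster option.
- This solution uses np.where and np.unique to find the indices of all unique elements in a list.
  - Using np.where on an array (including the time to convert the list to an array) is slightly faster than a list comprehension on a list for finding all indices of all unique elements.
  - This has been tested on a 2M-element list with 4 unique values; the size of the list/array and the number of unique elements will have an impact.
- Other solutions using numpy on an array can be found in "Get a list of all indices of repeated elements in a numpy array".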

import numpy as np
import random  # to create test list

# create sample list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(20)]

# convert the list to an array for use with these numpy methods
a = np.array(l)

# create a dict of each unique entry and the associated indices
idx = {v: np.where(a == v)[0].tolist() for v in np.unique(a)}

# print(idx)
{'s1': [7, 9, 10, 11, 17],
 's2': [1, 3, 6, 8, 14, 18, 19],
 's3': [0, 2, 13, 16],
 's4': [4, 5, 12, 15]}

`%timeit`

# create 2M element list
random.seed(365)
l = [random.choice(['s1', 's2', 's3', 's4']) for _ in range(2000000)]

Find the indices of one value

Find the indices of a single element in a 2M-element list with 4 unique elements

# np.where: convert list to array
%%timeit
a = np.array(l)
np.where(a == 's1')
[out]:
409 ms ± 41.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# list-comprehension: on list l
%timeit [i for i, x in enumerate(l) if x == "s1"]
[out]:
201 ms ± 24 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# filter: on list l
%timeit list(filter(lambda i: l[i]=="s1", range(len(l))))
[out]:
344 ms ± 36.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Find the indices of all values

Find the indices of all unique elements in a 2M-element list with 4 unique elements

# use np.where and np.unique: convert list to array
%%timeit
a = np.array(l)
{v: np.where(a == v)[0].tolist() for v in np.unique(a)}
[out]:
682 ms ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# list comprehension inside dict comprehension: on list l
%timeit {req_word: [idx for idx, word in enumerate(l) if word == req_word] for req_word in set(l)}
[out]:
713 ms ± 16.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

score 4 · Accepted Answer

One more solution (sorry if it is a duplicate) for all occurrences:

values = [1,2,3,1,2,4,5,6,3,2,1]
# Python 3: range replaces xrange, and map is lazy, so wrap it in list()
list(map(lambda val: (val, [i for i in range(len(values)) if values[i] == val]), values))

score 4 · Accepted Answer

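Getting all the occurrences and the positions of one or more (identical) items in a list

With enumerate(alist) you can store the first element of each pair (n), which is the index into the list, whenever the element x equals what you are looking for.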

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

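Let's make this a function, indexlist

This function takes the item and the list as arguments and returns the positions of the item in the list, as we saw before.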

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output

[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

score 4 · Accepted Answer

Using filter() in python2.

>>> q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol', 'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']
>>> filter(lambda i: q[i]=="Googol", range(len(q)))
[2, 4, 9]
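In Python 3 the same call returns a lazy filter object instead of a list, so the result has to be materialised. A minimal sketch using the same data (matches is an illustrative name):

```python
q = ['Yeehaw', 'Yeehaw', 'Googol', 'B9', 'Googol', 'NSM', 'B9', 'NSM', 'Dont Ask', 'Googol']

# In Python 3, filter() returns a lazy iterator rather than a list,
# so wrap it in list() to get the indices.
matches = list(filter(lambda i: q[i] == "Googol", range(len(q))))
print(matches)  # [2, 4, 9]
```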

score 3 · Accepted Answer

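Using a `for-loop`:

- Answers with enumerate and a list comprehension are more pythonic, but not necessarily faster. However, this answer is aimed at students who may not be allowed to use some of those built-in functions.
- Create an empty list, indices.
- Create the loop with for i in range(len(x)):, which essentially iterates through a list of index positions [0, 1, 2, 3, ..., len(x)-1].
- In the loop, add any i where x[i] matches value to indices.
  - x[i] accesses the list by index.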

def get_indices(x: list, value: int) -> list:
    indices = list()
    for i in range(len(x)):
        if x[i] == value:
            indices.append(i)
    return indices

n = [1, 2, 3, -50, -60, 0, 6, 9, -60, -60]
print(get_indices(n, -60))

>>> [4, 8, 9]

The function, get_indices, is implemented with type hints. In this case, the list, n, is a bunch of ints, so we search for value, which is also defined as an int.
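The same loop works for any element type, not only ints. As a sketch of how the hints could be generalised (get_indices_generic and T are hypothetical names introduced here, not part of the answer):

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def get_indices_generic(x: Sequence[T], value: T) -> List[int]:
    """Return every index i where x[i] == value, for any element type."""
    indices: List[int] = []
    for i in range(len(x)):
        if x[i] == value:
            indices.append(i)
    return indices


print(get_indices_generic([1, 2, 3, -50, -60, 0, 6, 9, -60, -60], -60))  # [4, 8, 9]
print(get_indices_generic(["a", "b", "a"], "a"))                         # [0, 2]
```

Type hints are not enforced at runtime, so this changes only what a type checker reports, not how the function behaves.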

Using a `while-loop` and `.index`:

With .index, use try-except for error handling, because a ValueError is raised if value is not in the list.

def get_indices(x: list, value: int) -> list:
    indices = list()
    i = 0
    while True:
        try:
            # find an occurrence of value and update i to that index
            i = x.index(value, i)
            # add i to the list
            indices.append(i)
            # advance i by 1
            i += 1
        except ValueError as e:
            break
    return indices

print(get_indices(n, -60))
>>> [4, 8, 9]

score 2 · Accepted Answer

You can create a defaultdict

from collections import defaultdict
lst1 = [1, 2, 2, 3, 4, 1, 2, 7]
d1 = defaultdict(int)      # defaults to 0 values for keys
unq = set(lst1)
for each in unq:
    d1[each] = lst1.count(each)
print(d1)                  # maps each value to its count, not to its indices
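The snippet above counts occurrences rather than recording where they are. If the indices are what you need, a defaultdict(list) can collect them in a single pass over the list. A minimal sketch, with positions as an illustrative name:

```python
from collections import defaultdict

lst1 = [1, 2, 2, 3, 4, 1, 2, 7]

# Map each value to the list of positions where it occurs, in a single pass.
positions = defaultdict(list)
for i, value in enumerate(lst1):
    positions[value].append(i)

print(dict(positions))
# {1: [0, 5], 2: [1, 2, 6], 3: [3], 4: [4], 7: [7]}
```

This scans the list once, whereas calling count() or a comprehension per unique value rescans it for every key.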

score 2 · Accepted Answer

A dynamic, list-comprehension-based solution in case we do not know in advance which element we are looking for:

lst = ['to', 'be', 'or', 'not', 'to', 'be']
{req_word: [idx for idx, word in enumerate(lst) if word == req_word] for req_word in set(lst)}

This results in:

{'be': [1, 5], 'or': [2], 'to': [0, 4], 'not': [3]}

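You can think of other ways along the same lines, but with index() you can find only one index at a time, although you can choose which occurrence to look for yourself.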
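One way to read "choose which occurrence yourself" is to call index() repeatedly, each time starting just past the previous match. A sketch under that interpretation; nth_index and its -1 return value for a missing occurrence are assumptions made for illustration:

```python
def nth_index(lst, value, n):
    """Return the index of the n-th occurrence (1-based) of value, or -1 if absent."""
    pos = -1
    for _ in range(n):
        try:
            # Resume the search just past the previous match.
            pos = lst.index(value, pos + 1)
        except ValueError:
            return -1
    return pos


lst = ['to', 'be', 'or', 'not', 'to', 'be']
print(nth_index(lst, 'to', 1))  # 0
print(nth_index(lst, 'to', 2))  # 4
print(nth_index(lst, 'to', 3))  # -1
```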

score 1 · Accepted Answer

If you are using Python 2, you can achieve the same functionality with this:

f = lambda my_list, value:filter(lambda x: my_list[x] == value, range(len(my_list)))

Here my_list is the list whose indices you want, and value is the value being searched for. Usage:

f(some_list, some_element)

score 1 · Accepted Answer

If you need to find all positions of an element between certain indices, you can state the bounds in the condition:

# use "and", not "&": the bitwise operator binds tighter than the comparisons
[i for i, x in enumerate([1,2,3,2]) if x == 2 and 2 <= i <= 3] # -> [3]
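An alternative sketch that avoids testing every index: slice the window first and let enumerate start counting at the lower bound. The name indices_between is hypothetical, and the code assumes lo and hi are non-negative, inclusive bounds.

```python
def indices_between(lst, value, lo, hi):
    """Return indices i with lo <= i <= hi where lst[i] == value."""
    # Slice the window, and start enumerate at lo so indices refer to the full list.
    return [i for i, x in enumerate(lst[lo:hi + 1], start=lo) if x == value]


print(indices_between([1, 2, 3, 2], 2, 2, 3))  # [3]
print(indices_between([1, 2, 3, 2], 2, 0, 3))  # [1, 3]
```

The slice copies the window, which is usually cheap compared with scanning a long list outside the range of interest.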

score -1 · Accepted Answer

Here is a time performance comparison between using np.where and a list comprehension. It seems that np.where is faster on average.

import time
import numpy as np

# np.where
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = np.where(temp_list==3)[0].tolist()
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 3.81469726562e-06 seconds

# list_comprehension
start_times = []
end_times = []
for i in range(10000):
    start = time.time()
    start_times.append(start)
    temp_list = np.array([1,2,3,3,5])
    ixs = [i for i in range(len(temp_list)) if temp_list[i]==3]
    end = time.time()
    end_times.append(end)
print("Took on average {} seconds".format(
    np.mean(end_times)-np.mean(start_times)))

Took on average 4.05311584473e-06 seconds
