python - Access multiple elements of a list knowing their indices

Question

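I need to select some elements from a given list, knowing their indices. Say I would like to create a new list that contains the elements with indices 1, 2, 5 from the given list [-2, 1, 5, 3, 8, 5, 6]. What I did is: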

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [ a[i] for i in b]

Is there a better way to do it? Something like c = a[b]?

score 284 · Accepted Answer

You can use operator.itemgetter:

from operator import itemgetter 
a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
print(itemgetter(*b)(a))
# Result:
(1, 5, 5)

Or you can use numpy:

import numpy as np
a = np.array([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
print(list(a[b]))
# Result:
[1, 5, 5]

But really, your current solution is fine. It is probably the neatest of them all.
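One detail of the itemgetter variant is worth keeping in mind: the shape of its result depends on how many indices it receives. With a single index it returns the bare element rather than a 1-tuple, and with no indices it raises TypeError. The following is a minimal sketch of a wrapper that always returns a list; the name pick is hypothetical and not part of any library.

```python
from operator import itemgetter

def pick(seq, indices):
    """Return the elements of seq at the given indices, always as a list.

    `pick` is a hypothetical helper name used for illustration only.
    """
    indices = list(indices)
    if not indices:
        # itemgetter() with no arguments raises TypeError, so short-circuit.
        return []
    if len(indices) == 1:
        # itemgetter(i)(seq) returns the bare element, not a 1-tuple.
        return [seq[indices[0]]]
    return list(itemgetter(*indices)(seq))

a = [-2, 1, 5, 3, 8, 5, 6]
print(pick(a, [1, 2, 5]))  # [1, 5, 5]
print(pick(a, [2]))        # [5]
print(pick(a, []))         # []
```

For most code the plain list comprehension avoids these special cases entirely, which is one more reason it holds up well.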

score 61 · Accepted Answer

Alternatives:

>>> list(map(a.__getitem__, b))  # map is lazy in Python 3, so wrap it in list()
[1, 5, 5]

>>> import operator
>>> operator.itemgetter(*b)(a)
(1, 5, 5)

score 14 · Accepted Answer

Another solution could be via a pandas Series:

import pandas as pd

a = pd.Series([-2, 1, 5, 3, 8, 5, 6])
b = [1, 2, 5]
c = a[b]

You can then convert c back to a list if you want:

c = list(c)

score 9 · Accepted Answer

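A basic and not very extensive test comparing the execution time of the five supplied answers:

import operator

import numpy as np

def numpyIndexValues(a, b):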
    na = np.array(a)
    nb = np.array(b)
    out = list(na[nb])
    return out

def mapIndexValues(a, b):
    out = map(a.__getitem__, b)
    return list(out)

def getIndexValues(a, b):
    out = operator.itemgetter(*b)(a)
    return out

def pythonLoopOverlap(a, b):
    c = [ a[i] for i in b]
    return c

multipleListItemValues = lambda searchList, ind: [searchList[i] for i in ind]

Using the following input:

a = range(0, 10000000)
b = range(500, 500000)

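The simple Python loop was the quickest, with the lambda operation a close second. mapIndexValues and getIndexValues were consistently very similar to each other, and the numpy method was significantly slower once the cost of converting the lists to numpy arrays was included. If the data is already in numpy arrays, the numpyIndexValues method with the numpy.array conversion removed is the quickest.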

numpyIndexValues -> time:1.38940598 (when converted the lists to numpy arrays)
numpyIndexValues -> time:0.0193445 (using numpy array instead of python list as input, and conversion code removed)
mapIndexValues -> time:0.06477512099999999
getIndexValues -> time:0.06391049500000001
multipleListItemValues -> time:0.043773591
pythonLoopOverlap -> time:0.043021754999999995
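The timing code itself is not shown above, so here is one way such a comparison could be set up with the standard timeit module. This is a sketch under stated assumptions, not the original harness: the function names, the input sizes, and the repetition count are all illustrative choices, and the inputs are deliberately smaller than the ones quoted above so the run finishes quickly.

```python
import timeit
from operator import itemgetter

def by_comprehension(a, b):
    return [a[i] for i in b]

def by_map(a, b):
    return list(map(a.__getitem__, b))

def by_itemgetter(a, b):
    return list(itemgetter(*b)(a))

# Hypothetical inputs, much smaller than the ones used in the answer.
a = list(range(100_000))
b = list(range(500, 5_000))

candidates = [by_comprehension, by_map, by_itemgetter]
expected = by_comprehension(a, b)
timings = {}
for func in candidates:
    # All candidates must agree before their timings are worth comparing.
    assert func(a, b) == expected
    timings[func.__name__] = timeit.timeit(lambda: func(a, b), number=20)

# Print the candidates from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:20s} {seconds:.6f} s")
```

Absolute numbers depend on the machine and the Python version, so only the relative ordering on your own setup is meaningful.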

score 5 · Accepted Answer

Here is a simpler way:

a = [-2,1,5,3,8,5,6]
b = [1,2,5]
c = [e for i, e in enumerate(a) if i in b]
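Note that filtering by position is not quite the same operation as indexing. It walks a in order and keeps each matching position once, so the order of b and any repeated indices in b are lost. A small sketch of the difference follows; the variable names are illustrative, and the set is an optional speed-up for the membership test rather than part of the answer above.

```python
a = [-2, 1, 5, 3, 8, 5, 6]
b = [5, 1, 1]  # unsorted, with a repeated index

# Direct indexing follows b exactly, including order and repeats.
by_index = [a[i] for i in b]

# Position filtering follows the order of a and keeps each position once.
wanted = set(b)  # set membership is O(1) on average; `i in b` scans the list
by_position_filter = [e for i, e in enumerate(a) if i in wanted]

print(by_index)            # [5, 1, 1]
print(by_position_filter)  # [1, 5]
```

When b is sorted and free of duplicates, as in the question, both forms give the same result.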

score 3 · Accepted Answer

I'm sure this has already been considered: if the number of indices in b is small and constant, one could just write the result like:

c = [a[b[0]]] + [a[b[1]]] + [a[b[2]]]

Or even simpler, if the indices themselves are constants...

c = [a[1]] + [a[2]] + [a[5]]

Or, if there is a consecutive range of indices...

c = a[1:3] + [a[5]]

score 2 · Accepted Answer

My answer does not use numpy or Python collections.

One trivial way to find the elements would be as follows:

a = [-2, 1, 5, 3, 8, 5, 6]
b = [1, 2, 5]
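# Test each position against b; testing the values of a against b only
# gives the right answer here by coincidence.
c = [x for i, x in enumerate(a) if i in b]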

Drawback: this method may be slow for larger lists, because every `in b` test scans b. For larger lists, numpy is recommended.

score 1 · Accepted Answer

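A pythonic way:

# Caveat: a.index(x) returns the first position of x, so this is only
# reliable when a contains no duplicate values.
c = [x for x in a if a.index(x) in b]

Answered 2020-03-26T18:55:53.670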
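The a.index(x) form happens to give [1, 5, 5] for the list in the question, but it looks up the first position of each value, so it can miss elements as soon as a holds duplicates. A short sketch with a deliberately small, made-up input shows the difference from position-based selection:

```python
a = [7, 7, 9]
b = [1]

# a.index(7) is 0 for both sevens, so index 1 is never matched.
via_index_method = [x for x in a if a.index(x) in b]

# Tracking the real position with enumerate, or indexing directly, finds it.
via_enumerate = [x for i, x in enumerate(a) if i in b]
direct = [a[i] for i in b]

print(via_index_method)  # []
print(via_enumerate)     # [7]
print(direct)            # [7]
```

Each a.index call is also a linear scan, so the comprehension is quadratic in the length of a.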

score 0 · Accepted Answer

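Static indexes and small list?

Don't forget that if the list is small and the indexes don't change, as in your example, sometimes the best thing is to use sequence unpacking: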

_,a1,a2,_,_,a3,_ = a

The performance is much better, and you also save one line of code:

%timeit _,a1,b1,_,_,c1,_ = a
10000000 loops, best of 3: 154 ns per loop
%timeit itemgetter(*b)(a)
1000000 loops, best of 3: 753 ns per loop
%timeit [ a[i] for i in b]
1000000 loops, best of 3: 777 ns per loop
%timeit map(a.__getitem__, b)
1000000 loops, best of 3: 1.42 µs per loop
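Plain unpacking only works when the number of targets matches the length of the list exactly; otherwise Python raises ValueError. In Python 3 a starred target can absorb the tail, which relaxes that constraint for everything after the last index of interest. A brief sketch, with made-up variable names:

```python
a = [-2, 1, 5, 3, 8, 5, 6]

# Exact unpacking: the number of targets must equal len(a).
_, a1, a2, _, _, a3, _ = a

# Starred unpacking (Python 3): *_ absorbs whatever follows index 5,
# so the same line also works on a longer list.
_, b1, b2, _, _, b3, *_ = a + [10, 11]

print(a1, a2, a3)  # 1 5 5
print(b1, b2, b3)  # 1 5 5

try:
    _, c1, c2 = a  # too few targets for 7 elements
except ValueError as exc:
    print("ValueError:", exc)
```

The starred form builds a list for the tail, so it may not keep the speed advantage measured above.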

score 0 · Accepted Answer

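List comprehension is clearly the most immediate and the easiest to remember, apart from being quite pythonic!

In any case, among the proposed solutions it is not the fastest (I ran my test on Windows using Python 3.8.3):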

import timeit
from itertools import compress
import random
from operator import itemgetter

import numpy as np
import pandas as pd

__N_TESTS__ = 10_000

vector = [str(x) for x in range(100)]
filter_indeces = sorted(random.sample(range(100), 10))
filter_boolean = random.choices([True, False], k=100)

# Different ways of selecting elements given indices

# list comprehension
def f1(v, f):
    # use the parameter f rather than the global filter_indeces
    return [v[i] for i in f]

# itemgetter
def f2(v, f):
    return itemgetter(*f)(v)

# using pandas.Series
# this is immensely slow
def f3(v, f):
    return list(pd.Series(v)[f])

# using map and __getitem__
def f4(v, f):
    return list(map(v.__getitem__, f))

# using enumerate!
def f5(v, f):
    return [x for i, x in enumerate(v) if i in f]

# using numpy array
def f6(v, f):
    return list(np.array(v)[f])

print("{:30s}:{:f} secs".format("List comprehension", timeit.timeit(lambda:f1(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Operator.itemgetter", timeit.timeit(lambda:f2(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using Pandas series", timeit.timeit(lambda:f3(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Using map and __getitem__", timeit.timeit(lambda: f4(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Enumeration (Why anyway?)", timeit.timeit(lambda: f5(vector, filter_indeces), number=__N_TESTS__)))
print("{:30s}:{:f} secs".format("Numpy", timeit.timeit(lambda: f6(vector, filter_indeces), number=__N_TESTS__)))

My results are:

List comprehension            :0.007113 secs
Operator.itemgetter           :0.003247 secs
Using Pandas series           :2.977286 secs
Using map and __getitem__     :0.005029 secs
Enumeration (Why anyway?)     :0.135156 secs
Numpy                         :0.157018 secs
