5

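I want to return the "reverse" indices of a sorted list. What I mean is: I have an unsorted list U and I sort it with S=sorted(U). Now, I can get the sort indices such that U(idx)=S, but I want S(Ridx) = U.

Here is a small example: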

U=[5,2,3,1,4]

S=sorted(U)

idx = [U.index(S[i]) for i in range(len(U))]
>>> idx
[3, 1, 2, 4, 0]

Ridx = [S.index(U[i]) for i in range(len(U))]
>>> Ridx
[4, 1, 2, 0, 3]

>>> [U[idx[i]] for i in range(len(U))] == S
True

>>> [S[Ridx[i]] for i in range(len(U))] == U
True

What I need is an efficient way to get Ridx.

Thanks!


Edit:

OK! I ran a small speed test on the two solutions that answered this question (from @Jon Clements and @Whatang).

The script:

import datetime as DT
import random

U=[int(1000*random.random()) for i in xrange(pow(10,8))]

S=sorted(U)

idx = sorted(xrange(len(U)), key=U.__getitem__)

T0 = DT.datetime.now()
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)

print '==============='
T0 = DT.datetime.now()
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)

The results:

True
0:02:45.278000
===============
True
0:06:48.889000
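For anyone repeating this comparison on Python 3, a scaled-down sketch using `timeit` might look like the following. The list size `n`, the helper names `ridx_double_sort` and `ridx_zip_sort`, and the use of `timeit` are assumptions made for illustration; they are not part of the original test, and no timings are claimed here.

```python
import random
import timeit

def ridx_double_sort(idx):
    # Argsort of idx: sort the positions by the value stored in idx.
    return sorted(range(len(idx)), key=idx.__getitem__)

def ridx_zip_sort(idx):
    # Pair each idx entry with its position, sort the pairs, keep positions.
    return [y for (x, y) in sorted(zip(idx, range(len(idx))))]

n = 10**5  # assumed size, far smaller than the 10**8 used above
random.seed(0)
U = [int(1000 * random.random()) for _ in range(n)]
S = sorted(U)
idx = sorted(range(n), key=U.__getitem__)

for fn in (ridx_double_sort, ridx_zip_sort):
    ridx = fn(idx)
    ok = [S[r] for r in ridx] == U          # same check as the original script
    seconds = timeit.timeit(lambda: fn(idx), number=3)
    print(fn.__name__, ok, seconds)
```

Unlike the script above, this times only the construction of `ridx`, not the verification step.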

Thanks to everyone for the quick and meaningful help!

4

5 Answers

5

The most efficient way I can think of (short of possibly looking to numpy) gets rid of the .index calls and can be used for both idx and ridx:

U=[5,2,3,1,4]
idx = sorted(xrange(len(U)), key=U.__getitem__)
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
# [3, 1, 2, 4, 0] [4, 1, 2, 0, 3]
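On Python 3, `xrange` no longer exists and `range` takes its place. A minimal sketch of the same double sort, with a check that the two index lists relate U and S as intended:

```python
U = [5, 2, 3, 1, 4]
S = sorted(U)

# idx[k] is the position in U of the k-th smallest element.
idx = sorted(range(len(U)), key=U.__getitem__)
# ridx[i] is the position in S of U[i]; it is the argsort of idx.
ridx = sorted(range(len(U)), key=idx.__getitem__)

print(idx)   # [3, 1, 2, 4, 0]
print(ridx)  # [4, 1, 2, 0, 3]
assert [U[k] for k in idx] == S
assert [S[k] for k in ridx] == U
```

Because `sorted` is stable, `idx` is still a valid permutation when U contains repeated values, which the `.index` version does not guarantee.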
answered 2013-08-20T23:18:47.387
2

With numpy you can do

>>> import numpy as np
>>> U = [5, 2, 3, 1, 4]

>>> np.array(U).argsort().argsort()
array([4, 1, 2, 0, 3])
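The second `argsort` is itself a sort. If that cost matters, the inverse permutation can instead be filled in with a single indexed assignment. A sketch, assuming numpy is installed; the variable names are illustrative:

```python
import numpy as np

U = np.array([5, 2, 3, 1, 4])
idx = U.argsort()                 # positions in U, in sorted order

# Invert the permutation: element idx[k] of U lands at position k of S.
ridx = np.empty_like(idx)
ridx[idx] = np.arange(len(idx))

S = U[idx]
print(ridx)                       # [4 1 2 0 3]
assert np.array_equal(ridx, idx.argsort())
assert np.array_equal(S[ridx], U)
```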
answered 2013-08-20T23:40:36.613
2

Not exactly the data structure you asked for, but I think this gets the information you want:

>>> sorted(x[::-1] for x in enumerate(['z', 'a', 'c', 'x', 'm']))
[('a', 1), ('c', 2), ('m', 4), ('x', 3), ('z', 0)]
answered 2013-08-20T22:15:53.323
1

Assuming you already have the list idx, you can do

ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]

Then for all i from 0 to len(U) - 1

S[ridx[i]] == U[i]

You can avoid the sort if you use a dictionary:

ridx_dict = dict(zip(idx, range(len(idx))))

which can then be converted to a list:

ridx = [ ridx_dict[k] for k in range(len(idx)) ]
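The dictionary is not strictly needed either: since `idx` holds each of 0..n-1 exactly once, the inverse can be written straight into a preallocated list in one pass. A minimal sketch (the helper name `invert_permutation` is made up for this example):

```python
def invert_permutation(idx):
    """Return ridx such that ridx[idx[k]] == k for every k."""
    ridx = [0] * len(idx)
    for k, source in enumerate(idx):
        # The element taken from position `source` of U ends up at position k of S.
        ridx[source] = k
    return ridx

U = [5, 2, 3, 1, 4]
S = sorted(U)
idx = sorted(range(len(U)), key=U.__getitem__)
ridx = invert_permutation(idx)
print(ridx)  # [4, 1, 2, 0, 3]
assert all(S[ridx[i]] == U[i] for i in range(len(U)))
```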

Thinking about permutations is the key to this problem. One way of writing down a permutation is to write all the indexes in order on one line, then on the line below write the new index of the element with that index. e.g., for your example

0 1 2 3 4
3 1 2 4 0

This second line is your idx list. You read down the columns: the element at index 0 of S is found at index 3 of U, the element at index 1 is at index 1 in both lists, and so on.

The inverse permutation is the ridx you're looking for. To find this, sort the lower line of your permutation keeping columns together, then write down the new top line. So the example becomes:

4 1 2 0 3
0 1 2 3 4
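One way to check the two-line picture is to compose the permutations: following idx and then ridx, or the reverse, should bring every index back to where it started. A small sketch using the lists from the example (the name `columns` is illustrative):

```python
idx = [3, 1, 2, 4, 0]

# Sort the columns (top, bottom) by the bottom line, then read off the top.
columns = sorted(zip(range(len(idx)), idx), key=lambda col: col[1])
ridx = [top for top, bottom in columns]
print(ridx)  # [4, 1, 2, 0, 3]

# Composing a permutation with its inverse gives the identity.
identity = list(range(len(idx)))
assert [idx[r] for r in ridx] == identity
assert [ridx[i] for i in idx] == identity
```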
answered 2013-08-20T23:28:00.043
0

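If I understand the question correctly (which I didn't), I think U.index(S[i]) is what you are looking for.

EDIT: so I guess you could save a dictionary of the original indices and keep the retrieval syntax pretty terse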

OIDX = {U[i]: i for i in range(0, len(U))}
S = sorted(U)
OIDX[S[i]]
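One caveat with a plain dictionary is that repeated values in U overwrite each other, so only the last index of each value survives. A sketch of one way around that, assuming the values are hashable, is to keep a queue of indices per value; the names `positions` and `idx` are illustrative:

```python
from collections import defaultdict, deque

U = [5, 2, 3, 2, 4]          # note the repeated 2
S = sorted(U)

# Map each value to the queue of indices where it occurs in U.
positions = defaultdict(deque)
for i, value in enumerate(U):
    positions[value].append(i)

# Consume indices in order so every position in U is used exactly once.
idx = [positions[value].popleft() for value in S]
print(idx)  # [1, 3, 2, 4, 0]
assert [U[k] for k in idx] == S
assert sorted(idx) == list(range(len(U)))
```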
answered 2013-08-20T22:07:54.580