algorithm - Mapping sorted indexes

Question

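I have encountered and solved this problem as part of a larger algorithm, but my solution seems inelegant and I would appreciate any insights.

I have a list of pairs which can be viewed as points on a Cartesian plane. I need to generate three lists: the sorted x values, the sorted y values, and a list that maps each index in the sorted x values to the index in the sorted y values of the y value it was originally paired with.

A concrete example might help explain. Given the following list of points: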

((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

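The sorted list of x values would be (3, 4, 5, 7, 9, 15), and the sorted list of y values would be (0, 4, 7, 7, 11, 12).

Assuming a zero-based indexing scheme, the list that maps each x-list index to the index of its paired y value in the y list would be (2, 3, 0, 4, 5, 1).

For example, the value 7 appears at index 3 in the x list. The value at index 3 in the mapping list is 4, and the element at index 4 in the y list is 11, which corresponds to the original pairing (7, 11).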
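The property described above can be checked mechanically. The following is a minimal sketch, not part of the original question: the helper name `check_mapping` is hypothetical, and it assumes the x values are distinct so that each sorted x identifies exactly one original pair.

```python
points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
xs = (3, 4, 5, 7, 9, 15)
ys = (0, 4, 7, 7, 11, 12)
mapping = (2, 3, 0, 4, 5, 1)


def check_mapping(points, xs, ys, mapping):
    """Return True if mapping[i] locates, in ys, the y paired with xs[i]."""
    # Assumption: x values are distinct, so a dict recovers the paired y.
    y_of = dict(points)
    # The mapping must use every index of ys exactly once.
    if sorted(mapping) != list(range(len(ys))):
        return False
    return all(ys[mapping[i]] == y_of[x] for i, x in enumerate(xs))


print(check_mapping(points, xs, ys, mapping))  # True
```

Because the y value 7 appears twice, more than one mapping can satisfy this check; the example mapping is simply one of them.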

What is the simplest way of generating this mapping list?

score 3 · Accepted Answer

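Here is a simple O(n log n) method:

1. Sort the pairs by their x value: ((3, 7), (4, 7), (5, 0), (7, 11), (9, 12), (15, 4))
2. Produce a list of pairs in which the first component is the y value from the same position in the previous list and the second component counts up from 0: ((7, 0), (7, 1), (0, 2), (11, 3), (12, 4), (4, 5))
3. Sort this list by its first component (the y value): ((0, 2), (4, 5), (7, 0), (7, 1), (11, 3), (12, 4))
4. Iterate through this list. For the i-th such pair (y, k), set yFor[k] = i. yFor[] is the list (well, array) that maps indices in the sorted x list to indices in the sorted y list.
5. Create the sorted x list by removing the second element from each pair in the list produced in step 1.
6. Create the sorted y list by doing the same with the list produced in step 3.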
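The steps above can be sketched in Python as follows. This is an illustrative reading of the answer, not code from it: the function name `sorted_index_mapping` and the variable names are assumptions, with `y_for` standing in for the answer's yFor[].

```python
def sorted_index_mapping(points):
    """Return (xs, ys, y_for) following the steps described above."""
    # Step 1: sort the pairs by x value.
    by_x = sorted(points, key=lambda p: p[0])
    # Step 2: pair each y with its position in the x-sorted list.
    tagged = [(y, k) for k, (_, y) in enumerate(by_x)]
    # Step 3: sort by y (equal y values fall back to k, the x-order position).
    tagged.sort()
    # Step 4: the i-th pair (y, k) says that x-index k maps to y-index i.
    y_for = [0] * len(tagged)
    for i, (_, k) in enumerate(tagged):
        y_for[k] = i
    # Steps 5 and 6: strip the extra component from each list.
    xs = [x for x, _ in by_x]
    ys = [y for y, _ in tagged]
    return xs, ys, y_for


points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))
xs, ys, y_for = sorted_index_mapping(points)
print(xs)     # [3, 4, 5, 7, 9, 15]
print(ys)     # [0, 4, 7, 7, 11, 12]
print(y_for)  # [2, 3, 0, 4, 5, 1]
```

Only two sorts are needed, and the single pass in step 4 is linear, which is where the O(n log n) bound comes from.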

score 1 · Accepted Answer

I propose the following. Generate the unsorted x and y lists.

xs = [3, 15,  7, 5, 4, 9 ]
ys = [7,  4, 11, 0, 7, 12]

Turn each element into a tuple: the first item of the pair is the coordinate, the second is the original index.

xs = [(3, 0), (15, 1), ( 7, 2), (5, 3), (4, 4), ( 9, 5)]
ys = [(7, 0), ( 4, 1), (11, 2), (0, 3), (7, 4), (12, 5)]

Sort both lists.

xs = [(3, 0), (4, 4), (5, 3), (7, 2), ( 9, 5), (15, 1)]
ys = [(0, 3), (4, 1), (7, 0), (7, 4), (11, 2), (12, 5)]

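Create an array, y_positions. The nth element of the array contains the current index of the y element that was originally at index n.

Create an empty index_list. For each element of xs, take original_index, the second item of the tuple. Use y_positions to look up the current index of the y element with that original_index, and append that current index to index_list.

Finally, remove the index values from xs and ys.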

Here is a sample Python implementation.

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

# generate unsorted lists
xs, ys = zip(*points)

# pair each element with its index
xs = list(zip(xs, range(len(xs))))
ys = list(zip(ys, range(len(ys))))

# sort
xs.sort()
ys.sort()

# generate the y positions list:
# y_positions[n] is the sorted position of the y that started at index n
y_positions = [None] * len(ys)
for i, (y, original_index) in enumerate(ys):
    y_positions[original_index] = i

# generate `index_list`
index_list = []
for x, original_index in xs:
    index_list.append(y_positions[original_index])

# remove the index tags from the x and y lists
xs = tuple(x for x, original_index in xs)
ys = tuple(y for y, original_index in ys)

print("xs:", xs)
print("ys:", ys)
print("index list:", index_list)

Output:

xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index list: [2, 3, 0, 4, 5, 1]

Generating y_positions and index_list takes O(n) time, so the overall complexity of the algorithm is dominated by the sorting step.

score 1 · Accepted Answer

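Thanks for the answers. For what it's worth, the solution I had was very similar to those outlined, but as j_random_hacker pointed out, there is no need for a map. It struck me that this little problem seems more complicated than it first appears, and I was wondering whether I was missing something obvious. I have rehashed my solution into Python for comparison.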

points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

N = len(points)

# Separate the points into their x and y components, tag the values with
# their index into the points list.

# Sort both resulting (value, tag) lists and then unzip them into lists of
# sorted x and y values and the tag information.

xs, s = zip(*sorted(zip([x for (x, y) in points], range(N))))
ys, r = zip(*sorted(zip([y for (x, y) in points], range(N))))

# Generate the mapping list.

t = N * [0]

for i in range(N):
    t[r[i]] = i

index_list = [t[j] for j in s]

print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)

Output:

xs: (3, 4, 5, 7, 9, 15)
ys: (0, 4, 7, 7, 11, 12)
index_list: [2, 3, 0, 4, 5, 1]

score 1 · Accepted Answer

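I have just understood what j_random_hacker meant about removing a level of indirection by sorting the points by x initially. That allows things to be tidied up nicely. Thanks.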

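points = ((3, 7), (15, 4), (7, 11), (5, 0), (4, 7), (9, 12))

N = len(points)

# Sort the points by x, then tag each y with its position in that order
# and sort the (y, position) pairs by y.
ordered_by_x = sorted(points)
ordered_by_y = sorted(zip([y for (x, y) in ordered_by_x], range(N)))

index_list = N * [0]

# The i-th pair (y, k) means sorted-x index k maps to sorted-y index i.
for i, (y, k) in enumerate(ordered_by_y):
    index_list[k] = i

# Strip the extra components to get the sorted x and y lists.
xs = [x for (x, y) in ordered_by_x]
ys = [y for (y, k) in ordered_by_y]

print("xs:", xs)
print("ys:", ys)
print("index_list:", index_list)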
