0

我有以下序列(代表一棵树):

4 2
1 4
3 4
5 4
2 7
0 7
6 0

现在,我正在尝试对这个序列进行排序,这样当一个值出现在左侧(第 1 列)时,它已经出现在右侧(第 2 列)。更具体地说,排序算法的结果应该是:

1 4
3 4
5 4
4 2
2 7
6 0
0 7

显然,这在 O(n^2) 中有效,算法迭代第 1 列的每个条目,然后在第 2 列中查找相应的条目。但由于在我的场景中 n 可能非常大(> 100000),我正在寻找一种 O(n log n) 的方法来做到这一点。这甚至可能吗?

4

1 回答 1

2

假设:

我假设这也是一个有效的排序序列:

1 4
4 2
3 4
5 4
2 7
6 0
0 7

即一旦一个值在右边出现一次,它就可以出现在左边。

如果不是这种情况(即右侧的所有出现都必须在左侧的任何出现之前),请忽略“删除指向该元素的所有边”部分,并且仅在中间元素没有传入边时删除它。

算法:

如果 A 的右元素等于 B 的左元素,则构建一个图,其中每个元素 A 指向另一个元素 B。这可以使用哈希多重映射来完成:

  • 遍历元素,将每个元素A作为A.left -> A.
  • 再次遍历元素,将每个元素B与出现在下面的所有元素连接起来B.right

Perform a topological sort of the graph, giving you your result. I should be modified such that, instead of removing an edge pointing to an element, we remove all edges pointing to that element (i.e. if we already found an element containing some element on the right, we don't need to find another for that element to appear on the left).

Currently this is O(n2) running time, because there are too many edges - if we have:

(1,2),(1,2),...,(1,2),(2,3),(2,3),...,(2,3)

There are O(n2) edges.

This can be avoided by, instead of having elements point directly to each other, create an intermediate element. In the above case, 1/2 the elements will point to that element and that element will point to the other half. Then, when doing the topological sort, when we would've remove an edge to that element, we instead remove that element and all edges pointing from / to it.

Now there will be a maximum of O(n) edges, and, since topological sort can be done in linear time with respect to the elements and edges, the overall running time is O(n).

Note that it's not always possible to get a result: (1,2), (2,1).

Illustrations:

For your example (pre-optimization), we'd have:

For my example above, we'd have:

于 2013-10-10T09:01:44.387 回答