algorithm - Merging multiple arbitrarily sorted lists into one list

Question

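Given 3 lists that are all sorted by the same, but unknown, sort order: is there an algorithm that merges these lists into one that is still sorted by that same order?

Example:

List 1: a b c f h

List 2: b c e h

List 3: c d e f

Assume these lists are sorted, but the sort order used is unknown. I want to combine them into one result that contains no duplicates but still keeps the sort order: a b c d e f h

As stated above: the given lists are known to be sorted, but it is not known by which order, and the requirement is that the merged list is still sorted by that same (unknown) order.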

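In the example above, I know that the element "f" lies between "e" and "h", because from List 1 I know that

"c" < "f" < "h",

from List 2 I know that

"c" < "e" < "h"

and from List 3 I know that

"e" < "f" and "c" < "e"

which combines to:

"c" < "e" < "f" < "h"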

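If the sort order of an element cannot be determined from any of the given lists, it is permissible to just append that element to the end of the result list. Likewise, if the sort order of a sequence of elements cannot be determined, they may be inserted in any order as long as they end up in the right place (e.g. if I know that "b" and "c" must be inserted between "a" and "d", but I don't know whether it should be a b c d or a c b d, then both are permissible).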
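To make the acceptance rule above concrete, here is a small sketch of a checker, written in Python only because any language is acceptable here. The function name `is_valid_merge` is my own, and it assumes a result is acceptable when it has no duplicates, contains every input item, and keeps the relative order of each input list. It does not check for extra items that appear in no input list.

```python
def is_valid_merge(result, lists):
    """Return True if `result` has no duplicates, contains every input
    item, and keeps the relative order of each input list."""
    position = {item: i for i, item in enumerate(result)}
    if len(position) != len(result):
        return False  # the result contains a duplicate
    for lst in lists:
        if any(item not in position for item in lst):
            return False  # an input item is missing from the result
        indices = [position[item] for item in lst]
        # Every input list must map to strictly increasing positions.
        if any(a >= b for a, b in zip(indices, indices[1:])):
            return False
    return True


lists = [list("abcfh"), list("bceh"), list("cdef")]
print(is_valid_merge(list("abcdefh"), lists))  # True
print(is_valid_merge(list("abcfdeh"), lists))  # False: List 3 needs e before f
```

A checker like this is handy for testing whichever merge algorithm ends up being used.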

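Of course this is just an example. The real lists are longer (but contain fewer than 100 elements), their elements consist of multiple characters rather than single ones, and the sort order is not alphabetical. Also, I have up to 5 lists.

I need to implement this algorithm in Delphi (and no: this is not homework but a real-life problem), but I will take an algorithm in any language as long as it does not rely on too much compiler magic or complex library functions.

Performance is not much of an issue because this is done once.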

score 8 · Accepted Answer

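Your input lists define a partial order on your items. According to an answer on Math.SE, what you want is a topological sort. Algorithms are described on Wikipedia.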
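As a rough illustration, here is a minimal sketch of one well-known topological sort (Kahn's algorithm) applied to the lists from the question. It is written in Python rather than Delphi, uses only dictionaries and a queue, and the name `merge_lists` is my own. It assumes that each pair of neighbouring items in an input list gives one "comes before" edge.

```python
from collections import deque


def merge_lists(lists):
    """Merge lists sorted by one unknown order, using Kahn's algorithm."""
    successors = {}  # item -> items known to come directly after it
    indegree = {}    # item -> number of distinct direct predecessors
    for lst in lists:
        for item in lst:
            successors.setdefault(item, [])
            indegree.setdefault(item, 0)
        for before, after in zip(lst, lst[1:]):
            if after not in successors[before]:
                successors[before].append(after)
                indegree[after] += 1

    # Items without predecessors can be emitted; ties are resolved in the
    # order in which items become ready.
    ready = deque(item for item, deg in indegree.items() if deg == 0)
    result = []
    while ready:
        item = ready.popleft()
        result.append(item)
        for nxt in successors[item]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Anything left over sits on a cycle, i.e. the lists disagree.
    if len(result) != len(indegree):
        raise ValueError("the input lists contradict each other")
    return result


lists = [list("abcfh"), list("bceh"), list("cdef")]
print(" ".join(merge_lists(lists)))  # a b c d e f h
```

Where the lists leave the order of two items open, this sketch simply emits them in the order they become available, which the question allows.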

score 4 · Accepted Answer

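Nice question. Although a topological sort may be the most recommended method, you would first have to parse the input to build the list of dependencies. I thought of a more direct approach, based on finding the items that occur in more than one list to establish the order definition.

I cannot predict anything about time complexity, but since you do not care about performance, and especially considering that the total number of items is at most 500, I think this algorithm should work nicely.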

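Algorithm

All lists are put together into one temporary list, which is then naturally sorted to identify and sift out all duplicated items. Those duplicates, named Keys, constitute the sole definition of the final sort order.
The Key list is sorted into input sort order by comparing every two items: if two Keys occur in the same input list, then the one that comes first in that list also comes first in the output list. If two Keys do not occur together in any input list, they are considered equal.
Subsequently, a loop cycles over the Keys.
In every cycle, in every input list, every item between the previous Key and the current Key is added to the output list. A cycle ends with the addition of the current Key.

Implementation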

uses
  Classes, Contnrs; // TStringList lives in Classes, TObjectList in Contnrs

type
  TSorterStringList = class(TStringList)
  protected
    Id: Integer;
    KeyId: Integer;
    function Current: String;
  public
    constructor Create;
  end;

  TSorterStringLists = class(TObjectList)
  private
    function GetItem(Index: Integer): TSorterStringList;
  public
    property Items[Index: Integer]: TSorterStringList read GetItem; default;
  end;

  TSorter = class(TObject)
  private
    FInput: TSorterStringLists;
    FKeys: TStringList;
    procedure GenerateKeys;
    function IsKey(const S: String): Boolean;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Sort(Output: TStrings);
    property Input: TSorterStringLists read FInput;
  end;

{ TSorterStringList }

constructor TSorterStringList.Create;
begin
  inherited Create;
  KeyId := -1;
end;

function TSorterStringList.Current: String;
begin
  Result := Strings[Id];
end;

{ TSorterStringLists }

function TSorterStringLists.GetItem(Index: Integer): TSorterStringList;
begin
  if Index >= Count then
    Count := Index + 1;
  if inherited Items[Index] = nil then
    inherited Items[Index] := TSorterStringList.Create;
  Result := TSorterStringList(inherited Items[Index]);
end;

{ TSorter }

constructor TSorter.Create;
begin
  inherited Create;
  FInput := TSorterStringLists.Create(True);
  FKeys := TStringList.Create;
end;

destructor TSorter.Destroy;
begin
  FKeys.Free;
  FInput.Free;
  inherited Destroy;
end;

threadvar
  CurrentSorter: TSorter;

function CompareKeys(List: TStringList; Index1, Index2: Integer): Integer;
var
  Input: TSorterStringLists;
  I: Integer;
  J: Integer;
  K: Integer;
begin
  Result := 0;
  Input := CurrentSorter.Input;
  for I := 0 to Input.Count - 1 do
  begin
    J := Input[I].IndexOf(List[Index1]);
    K := Input[I].IndexOf(List[Index2]);
    if (J > - 1) and (K > -1) then
    begin
      Result := J - K;
      Break;
    end;
  end;
end;

procedure TSorter.GenerateKeys;
var
  All: TStringList;
  I: Integer;
begin
  All := TStringList.Create;
  try
    All.Sorted := True;
    All.Duplicates := dupAccept;
    for I := 0 to FInput.Count - 1 do
      All.AddStrings(FInput[I]);
    for I := 0 to All.Count - 2 do
      if (All[I] = All[I + 1]) then
        if (FKeys.Count = 0) or (FKeys[FKeys.Count - 1] <> All[I]) then
          FKeys.Add(All[I]);
  finally
    All.Free;
  end;
  CurrentSorter := Self;
  FKeys.CustomSort(CompareKeys);
end;

function TSorter.IsKey(const S: String): Boolean;
begin
  Result := FKeys.IndexOf(S) > -1;
end;

procedure TSorter.Sort(Output: TStrings);
var
  KeyId: Integer;
  I: Integer;
  List: TSorterStringList;
begin
  if FInput.Count = 0 then
    Exit;
  Output.BeginUpdate;
  try
    GenerateKeys;
    for KeyId := -1 to FKeys.Count - 1 do
    begin
      for I := 0 to FInput.Count - 1 do
      begin
        List := FInput[I];
        if List.KeyId <= KeyId then
          while (List.Id < List.Count) and not IsKey(List.Current) do
          begin
            Output.Add(List.Current);
            Inc(List.Id);
          end;
        while (List.Id < List.Count) and IsKey(List.Current) do
        begin
          List.KeyId := FKeys.IndexOf(List.Current);
          Inc(List.Id);
        end;
      end;
      if KeyId + 1 < FKeys.Count then
        Output.Add(FKeys[KeyId + 1]);
    end;
  finally
    Output.EndUpdate;
  end;
end;

Example usage

procedure TForm1.Button1Click(Sender: TObject);
var
  Sorter: TSorter;
begin
  Sorter := TSorter.Create;
  try
    Sorter.Input[0].CommaText := '1, 2, 4, 9, 10, 11, 22, 46, 48, 51, 70, 72';
    Sorter.Input[1].CommaText := '3, 9, 23, 43, 44, 45, 47, 48, 51, 71, 90, 91';
    Sorter.Input[2].CommaText := '0, 3, 4, 7, 8, 11, 23, 50, 51, 52, 55, 70';
    Sorter.Input[3].CommaText := '2, 6, 14, 15, 36, 37, 38, 39, 51, 65, 66, 77';
    Sorter.Input[4].CommaText := '5, 27, 120, 130';
    ListBox1.Items.Assign(Sorter.Input[0]);
    ListBox2.Items.Assign(Sorter.Input[1]);
    ListBox3.Items.Assign(Sorter.Input[2]);
    ListBox4.Items.Assign(Sorter.Input[3]);
    ListBox5.Items.Assign(Sorter.Input[4]);
    Sorter.Sort(ListBox6.Items);
    // Results in:
    // 1, 0, 5, 27, 120, 130, 3, 2, 6, 14, 15, 36, 37, 38, 39, 4, 7, 8, 9, 10,
    // 11, 22, 46, 23, 43, 44, 45, 47, 50, 48, 51, 71, 90, 91, 52, 55, 65, 66,
    // 77, 70, 72
  finally
    Sorter.Free;
  end;
end;

score 2 · Accepted Answer

So you have

List1: a b c f h
List2: b c e h
List3: c d e f

Take each list in turn and enter it into a graph. So after the first list you have:

A -> B -> C -> F -> H

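Then you start on list 2. B is already in there. Then you see that B connects to C, which you already know. Then you know that C connects to E, which is not in there yet, so you now have: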

A -> B -> C -> F -> H
          |          
          E

Then you know that E connects to H, so:

A -> B -> C -> F -> H
          |         ^ 
          E --------|

Then you go to list 3. You know C is there and that it points to D:

          D
          ^
          |
A -> B -> C -> F -> H
          |         ^ 
          E --------|

Then you know that D points to E. Since C -> E has the same ancestor as C -> D -> E, you can break the link from C -> E, so you now have:

          D -> E ---|
          ^         |
          |         |
A -> B -> C -> F -> H

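Finally you know that E comes before F. You previously knew that E leads directly to H, and now there is another path from E to H (E -> F -> H), so you know that F must be between E and H, and you can remove the link from E -> H. So you now have: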

          D -> E
          ^    |     
          |    |     
A -> B -> C -> F -> H

which you know can then be shortened to

A -> B -> C -> D -> E -> F -> H
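The link-pruning steps above amount to what is usually called a transitive reduction: a link is dropped when its target can still be reached by a longer path. Here is a minimal sketch in Python (the asker accepts any language). The name `reduce_edges` and the dictionary representation are my own choices. It builds all links first and prunes afterwards, instead of pruning while adding as in the walkthrough, and it assumes the lists do not contradict each other.

```python
def reduce_edges(lists):
    """Build item -> direct successors, then drop every link whose target
    is still reachable through some longer path."""
    edges = {}
    for lst in lists:
        for item in lst:
            edges.setdefault(item, [])
        for before, after in zip(lst, lst[1:]):
            if after not in edges[before]:
                edges[before].append(after)

    def reachable(start, goal, skip_edge):
        # Depth-first search that is not allowed to use `skip_edge`.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in edges[node]:
                if (node, nxt) == skip_edge or nxt in seen:
                    continue
                if nxt == goal:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    for node in edges:
        for nxt in list(edges[node]):  # copy: the list is modified below
            if reachable(node, nxt, (node, nxt)):
                edges[node].remove(nxt)
    return edges


reduced = reduce_edges([list("abcfh"), list("bceh"), list("cdef")])
for node, targets in reduced.items():
    print(node, "->", targets)
```

For the three example lists every item keeps at most one successor, which is the single chain shown above.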

Now let's suppose you ended up with something like:

  E -> T
  |    |
  A -> Z
  |    ^
  R -> W

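You don't have enough information to tell whether E/T comes before R/W, but you know that both come before Z and after A. So you just pick one of the paths at random, then the next, and so on, so you might end up with either AETRWZ or ARWETZ. You could even take one element at random from each path in turn, which guarantees that the legs themselves stay sorted, and maybe you get lucky and the merge is sorted too. So you could have AREWTZ, which still has E/T relatively sorted and R/W relatively sorted, or, if you start with the E leg, you get lucky and have AERTWZ.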
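One way to sketch this "pick at random when the order is undecided" idea is to choose randomly among the items that have no unplaced predecessor. The code below is Python, `random_merge` is my own name, and for brevity the diagram is modelled as just its two legs, A E T Z and A R W Z. If the lists contradict each other, the items on the cycle are silently left out.

```python
import random


def random_merge(lists, rng=None):
    """Emit items so that each input list keeps its order, choosing
    randomly whenever more than one item could come next."""
    rng = rng or random.Random()
    successors, indegree = {}, {}
    for lst in lists:
        for item in lst:
            successors.setdefault(item, [])
            indegree.setdefault(item, 0)
        for before, after in zip(lst, lst[1:]):
            if after not in successors[before]:
                successors[before].append(after)
                indegree[after] += 1

    ready = [item for item, deg in indegree.items() if deg == 0]
    result = []
    while ready:
        # Any ready item is a valid next choice, so pick one at random.
        item = ready.pop(rng.randrange(len(ready)))
        result.append(item)
        for nxt in successors[item]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return result


legs = [list("AETZ"), list("ARWZ")]
print("".join(random_merge(legs)))  # e.g. AERTWZ, varies per run
```

Every run starts with A and ends with Z, and keeps E before T and R before W. Only the interleaving of the two legs changes.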

score 1 · Accepted Answer

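Graph theory seems like a good first intuition.

You could build a directed graph where the elements of the lists are the vertices, and insert a directed edge from each list element to its successor. Then a node A is less than another node B if and only if B can be reached from A by traversing the graph.

A cycle in the graph (A is less than B and B is less than A) indicates either corrupt input data or the presence of two equivalent elements with different names.

In the absence of cycles, merging under the given less-than relation should be simple: repeatedly remove from the graph a node that cannot be reached from any other node, and append it to the output list.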
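The less-than relation and the cycle check described above could look roughly like this. It is a sketch in Python, and the names `build_graph`, `less_than` and `has_cycle` are my own. It assumes "reachable" means reachable over one or more edges, so that a node reaching itself signals a cycle.

```python
def build_graph(lists):
    """Map every element to the set of its direct successors."""
    graph = {}
    for lst in lists:
        for item in lst:
            graph.setdefault(item, set())
        for before, after in zip(lst, lst[1:]):
            graph[before].add(after)
    return graph


def less_than(graph, a, b):
    """True if b can be reached from a by following one or more edges."""
    stack, seen = [a], set()
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def has_cycle(graph):
    """A cycle exists exactly when some node is less than itself."""
    return any(less_than(graph, node, node) for node in graph)


graph = build_graph([list("abcfh"), list("bceh"), list("cdef")])
print(less_than(graph, "c", "h"))  # True
print(less_than(graph, "h", "c"))  # False
print(has_cycle(graph))            # False
print(has_cycle(build_graph([["x", "y"], ["y", "x"]])))  # True
```

Two elements for which `less_than` is false in both directions are simply not ordered by the input, so either order is acceptable in the output.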

score 0 · Accepted Answer

Can you use a hash table? Here is an algorithm for merging two such lists.

T = new HashMap
for(i = 1 to length B)
  T.put(B[i],i)
N = array[length A]
for(i = 1 to length A){
  if(T containsKey A[i])
    N[i] = T.get(A[i])
  else
    N[i] = -1
}

R = array[length A + length B]
j = 1
k = 1
for(i = 1 to length A){
  if(N[i] = -1)
    R[j++] = A[i]
  else{
    while(k <= N[i])
      R[j++] = B[k++]
  }
}
while(k <= length B)
  R[j++] = B[k++]
return R[1 ... j-1]

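Elements A[i] with N[i] > 0 match elements of B; the other elements are placed in a valid order. There may be a bug in there, but this is the general idea.

~~For merging three arrays, you can merge the first two and then merge the third into the merged array.~~ In editing, as @RobKennedy pointed out, that last sentence is wrong. You can probably change the algorithm to handle three lists, but it is not so simple, so you may want to use topological sort instead.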
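For reference, here is the two-list merge above as a runnable sketch in Python, with 0-based indices instead of the 1-based ones in the pseudocode. The name `merge_two` is my own. It assumes that an element of A not found in B is copied as it is, and that a matching element is emitted together with any elements of B that precede it and have not been copied yet.

```python
def merge_two(a, b):
    """Merge two lists that share one unknown order, without duplicates."""
    index_in_b = {item: i for i, item in enumerate(b)}
    result = []
    k = 0  # next position in b that has not been copied yet
    for item in a:
        if item not in index_in_b:
            result.append(item)  # only in a: keep its place in a's order
        else:
            stop = index_in_b[item]
            # Copy b up to and including the match. If the match was
            # already copied (stop < k), nothing is added.
            while k <= stop:
                result.append(b[k])
                k += 1
    result.extend(b[k:])  # whatever is left of b goes at the end
    return result


print(" ".join(merge_two(list("abcfh"), list("bceh"))))  # a b c f e h
```

Note that f ends up before e here. These two lists alone do not decide the relative order of e and f, so that is an acceptable result for this pair.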
