erlang - 在 Erlang 中用于 Dijkstra 算法的数据结构是什么？

Question

免责声明：作者是 Erlang 的新手。

想象一下，我们有一个由 1M 个节点组成的图，每个节点有 0-4 个邻居（边从每个节点发到这些邻居，所以图是有向和连通的）。

这是我选择的数据结构：

为了存储图表，我使用基于 ETS 表的 digraph。这允许快速 (O(1)) 访问节点的邻居。

对于未访问的节点列表，我使用 gb_sets:take_smallest （节点已经排序，取完后同时删除）。

对于前辈列表，我使用 dict 结构，它允许以以下方式存储前辈：{Node1,Node1_predecessor},{Node2,Node2_predecessor}。

对于访问节点的列表，我使用一个简单的列表。

问题：

当我尝试在 digraph 结构和 Unvisited_nodes 结构中更新节点的权重时，代码变得非常难以阅读和维护。将一个“对象”与需要在两个数据结构中同时更新的“字段”保持一致似乎不是正确的方法。这样做的正确方法是什么？
同样的问题是关于前辈名单的。我应该在哪里存储节点“对象”的前任“字段”？也许在图中（有向图结构）？
也许我应该根据过程和消息而不是对象（节点和边）及其字段（权重）重新考虑整个 Dijkstra 算法？

升级版：

这是基于 Antonakos 建议的代码：

dijkstra(Graph,Start_node_name) ->

    io:format("dijkstra/2: start~n"),

    Paths = dict:new(),
    io:format("dijkstra/2: initialized empty Paths~n"),

    Unvisited = gb_sets:new(),
    io:format("dijkstra/2: initialized empty Unvisited nodes priority queue~n"),

    Unvisited_nodes = gb_sets:insert({0,Start_node_name,root},Unvisited),
    io:format("dijkstra/2: Added start node ~w with the weight 0 to the Unvisited nodes: ~w~n", [Start_node_name, Unvisited_nodes]),

    Paths_updated = loop_through_nodes(Graph,Paths,Unvisited_nodes),
    io:format("dijkstra/2: Finished searching for shortest paths: ~w~n", [Paths_updated]).




loop_through_nodes(Graph,Paths,Unvisited_nodes) ->
    %% We need this condition to stop looping through the Unvisited nodes if it is empty
    case gb_sets:is_empty(Unvisited_nodes) of
        false -> 
            {{Current_weight,Current_name,Previous_node}, Unvisited_nodes_updated} = gb_sets:take_smallest(Unvisited_nodes),
            case dict:is_key(Current_name,Paths) of
                false ->
                    io:format("loop_through_nodes: Found a new smallest unvisited node ~w~n",[Current_name]),

                    Paths_updated = dict:store(Current_name,{Previous_node,Current_weight},Paths),
                    io:format("loop_through_nodes: Updated Paths: ~w~n",[Paths_updated]),

                    Out_edges = digraph:out_edges(Graph,Current_name),
                    io:format("loop_through_nodes: Ready to iterate through the out edges of node ~w: ~w~n",[Current_name,Out_edges]),

                    Unvisited_nodes_updated_2 = loop_through_edges(Graph,Out_edges,Paths_updated,Unvisited_nodes_updated,Current_weight),
                    io:format("loop_through_nodes: Looped through out edges of the node ~w and updated Unvisited nodes: ~w~n",[Current_name,Unvisited_nodes_updated_2]),

                    loop_through_nodes(Graph,Paths_updated,Unvisited_nodes_updated_2);
                    true ->
                    loop_through_nodes(Graph,Paths,Unvisited_nodes_updated)
            end;
        true -> 
            Paths
    end.

loop_through_edges(Graph,[],Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: No more out edges ~n"),
    Unvisited_nodes;

loop_through_edges(Graph,Edges,Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: Start ~n"),
    [Current_edge|Rest_edges] = Edges,
    {Current_edge,Current_node,Neighbour_node,Edge_weight} = digraph:edge(Graph,Current_edge),
    case dict:is_key(Neighbour_node,Paths) of
        false ->
                    io:format("loop_through_edges: Inserting new neighbour node ~w into Unvisited nodes~n",[Current_node]),
            Unvisited_nodes_updated = gb_sets:insert({Current_weight+Edge_weight,Neighbour_node,Current_node},Unvisited_nodes),
                    io:format("loop_through_edges: The unvisited nodes are: ~w~n",[Unvisited_nodes_updated]),
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes_updated,Current_weight);
        true ->
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes,Current_weight)
    end.

score 1 · Accepted Answer

您选择的数据结构看起来不错，因此主要是在其中插入什么以及如何使它们保持最新的问题。我建议以下（我已经更改了一些名称）：

Result:到的dict映射，其中是路径的总成本，并且是路径上的前任。Node{Cost, Prev}CostNodePrev
Open: 的gb_sets结构{Cost, Node, Prev}。
具有表单边的图{EdgeCost, NextNode}。

搜索结果由表示，Result并且图表根本不更新。没有多处理或消息传递。

算法如下：

插入，{0, StartNode, Nil}在Open哪里Nil是标记路径结束的东西。
让{{Cost, Node, Prev}, Open1} = gb_sets:take_smallest(Open). 如果Node已经在，Result那么什么也不做；否则添加{Cost, Node, Prev}到，并且对于添加到Result的每个边缘{EdgeCost, NextNode}，但前提是尚未在. 继续直到集合为空。Node{Cost + EdgeCost, NextNode, Node}Open1NextNodeResultOpen1

Dijkstra 的算法要求对集合进行decrease_key()操作。Open由于这不是很容易支持的，gb_sets因此我们使用了插入元组的解决方法，NextNode即使NextNode可能Open已经存在。这就是为什么我们检查从中提取的节点Open是否已经在Result.

扩展讨论优先级队列的使用

Dijkstra 算法有多种使用优先级队列的方法。

在标准版本的 Wikipedia 中，一个节点v只插入一次，但是v当成本和前任的v改变时，它的位置会更新。

alt := dist[u] + dist_between(u, v)
if alt < dist[v]:
    dist[v] := alt
    previous[v] := u
    decrease-key v in Q

实现通常通过替换decrease-key v in Q为add v to Q. 这意味着v可以多次添加，因此算法必须检查u从队列中提取的是否尚未添加到结果中。

在我的版本中，我将上面的整个块替换为add v to Q. 因此，队列将包含更多条目，但由于它们总是按顺序提取，因此不会影响算法的正确性。如果您不想要这些额外的条目，您可以使用字典来跟踪每个节点的最低成本。

erlang - 在 Erlang 中用于 Dijkstra 算法的数据结构是什么？

1 回答 1

Related

Reference