由于这个答案中给出的现有非递归 DFS 实现似乎被破坏了,让我提供一个真正有效的实现。
我已经用 Python 编写了这个,因为我发现它非常易读并且没有实现细节(并且因为它有yield
用于实现generators的方便关键字),但它应该很容易移植到其他语言。
# a generator function to find all simple paths between two nodes in a
# graph, represented as a dictionary that maps nodes to their neighbors
def find_simple_paths(graph, start, end):
visited = set()
visited.add(start)
nodestack = list()
indexstack = list()
current = start
i = 0
while True:
# get a list of the neighbors of the current node
neighbors = graph[current]
# find the next unvisited neighbor of this node, if any
while i < len(neighbors) and neighbors[i] in visited: i += 1
if i >= len(neighbors):
# we've reached the last neighbor of this node, backtrack
visited.remove(current)
if len(nodestack) < 1: break # can't backtrack, stop!
current = nodestack.pop()
i = indexstack.pop()
elif neighbors[i] == end:
# yay, we found the target node! let the caller process the path
yield nodestack + [current, end]
i += 1
else:
# push current node and index onto stacks, switch to neighbor
nodestack.append(current)
indexstack.append(i+1)
visited.add(neighbors[i])
current = neighbors[i]
i = 0
这段代码维护了两个并行堆栈:一个包含当前路径中较早的节点,另一个包含节点堆栈中每个节点的当前邻居索引(这样我们可以在弹出节点时恢复遍历节点的邻居堆栈)。我同样可以很好地使用单个堆栈(节点,索引)对,但我认为双堆栈方法更具可读性,并且对于其他语言的用户来说可能更容易实现。
这段代码还使用了一个单独的visited
集合,它总是包含当前节点和堆栈上的任何节点,让我有效地检查一个节点是否已经是当前路径的一部分。如果您的语言碰巧有一个“有序集”数据结构,它提供了高效的类似堆栈的推送/弹出操作和高效的成员资格查询,您可以将它用于节点堆栈并摆脱单独的visited
集合。
或者,如果您为节点使用自定义的可变类/结构,您可以只在每个节点中存储一个布尔标志,以指示它是否已作为当前搜索路径的一部分被访问。当然,如果您出于某种原因希望这样做,此方法不会让您在同一个图上并行运行两个搜索。
下面是一些测试代码,演示了上面给出的函数是如何工作的:
# test graph:
# ,---B---.
# A | D
# `---C---'
graph = {
"A": ("B", "C"),
"B": ("A", "C", "D"),
"C": ("A", "B", "D"),
"D": ("B", "C"),
}
# find paths from A to D
for path in find_simple_paths(graph, "A", "D"): print " -> ".join(path)
在给定的示例图上运行此代码会产生以下输出:
A -> B -> C -> D
A -> B -> D
A -> C -> B -> D
A -> C -> D
请注意,虽然这个示例图是无向的(即它的所有边都是双向的),但该算法也适用于任意有向图。例如,删除C -> B
边(通过B
从 的邻居列表中删除C
)产生相同的输出,除了第三条路径 ( A -> C -> B -> D
),这不再可能。
附言。很容易构造像这样的简单搜索算法(以及这个线程中给出的其他算法)性能很差的图。
例如,考虑在无向图上查找从 A 到 B 的所有路径的任务,其中起始节点 A 有两个邻居:目标节点 B(除 A 之外没有其他邻居)和节点 C,它是集团的一部分n +1 个节点,如下所示:
graph = {
"A": ("B", "C"),
"B": ("A"),
"C": ("A", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"D": ("C", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"E": ("C", "D", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"F": ("C", "D", "E", "G", "H", "I", "J", "K", "L", "M", "N", "O"),
"G": ("C", "D", "E", "F", "H", "I", "J", "K", "L", "M", "N", "O"),
"H": ("C", "D", "E", "F", "G", "I", "J", "K", "L", "M", "N", "O"),
"I": ("C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "O"),
"J": ("C", "D", "E", "F", "G", "H", "I", "K", "L", "M", "N", "O"),
"K": ("C", "D", "E", "F", "G", "H", "I", "J", "L", "M", "N", "O"),
"L": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "M", "N", "O"),
"M": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "N", "O"),
"N": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "O"),
"O": ("C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N"),
}
很容易看出 A 和 B 之间的唯一路径是直接路径,但是从节点 A 开始的幼稚 DFS 将浪费 O( n !) 时间无用地探索集团内的路径,即使(对人类而言)很明显这些路径都不可能通向 B。
也可以构造具有相似属性的DAG,例如,通过让起始节点 A 连接目标节点 B 和两个其他节点 C 1和 C 2,这两个节点都连接到节点 D 1和 D 2,这两个节点都连接到 E 1和 E 2,依此类推。对于这样排列的n层节点,简单地搜索从 A 到 B 的所有路径最终会浪费 O(2 n ) 时间检查所有可能的死胡同,然后再放弃。
当然,从派系中的一个节点(C 除外)或从 DAG 的最后一层向目标节点 B 添加一条边,将创建从 A 到 B 的指数级大量可能路径,并且纯粹的局部搜索算法无法真正提前判断它是否会找到这样的边缘。因此,从某种意义上说,这种幼稚搜索的输出敏感性差是由于他们缺乏对图的全局结构的认识。
虽然有各种预处理方法(如迭代消除叶节点、搜索单节点顶点分隔符等)可用于避免其中一些“指数时间死胡同”,但我不知道有什么通用的在所有情况下都可以消除它们的预处理技巧。一个通用的解决方案是在搜索的每一步检查目标节点是否仍然可以访问(使用子搜索),如果不是,则尽早回溯——但是,这会显着减慢搜索速度(最坏的情况是,与图的大小成正比)对于许多不包含这种病态死角的图。