ruby - Bug in my Ruby version of Tarjan's algorithm

Question

http://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm

http://en.algoritmy.net/article/44220/Tarjans-algorithm

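I can't find the bug in my Ruby version of Tarjan's strongly connected components algorithm. I got the Kosaraju–Sharir algorithm working, and my Tarjan implementation in Ruby works for some graphs. But it fails to join 2 components that should be connected: "10" and "11,12,9".

The input file is this directed graph: http://algs4.cs.princeton.edu/42directed/tinyDG.txt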

expected: [["1"], ["0", "2", "3", "4", "5"], ["10", "11", "12", "9"], ["6", "8"], ["7"]]
got: [["1"], ["0", "2", "3", "4", "5"], ["10"], ["11", "12", "9"], ["6", "8"], ["7"]]

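In the final loop, which tries to build a single component, it starts with "10" (the last item on the stack), but the current vertex ("parent") is also "10"! That makes the loop cut off "10" as a separate component. Why is the most recent item on the stack the same as the parent node? I would expect "10" to show up only at the end of the component, after we have collected ["12", "11", "9"...then "10"]. Because "10" shows up first instead of last, we have this problem. How do I fix it?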

  begin
    last_stack_item = stack.pop
    component << last_stack_item.name
  end while last_stack_item != parent # we're back at the root

My Ruby code:

    # Tarjan's algorithm to find all strongly connected components (SCCs)
    def scc_tarjan
      index = 0 # numbers nodes consecutively in the order discovered
      stack, scc, vertices = [], [], []

      # create new Struct, if not already defined
      if Struct::const_defined?("TarjanVertex")
        Struct.const_get("TarjanVertex")
      else
        Struct.new("TarjanVertex", :name, :index, :lowlink)
      end

      adj_lists.each do |v|
        # -1 means vertex is unvisited
        vertex = Struct::TarjanVertex.new(v.name, -1, -1)
        vertices << vertex  # array of all TarjanVertex objects in graph
      end
      vertices.each do |vertex|
        tarjan_dfs(vertex, scc, stack, index, vertices) if vertex.index == -1
      end
      # return nested array of all SCCs in graph
      scc
    end

  def tarjan_dfs(parent, scc, stack, index, vertices)
    # Set depth index for vertex to smallest unused index
    parent.index = index
    # lowlink is roughly the smallest index of any node known to be reachable from the vertex
    parent.lowlink = index
    index += 1
    stack << parent
    # loop through all vertices connected to parent
    adj_vertices(parent.name, adj_lists).each do |adj_vertex|
      # since adj_vertices returns array of strings,
      # must convert to TarjanVertex objects
      child = vertices.select {|v| v.name == adj_vertex}.first

      if child.index == -1  # if child vertex not yet visited
        tarjan_dfs(child, scc, stack, index, vertices) # recurse on child

        # change parent's lowlink to the smaller of parent's and child's lowlink
        parent.lowlink = [parent.lowlink, child.lowlink].min

      # vertex points to earlier (already visited) one in stack,
      # with lower index. thus it's the current SCC
      elsif stack.include?(child)
        parent.lowlink = [parent.lowlink, child.index].min
      end
    end

    # if a vertex's lowlink equals its index here, the lowlink cannot go any lower:
    # the vertex MUST be the root of an SCC.
    if parent.lowlink == parent.index
      component = []  # a single SCC

      # pop off entire SCC, one vertex at a time
      begin
        last_stack_item = stack.pop
        component << last_stack_item.name
      end while last_stack_item != parent # we're back at the root
      scc << component.sort # done with a single SCC
    end
  end

score 1 · Accepted Answer

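I solved my own problem! After walking through every loop of my code with pen and paper, I found that it entered the bottom component loop prematurely at vertex 4. At that point parent.lowlink should not have been equal to parent.index. I only needed to change 1 word to fix my problem!

I changed "child.index" to "child.lowlink" in the "elsif stack.include?(child)" branch! This correctly dropped the lowlink of 4 to match the lowlink of vertex 6.

From then on parent.lowlink != parent.index, so it no longer starts making a new component prematurely.

Interestingly, my solution differs from all the pseudocode and online code I found for Tarjan's algorithm, which all say "parent.lowlink = [parent.lowlink, child.index].min"

Instead, I needed "parent.lowlink = [parent.lowlink, child.lowlink].min"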

score -1 · Accepted Answer

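index is the timestamp of the depth-first search: each time dfs() reaches an unvisited vertex, its value should increase by 1. Therefore every node should get a different index value, and when the algorithm finishes, index should equal the number of vertices in the graph.

But you pass index as a parameter to the function tarjan_dfs. Since the integer is effectively passed by value (index += 1 rebinds only the method's local variable), incrementing it inside dfs() changes only the local copy. As a result, index ends up being the depth in the DFS tree (the spanning tree formed by the depth-first search). This is the source of the bug.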
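The effect is easy to reproduce in isolation. The following is a minimal sketch, not the question's code: visit_by_value is a hypothetical helper that receives and increments an integer the same way tarjan_dfs does, on an assumed traversal with two children per node down to depth 2.

```ruby
# Hypothetical helper (not from the question): receives `index` as a plain
# Integer argument and increments it, as tarjan_dfs does.
def visit_by_value(index, depth, seen)
  seen << index   # the number this "vertex" would be assigned
  index += 1      # rebinds the local variable only
  # visit two "children" per node, down to depth 2
  2.times { visit_by_value(index, depth + 1, seen) } if depth < 2
end

seen = []
counter = 0
visit_by_value(counter, 0, seen)
p counter  # => 0 (the caller's variable never changes)
p seen     # => [0, 1, 2, 2, 1, 2, 2] (each number is just the depth)
```

Siblings receive the same number, so the values are no longer unique timestamps.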

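So using a global variable $index instead of the local variable index will fix the bug. In fact, all the code linked at the beginning of the question uses index as a global variable.

If you don't want to use a global variable but still want the same effect, you can wrap the counter in a mutable object. For example: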

Change index = 0 to index = {value: 0},
change parent.index = index to parent.index = index[:value],
change parent.lowlink = index to parent.lowlink = index[:value],
change index += 1 to index[:value] += 1.
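As a rough illustration of why the wrapper helps, here is a small sketch under assumed names: visit_shared is a hypothetical helper, and the traversal (two children per node down to depth 2) is made up for the demonstration. Every recursive call mutates the same Hash, so the counter keeps advancing across calls.

```ruby
# Hypothetical helper (not from the question): the counter lives inside a
# Hash, so all recursive calls see and update the same object.
def visit_shared(index, depth, seen)
  seen << index[:value]
  index[:value] += 1   # mutates the shared Hash instead of rebinding a local
  # visit two "children" per node, down to depth 2
  2.times { visit_shared(index, depth + 1, seen) } if depth < 2
end

seen = []
index = { value: 0 }
visit_shared(index, 0, seen)
p seen            # => [0, 1, 2, 3, 4, 5, 6] (every visit gets its own number)
p index[:value]   # => 7 (the total number of visits)
```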

Here is my runnable Ruby implementation, with a random graph generator that compares the output of the two programs. I hope it will be useful.

# My version:
def tarjan_scc(adj)
  n = adj.size
  dfn = Array.new(n, -1) # dfn[u] is the timestamp when dfs reached node u
  low = Array.new(n, -1) # low[u] is the lowest index that u or u's children can reach in at most one step
  index = {value: 0}
  stk, sccs = [], []
  (0...n).each do |u|
    tarjan_scc_dfs(adj, u, index, dfn, low, stk, sccs) if dfn[u] == -1
  end
  sccs.sort!
end

def tarjan_scc_dfs(adj, u, index, dfn, low, stk, sccs)
  dfn[u] = low[u] = index[:value]
  index[:value] += 1
  stk.push(u)
  adj[u].each do |v|
    if dfn[v] == -1
      tarjan_scc_dfs(adj, v, index, dfn, low, stk, sccs)
      low[u] = [low[u], low[v]].min
    elsif stk.include?(v)
      low[u] = [low[u], dfn[v]].min
    end
  end
  if dfn[u] == low[u]
    scc = []
    scc << stk.pop while stk[-1] != u
    sccs << scc.push(stk.pop).sort
  end
end


# Test version, with these two changes:
# 1) change Hash `index` to Integer `index`
# 2) change `low[u] = [low[u], dfn[v]].min` to `low[u] = [low[u], low[v]].min`
def tarjan_scc_dfs_test(adj, u, index, dfn, low, stk, sccs)
  dfn[u] = low[u] = index
  index += 1
  stk.push(u)
  adj[u].each do |v|
    if dfn[v] == -1
      tarjan_scc_dfs_test(adj, v, index, dfn, low, stk, sccs)
      low[u] = [low[u], low[v]].min
    elsif stk.include?(v)
      low[u] = [low[u], low[v]].min
    end
  end
  if dfn[u] == low[u]
    scc = []
    scc << stk.pop while stk[-1] != u
    sccs << scc.push(stk.pop).sort
  end
end

def tarjan_scc_test(adj)
  n = adj.size
  dfn = Array.new(n, -1)
  low = Array.new(n, -1)
  index = 0
  stk, sccs = [], []
  (0...n).each do |u|
    tarjan_scc_dfs_test(adj, u, index, dfn, low, stk, sccs) if dfn[u] == -1
  end
  sccs.sort!
end


# Randomly generate a simple directed graph with at most max_n nodes.
# Nodes are numbered 0 to n - 1. Edges are stored as adjacency lists.
def generate_graph(max_n)
  @rng ||= Random.new(Time.hash)
  n = @rng.rand(1..max_n)
  ed = []
  n.times do |i|
    n.times do |j|
      ed << [i, j] if i != j
    end
  end

  ed.size.times do |i|
    j = @rng.rand(i...ed.size)
    ed[i], ed[j] = ed[j], ed[i]
  end

  adj = Array.new(n) { Array.new }
  @rng.rand(0..ed.size).times do |i|
    u, v = ed[i]
    adj[u] << v
  end
  adj
end

# Main loop: generate random graphs and run both functions until their answers differ.
while true
  adj = generate_graph(8)
  sccs = tarjan_scc(adj)
  sccs_test = tarjan_scc_test(adj)
  if sccs != sccs_test
    puts "Graph: "
    adj.size.times do |u|
      puts "#{u}: #{adj[u]}"
    end
    puts "Correct components output:"
    p sccs
    puts "Wrong components output by test program:"
    p sccs_test
    break
  end
end

Update:

Here are the steps of your modified algorithm on this test case (provided I understand your algorithm correctly):

0->1, 1->2, 2->1, 1->0, 0->3, 3->2.

Initialization: [[index=0, lowlink=0], [index=-1, lowlink=-1], [index=-1, lowlink=-1], [index=-1, lowlink=-1]]
0->1: [[index=0, lowlink=0], [index=1, lowlink=1], [index=-1, lowlink=-1], [index=-1, lowlink=-1]]
1->2: [[index=0, lowlink=0], [index=1, lowlink=1], [index=2, lowlink=2], [index=-1, lowlink=-1]]
2->1: [[index=0, lowlink=0], [index=1, lowlink=1], [index=2, lowlink=1], [index=-1, lowlink=-1]]
(Return from node 2; the parent is now node 1)
1->0: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=-1, lowlink=-1]]
(Note: because the edges are visited in this order, node 2's lowlink is 1, not 0)
(Return from node 1; the parent is now node 0)
0->3: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=1, lowlink=1]]
3->2: [[index=0, lowlink=0], [index=1, lowlink=0], [index=2, lowlink=1], [index=1, lowlink=1]]
(Node 2's lowlink is 1, so node 3's lowlink will not become 0, and node 3 will be marked as a single SCC)
(Return from node 3; the parent is now node 0)
(Mark [0, 1, 2] as an SCC)

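As you can see, with the modified algorithm the order in which edges are visited produces different answers. In Tarjan's algorithm the visiting order should not matter.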
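For comparison, here is a compact sketch of the standard rule (shared counter, dfn[v] in the on-stack branch) run on the same six edges. The method name sccs_of and the lambda-based layout are illustrative choices, not code from the question. All four nodes reach each other in this graph, so a single component is expected for either adjacency order.

```ruby
# Sketch of Tarjan's algorithm with a shared counter; names are illustrative.
def sccs_of(adj)
  index = { value: 0 }             # shared DFS counter
  dfn = Array.new(adj.size, -1)    # discovery number of each node
  low = Array.new(adj.size, -1)    # lowlink of each node
  stack = []
  result = []

  dfs = lambda do |u|
    dfn[u] = low[u] = index[:value]
    index[:value] += 1
    stack.push(u)
    adj[u].each do |v|
      if dfn[v] == -1              # tree edge: recurse, then inherit lowlink
        dfs.call(v)
        low[u] = [low[u], low[v]].min
      elsif stack.include?(v)      # edge back into the current stack
        low[u] = [low[u], dfn[v]].min
      end
    end
    if dfn[u] == low[u]            # u is the root of a component
      component = []
      component << stack.pop until component.last == u
      result << component.sort
    end
  end

  adj.each_index { |u| dfs.call(u) if dfn[u] == -1 }
  result.sort
end

# Edges: 0->1, 1->2, 2->1, 1->0, 0->3, 3->2
adj = [[1, 3], [2, 0], [1], [2]]
p sccs_of(adj)                     # => [[0, 1, 2, 3]]
p sccs_of(adj.map(&:reverse))      # => [[0, 1, 2, 3]] (other visiting order)
```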
