algorithm - BFS与拓扑排序的关系

Question

拓扑排序既可以使用 DFS（边缘反转）也可以使用 queue 来完成。BFS 也可以使用队列来完成。使用队列进行 BFS 时元素的存储和检索方式与使用队列进行拓扑排序时的存储和检索方式之间是否存在任何关系。澄清将是有帮助的。谢谢。

score 2 · Accepted Answer

不，不一定有任何关系。我假设您指的是来自wikipedia/Topological_sorting#Algorithms的 Kahn 的算法，维基百科指出：

请注意，反映结果排序的非唯一性，结构 S 可以简单地是一个集合、一个队列或一个堆栈。

因此，拓扑排序的“队列”实际上是“任何集合”结构，这个集合中的排序无关紧要；它可以是任何东西。另一方面，用于 BFS 的队列是关于顺序的；这样它就可以完成它的FIFO（先进先出）任务。更改此顺序将破坏 BFS 算法。

可能还有其他基于“队列”的拓扑排序算法，其中结构是队列确实很重要。如果您询问特定的此类算法，请澄清。

编辑：感兴趣的算法被澄清为改进算法部分，与卡恩的相同。

编辑：我已经根据您链接的页面中的改进算法部分编写了一些实现拓扑排序的代码。我将它使用的集合类型设为任意作为排序函数的参数。然后我制作了几种类型的此类集合，包括堆栈、队列、随机弹出集合和 python 集（它是一个哈希集，因此不保证顺序）。

然后我制作一个图表，并在每个集合上测试排序算法。然后我使用拓扑排序维基百科上列出的定义测试每个结果：

.. 有向图的拓扑排序（有时缩写为 topsort 或 toposort）或拓扑排序是其顶点的线性排序，使得对于每条边 uv，u 在排序中排在 v 之前。

——维基百科

代码是用python编写的，如下所示。结果来自http://ideone.com。_ 我不知道生成随机 DAG 以进行测试的好方法，所以我的测试图很蹩脚。随意评论/编辑一个好的 DAG 生成器。

编辑：现在我有一个不那么蹩脚的生成器，但它使用 networkx。该函数nx_generate_random_dag在代码中，但它在函数中导入了networkx。您可以取消注释 main 中的标记部分以生成图表。我将生成的图形硬编码到代码中，因此我们得到了更有趣的结果。

所有这些都是为了表明，“集合”数据结构（算法中的队列）的顺序可以是任何顺序。

from collections import deque
import random


def is_topsorted(V,E,sequence):
  sequence = list(sequence)
  #from wikipedia definition of top-sort
  #for every edge uv, u comes before v in the ordering
  for u,v in E:
    ui = sequence.index(u)
    vi = sequence.index(v)
    if not (ui < vi):
      return False
  return True 

#the collection_type should behave like a set:
# it must have add(), pop() and __len__() as members.
def topsort(V,E,collection_type):
  #out edges
  INS = {}

  #in edges
  OUTS = {}
  for v in V:
    INS[v] = set()
    OUTS[v] = set()

  #for each edge u,v,
  for u,v in E:
    #record the out-edge from u
    OUTS[u].add(v)
    #record the in-edge to v
    INS[v].add(u)

  #1. Store all vertices with indegree 0 in a queue
  #We will start
  topvertices = collection_type()

  for v,in_vertices in INS.iteritems():
    if len(in_vertices) == 0:
      topvertices.add(v)

  result = []

  #4. Perform steps 2 and 3 while the queue is not empty.
  while len(topvertices) != 0:  
    #2. get a vertex U and place it in the sorted sequence (array or another queue).
    u = topvertices.pop()
    result.append(u)

    #3. For all edges (U,V) update the indegree of V,
    # and put V in the queue if the updated indegree is 0.

    for v in OUTS[u]:
      INS[v].remove(u)
      if len(INS[v]) == 0:
        topvertices.add(v)

  return result

class stack_collection:
  def __init__(self):
    self.data = list()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.pop()
  def __len__(self):
    return len(self.data)

class queue_collection:
  def __init__(self):
    self.data = deque()
  def add(self,v):
    self.data.append(v)
  def pop(self):
    return self.data.popleft()
  def __len__(self):
    return len(self.data)

class random_orderd_collection:
  def __init__(self):
    self.data = []
  def add(self,v):
    self.data.append(v)
  def pop(self):    
    result = random.choice(self.data)
    self.data.remove(result)
    return result
  def __len__(self):
    return len(self.data)

"""
Poor man's graph generator.
Requires networkx.

Don't make the edge_count too high compared with the vertex count,
 otherwise it will run for a long time or forever.
"""
def nx_generate_random_dag(vertex_count,edge_count):
  import networkx as nx

  V = range(1,vertex_count+1)
  random.shuffle(V)

  G = nx.DiGraph()
  G.add_nodes_from(V)

  while nx.number_of_edges(G) < edge_count:

    u = random.choice(V)
    v = random.choice(V)
    if u == v:
      continue

    for tries in range(2):
      G.add_edge(u,v)
      if not nx.is_directed_acyclic_graph(G):
        G.remove_edge(u,v)
        u,v = v,u
  V = G.nodes()
  E = G.edges()

  assert len(E) == edge_count
  assert len(V) == vertex_count
  return V,E




def main():

  graphs = []

  V = [1,2,3,4,5]
  E = [(1,2),(1,5),(1,4),(2,4),(2,5),(3,4),(3,5)]

  graphs.append((V,E))

  """
  Uncomment this section if you have networkx.
  This will generate 3 random graphs.
  """
  """
  for i in range(3):
    G = nx_generate_random_dag(30,120)
    V,E = G
    print 'random E:',E
    graphs.append(G)
  """


  #This graph was generated using nx_generate_random_dag() from above
  V = range(1,31)
  E = [(1, 10), (1, 11), (1, 14), (1, 17), (1, 18), (1, 21), (1, 23),
       (1, 30), (2, 4), (2, 12), (2, 15), (2, 17), (2, 18), (2, 19),
       (2, 25), (3, 22), (4, 5), (4, 8), (4, 22), (4, 23), (4, 26),
       (5, 27), (5, 23), (6, 24), (6, 28), (6, 27), (6, 20), (6, 29),
       (7, 3), (7, 19), (7, 13), (8, 24), (8, 10), (8, 3), (8, 12),
       (9, 4), (9, 8), (9, 10), (9, 14), (9, 19), (9, 27), (9, 28),
       (9, 29), (10, 18), (10, 5), (10, 23), (11, 27), (11, 5),
       (12, 10), (13, 9), (13, 26), (13, 3), (13, 12), (13, 6), (14, 24),
       (14, 28), (14, 18), (14, 20), (15, 3), (15, 12), (15, 17), (15, 19),
       (15, 25), (15, 27), (16, 4), (16, 5), (16, 8), (16, 18), (16, 20), (16, 23),
       (16, 26), (16, 28), (17, 4), (17, 5), (17, 8), (17, 12), (17, 22), (17, 28),
       (18, 11), (18, 3), (19, 10), (19, 18), (19, 5), (19, 22), (20, 5), (20, 29),
       (21, 25), (21, 12), (21, 30), (21, 17), (22, 11), (24, 3), (24, 10),
       (24, 11), (24, 28), (25, 10), (25, 17), (25, 23), (25, 27), (26, 3),
       (26, 18), (26, 19), (28, 26), (28, 11), (28, 23), (29, 2), (29, 4),
       (29, 11), (29, 15), (29, 17), (29, 22), (29, 23), (30, 3), (30, 7),
       (30, 17), (30, 20), (30, 25), (30, 26), (30, 28), (30, 29)]

  graphs.append((V,E))

  #add other graphs here for testing


  for G in graphs:
    V,E = G

    #sets in python are unordered but in practice their hashes usually order integers.
    top_set = topsort(V,E,set)

    top_stack = topsort(V,E,stack_collection)

    top_queue = topsort(V,E,queue_collection)

    random_results = []
    for i in range(0,10):
      random_results.append(topsort(V,E,random_orderd_collection))

    print
    print 'V: ', V
    print 'E: ', E
    print 'top_set ({0}): {1}'.format(is_topsorted(V,E,top_set),top_set)
    print 'top_stack ({0}): {1}'.format(is_topsorted(V,E,top_stack),top_stack)
    print 'top_queue ({0}): {1}'.format(is_topsorted(V,E,top_queue),top_queue)

    for random_result in random_results:
      print 'random_result ({0}): {1}'.format(is_topsorted(V,E,random_result),random_result)
      assert is_topsorted(V,E,random_result)

    assert is_topsorted(V,E,top_set)
    assert is_topsorted(V,E,top_stack)
    assert is_topsorted(V,E,top_queue)



main()

score 2 · Accepted Answer

BFS从源节点逐级遍历，使得节点按照与源的距离顺序出现，这也意味着父节点出现在其下一级的子节点之前。

这可能看起来像我们在拓扑排序中所需要的，但是，请继续关注我。上一句中的下一个级别是关键，因为如果一个节点及其子节点与源处于同一级别，那么 BFS 不会强制执行遍历它们的顺序，这意味着它可能会将节点的子节点呈现在自身之前，这将是直接违规对于拓扑排序的规则，当我们想要拓扑排序时，排序确实很重要。

尽管 BFS 和拓扑排序之间似乎存在某种关系，但这种关系相当薄弱。

algorithm - BFS与拓扑排序的关系

2 回答 2

Related

Reference