algorithm - Appropriate graph representation for network flow algorithms

Question

When implementing Ford-Fulkerson or Dinitz algorithms for maximum network flow there are two operations that need to be performed on the graph:

Iterate over all neighbours of a given vertex
Find the reverse edge of a given edge(this is needed for the graph modification when a flow is added along an augmenting path).

Ideally the first operation will be linear with respect to the number of neighbours and the second one should be constant. In addition the memory required for the graph representation should be linear with respect to the number of edges(note that for most practical applications of max network flow algorithms I have seen the number of edges is about linear times the number of vertices). All the estimations for the complexity for the two algorithms will only hold if the constraints above are met.

The problem is that none of the classical graph representations is able to meet the above requirements:

Adjacency Matrix

Using adjacency matrix, finding the reverse edge of a given edge can be done in constant time. However iterating over all neighbours is linear with respect to the number of all vertices and also the required memory is quadratic with respect to the number of vertices.

Edges list

Using this representation, iterating over all neighbours will not be linear with respect to the number of neighbours and also finding the reverse edge for a given edge is not constant.

Neighbours list

Using this representation we can iterate over all neighbours in linear time and also the needed memory is linear with respect to the number of edges. Still, finding the reverse edge for a given edge will be linear with respect to the number of neighbours of the destination vertex.

With a slight modification of this representation we could improve that - if instead of neighbours list we keep some binary search tree of the neighbours we could find the reverse edge with logarithmic complexity. And even better - if we use hash map instead of binary search tree we will have constant amortized complexity. Still this representation does not feel right - although still linear, a hash map has some memory overhead. Also it only has amortized constant complexity thus some operations may in fact be slower.

The question

So my question is: what graph representation is appropriate to implement maximum network flow algorithms?

score 3 · Accepted Answer

我会将 Ivaylo 的表示描述为“边缘连续”。还有一个“端点连续”表示，我相信从一个极其不科学的样本中可以得到更广泛的使用。我已经在不同的时间实现了这两个。

在没有硬数据的情况下，我的假设是端点连续表示对于通常的可疑网络流算法来说比边缘连续表示更好，因为每次扫描弧时边缘连续都会产生随机访问，并且端点连续，每次将流推到弧上（大概是在扫描弧之前）。这种表示的明显优势是它支持非对称图（与网络流不太相关）。这种表示的明显缺点是更改图的拓扑要慢得多。

该表示由两个数组组成。first，具有 n+1 个元素，存储具有指定尾部的第一条弧的索引。额外的条目是m，弧的总数，使得尾为v的弧的索引是first[v]包含到first[v+1]排他的。在 Ivaylo 的图表上，

[0] = 0->1, [1] = 0->2,
[2] = 1->0, [3] = 1->2, [4] = 1->3,
[5] = 2->0, [6] = 2->1, [7] = 2->3, [8] = 2->4,
[9] = 3->1, [10] = 3->2, [11] = 3->5,
[12] = 4->2, [13] = 4->5,
[14] = 5->3, [15] = 5->4,

数组first是

0, 2, 5, 9, 12, 14, 16.

弧本身存储在以下结构类型的 m 元素数组中。

struct Arc {
    int head;
    int capacity;
    int symmetric;
};

symmetric是对称弧的索引。

score 2 · Accepted Answer

我使用的表示在某种程度上是边缘列表和邻居列表之间的混合。它没有我知道的正式名称，所以我不会命名它。它设法满足上述所有要求，并且只需要使用数组——一种存在于大多数（如果不是全部）流行编程语言中的结构。我将c++用于说明，但代码应该很容易翻译成其他语言。对于这个答案，我将假设顶点编号0为N-1并且我们的图有M边。

我们存储的图将被定向为在处理网络流时，通常边和它的反向边具有不同的容量（这些容量加起来就是边的初始容量）。

一个边缘

当我们处理网络流算法时，每条边都有容量（cap）。此外，对于每条边，我将存储其目标 vertex( to) 和另一个我将调用的值next。我们也可以选择添加源顶点，但由于图形的表示方式，它不是必需的。我将假设所有这些值都适合一种int类型：

struct edge {
  // destination vertex
  int to;
  // capacity
  int cap;
  // next edge
  int next;
};

存储图表

我会将所有边存储在一个数组中，此外，我还有一个数组，用于存储每个顶点的邻居列表的“头”元素。我将用“头”元素命名数组first。first应该使用一些不是有效顶点编号的值进行初始化，例如 -1：

int first[N];
// in c++ we can also use memset
for (int i = 0; i < N; ++i) {
  first[i] = -1;
}

由于最大网络流量算法的实现方式，我们应该为每条边添加一个容量为 0 的反向边。出于这个原因，我们存储边的数组的大小实际上是2*M：

edge edges[M * 2];

现在我建议的表示有两个关键点：

我们形成一个给定顶点的邻居的（单个）链表，每个链表的头（第一个）元素的索引存储在数组中first。
对于每条边，我们在边的数组中添加它的反向边

将元素添加到单个链表中，因此add_edge函数中只有一个小警告 - 我们还应该添加反向边缘。为了简化代码，我假设我们有一个变量edges_num来表示我们已经添加的边数，我将使用它，就好像它是一个全局变量一样。我实现了一个add_edge函数，它接受三个参数——源顶点、目标顶点和边的容量：

int edges_num = 0;
inline void add_edge(int from, int to, int cap) {
  edges[edges_num].to = to;
  edges[edges_num].cap = cap;
  edges[edges_num].next = first[from];
  first[from] = edges_num++;

  edges[edges_num].to = from;
  edges[edges_num].cap = 0;
  edges[edges_num].next = first[to];
  first[to] = edges_num++;
}

请注意，反向边缘的容量0通常是它的初始化方式。这几乎就是我们使用此表示存储图形所需的全部内容。

一个例子

在此处输入图像描述

让我们看看这两个数组的内容first和edges将如何变化：

在添加任何边缘之前：

first:                edges:
 0  1  2  3  4  5     []
-1 -1 -1 -1 -1 -1

让我们添加容量为 7 的边 0 -> 2。我将两个步骤分开 - 添加直边和反向边：

first:                edges:
 0  1  2  3  4  5     [{to: 2, cap: 7, next: -1}]
 0 -1 -1 -1 -1 -1

现在是反向边缘：

first:                edges:
 0  1  2  3  4  5     [{to: 2, cap: 7, next: -1}, {to: 0, cap: 0, next: -1}]
 0 -1  1 -1 -1 -1

现在让我们添加 0->1（容量 5）：

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}]
 2 -1  1 -1 -1 -1

请注意，索引为 2 的边的下一个值为 0，表示 0 是源为 0 的下一条边。我将继续添加边：

2->1 容量 1：

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}, {1, 1, 1},
 2  5  4 -1 -1 -1      {2, 0, -1}]

现在快进添加 2->3(容量 11)、2->4(容量 8)、1->3(容量 4)、4->5(容量 3)和 3->5(容量 6)相同的顺序：

first:                edges:
 0  1  2  3  4  5     [{2, 7, -1}, {0, 0, -1}, {1, 5, 0}, {0, 0, -1}, {1, 1, 1},
 2 10  8 14 12 15      {2, 0, -1}, {3, 11, 4}, {2, 0, -1}, {4, 8, 6}, {2, 0, -1},
                       {3, 4, 5}, {1, 0, 7}, {5, 3, 9}, {4, 0, -1}, {5, 6, 11}, 
                       {3, 0, 13}]

希望这个例子能清楚地说明表示是如何工作的。

遍历所有邻居

对给定顶点 v 的所有邻居的迭代很简单 - 只是对单个链表的迭代：

for (int cv = first[v]; cv != -1; cv = edges[cv].next) {
   // do something
}

很明显，这个操作在邻居的数量上是线性的。

访问反向边缘

利用反向边缘总是在直边缘之后添加的事实，反向边缘索引的公式非常简单。e索引为 in的边缘的反向边缘是索引edges为的边缘e ^ 1。这适用于访问直边的反向和反向边的反向。同样，这显然是恒定的并且非常容易编码。

内存消耗

所需的内存是O(M + N)- 我们有edgessizeM*2和firstthat 的 size N。当然N < M，对于任何合理的图，所以整体内存复杂度是O(M). 此外，内存消耗将比使用哈希映射作为邻居列表的解决方案的内存消耗少得多（至少两倍）。

概括

该图形表示以尽可能高的复杂性实现了所有必需的操作，并且内存开销很小。表示的另一个优点是它仅使用大多数语言中内置的非常基本的结构 - 数组。这种结构也可以用于其他算法，但是快速访问反向边缘对于图算法特别有用。