4

假设我有以下图表(箭头表示连接方向)并且我想计算黑色节点簇的大小:

无向方格图。

它在内存中组织为节点列表,使得每个节点都有一个其邻居节点的列表。我想从任何节点开始计算有多少节点node[i].State == 1Node.GetClusterSize()):

public class Node
{
    public Int32 State { get; private set; } // 0 = white; 1 = black;

    public Boolean Visited { get; private set; }

    public List<Node> InputNeigh { get; private set; } // list of references to
                                                       // neighbors nodes

    public Int32 GetClusterSize()
    {
        this.Visited = true;
        if (this.State == 1)
        {
            Int32 s = 1, i = 0;
            while (i < this.InputNeigh.Count)
            {
                if (!this.InputNeigh[i].Visited)
                {
                    s += this.InputNeigh[i].GetClusterSize();
                }
                i++;
            }
            this.Visited = false; // this is an important step, I'll explain why
            return s;
        }
        else
        {
            return 0;
        }
    }
    public void Evolve() { /* doesn't matter for this question */ }
}

现在,我需要将节点标记为未访问,因为我在主模拟的每个时间步计算每个节点的集群大小(节点的状态随着时间而变化,因此集群可能会在下一个时间步改变大小)。

Node如果我有一个外部布尔列表,而不是对象中的标志,则可以轻松解决此问题,给定元素i对应于 node i: List<Boolean> nodeStatus,并将此列表作为对 function 的引用传递Node.GetClusterSize()。但是,我必须在每个时间步重置这个列表,减慢代码速度(性能很重要!)。

上述代码的失败正是在遍历其邻居后将该节点标记为未访问。使用以下树(从左到右访问并假设我首先调用node[0].GetClusterSize())可以更好地可视化这种情况:

计算集群大小的算法

深度优先搜索遍历上面树中的蓝色路径,当它到达节点时3,它知道它的所有邻居都已经被访问过,标记3为未访问并返回s = 1。由于32要访问的下一个邻居,并且3被标记为未访问(尽管它已经被访问过),它再次检查并且算法进入StackOverflow异常,或者最多返回错误的集群大小。

因此,我想出了两个想法,虽然我仍然不知道如何实现它们:

1)实现广度优先搜索算法;虽然我不知道如何将这个概念应用于所呈现的情况。

2)以顺序方式(非递归)实现深度优先搜索。尽管如此,我无法想象这是怎么可能的。

你有什么想法可以解决这个问题吗?有什么建议吗?

先感谢您!

PS:可能比这个例子大得多,网络中可能同时存在多个黑色集群,彼此断开连接。因此,这不仅仅是计算黑色元素的问题。

4

3 回答 3

9

不要改变您尝试查询的对象;这种方式是疯狂的,因为正如你所注意到的,你必须取消对象的变异。

这样看。您定义了一个关系。如果一个黑色节点另一个黑色节点之间直接存在任何边,则它们与另一个黑色节点相关。当给定一个黑色节点时,您希望计算此关系的自反和传递闭包的大小。

在您的示例中,关系似乎也是对称的,因此闭包将定义一个等价类,然后您的问题是“给定一个成员,找到其等价类的大小。

所以让我们解决更普遍的问题。

什么是关系?正如评论者指出的那样,关系正确地是一组有序对。但是将您的关系视为一个函数是很方便的,当给定一个元素时,它会为您提供与其相关的所有元素的序列。在这种情况下:给定一个黑色节点,关系函数为您提供所有相邻黑色节点的序列。

这里我们有一个非递归方法,当给定一个项目和一个关系时,它会计算该关系的传递闭包:

static HashSet<T> TransitiveClosure<T>(
    Func<T, IEnumerable<T>> relation,
    T item)
{
    var closure = new HashSet<T>();
    var stack = new Stack<T>();
    stack.Push(item); 
    while(stack.Count > 0)
    {
        T current = stack.Pop();
        foreach(T newItem in relation(current))
        {
            if (!closure.Contains(newItem))
            {
                closure.Add(newItem);
                stack.Push(newItem);
            }
        }
    } 
    return closure;
} 

请注意,这是带有循环检测的非递归深度优先遍历。


练习:您可以对该实现进行哪些简单的更改,以将其变成具有循环检测的非递归广度优先遍历?


我们很容易创建自反和传递闭包:

static HashSet<T> TransitiveAndReflexiveClosure<T>(
    Func<T, IEnumerable<T>> relation,
    T item)
{
  var closure = TransitiveClosure(relation, item);
  closure.Add(item);
  return closure;
}

练习:你的关系是对称的,这意味着当我们从一个节点 X 开始并访问一个邻居 Y 时,当我们处理 Y 时,它会将 X 放回堆栈中,并最终进入闭包。因此没有必要采取反身闭包。

前面的论点是不正确的;有必要采取反身封闭该论证中的哪个句子包含第一个错误?


现在你有了一个可以很容易调用的方法:

var cluster = TransitiveAndReflexiveClosure<Node>(
    node => from n in node.InputNeigh where n.State == node.State select n,
    someNode);

现在,如果您想要的话,您可以简单地询问集群的大小。

(并且请更改名称InputNeigh。缩写是不酷的,哟,除非你是 13 岁。)

于 2013-09-30T23:01:56.327 回答
3

顺序广度优先搜索,使用列表重置已访问标志:

public int GetClusterSize()
{
    if (State != 1) return 0;

    List<Node> cluster = new List<Node>();

    Stack<Node> stack = new Stack<Node>();
    stack.Push(this);
    while (stack.Count > 0)
    {
        Node node = stack.Pop();
        if (node.Visited) continue;

        node.Visited = true;
        cluster.Add(node);
        foreach (var neigh in node.InputNeigh)
        {
            if (neigh.State == 1 && !neigh.Visited)
            {
                stack.Push(neigh);
            }
        }
    }

    int clusterSize = cluster.Count;
    foreach (var node in cluster)
    {
        node.Visited = false;
    }

    return clusterSize;
}

另一种方法是使用生成标签而不是已访问标志。如果生成与目标匹配,则认为该节点已访问。使用这种方法,您无需在算法完成后重置该值。

private static int NextGeneration = 0;

public int Generation { get; private set; }

public int GetClusterSize()
{
    return GetClusterSizeInternal(NextGeneration++);
}

private int GetClusterSizeInternal(int target)
{
    if (State != 1) return 0;

    Generation = target;

    int sum = 0;
    foreach (var neigh in InputNeigh)
    {
        if (neigh.State == 1 && neigh.Generation != target)
        {
            sum += neigh.GetClusterSizeInternal(target);
        }
    }

    return sum;
}

相同,但没有递归:

private static int NextGeneration = 0;

public int Generation { get; private set; }

public int GetClusterSize()
{
    if (State != 1) return 0;

    int target = NextGeneration++;

    Stack<Node> stack = new Stack<Node>();
    stack.Push(this);

    int count = 0;
    while (stack.Count > 0)
    {
        Node node = stack.Pop();
        if (node.Generation == target) continue;

        node.Generation = target;

        count++;
        foreach (var neigh in node.InputNeigh)
        {
            if (neigh.State == 1 && neigh.Generation != target)
            {
                stack.Push(neigh);
            }
        }
    }

    return count;
}
于 2013-09-30T21:21:05.667 回答
1

Rather than List you might consider System.Collections.BitArray

You might try something like this (pseudo code):

Stack<T> stk
stk.Push(node1)
grandparentnode = null
count = 0
if node1 state is 1 count++
while (node = stk.Pop())
    foreach connect in node.connections
        if the connect state is 1
            if connect != grandparentnode 
                stk.Push(connect)
                count++
    grandparentnode = node

I think that will work, but graphs are hard and my brain is very small and filled with errors :-(

Adding to post based on comment.

The grandparent node was a misguided attempt to eliminate the "Visited" field by maintaining a constant rolling grandparent/parent/child relationship. I'm a good programmer but perhaps don't have a mind for graph theory (which is why I'm drawn to such questions :-D) Anyway here is a complete program with revised code which I reached working with my original ideas. It depends on the grid like, double connected arrangement of your nodes and the idea of having a unique numerically increasing label for each. Those constraints might be too specific for your usage. I used a dictionary, but it should never contain more than 4 items. Created a custom class optimized for such a small collection would improve performance. This should remove the need to keep a 'visited' state, run quickly and not recurse. To find any subtree you need to start at the lowest label node of that tree.

It is of course also possible it arrived at the right answer out of luck :-) Worst case if gives anyone interested a full program skeleton to play with.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Xml.Linq;

namespace stackoverflow1 {
    class Program {
        class Node {
            public int Number;
            public int State = 0;
            public List<Node> Connects = new List<Node>();
            public Node(int num, int state) {
                Number = num;
                State = state;

                }
            }

        static void Main(string[] args) {
            var nodes = new List<Node>();
            nodes.Add(new Node(0, -1)); // not used
            nodes.Add(new Node(1, 1));
            nodes.Add(new Node(2, 1));
            nodes.Add(new Node(3, 1));
            nodes.Add(new Node(4, 0));
            nodes.Add(new Node(5, 1));
            nodes.Add(new Node(6, 1));
            nodes.Add(new Node(7, 0));
            nodes.Add(new Node(8, 0));
            nodes.Add(new Node(9, 0));
            nodes[1].Connects.Add(nodes[2]);
            nodes[1].Connects.Add(nodes[4]);

            nodes[2].Connects.Add(nodes[1]);
            nodes[2].Connects.Add(nodes[3]); 
            nodes[2].Connects.Add(nodes[5]);

            nodes[3].Connects.Add(nodes[2]); 
            nodes[3].Connects.Add(nodes[6]);

            nodes[4].Connects.Add(nodes[1]);
            nodes[4].Connects.Add(nodes[5]); 
            nodes[4].Connects.Add(nodes[7]);

            nodes[5].Connects.Add(nodes[2]);
            nodes[5].Connects.Add(nodes[4]); 
            nodes[5].Connects.Add(nodes[6]);
            nodes[5].Connects.Add(nodes[8]);

            nodes[6].Connects.Add(nodes[3]);
            nodes[6].Connects.Add(nodes[5]); 
            nodes[6].Connects.Add(nodes[9]);

            nodes[7].Connects.Add(nodes[4]);
            nodes[7].Connects.Add(nodes[8]); 

            nodes[8].Connects.Add(nodes[5]);
            nodes[8].Connects.Add(nodes[7]); 
            nodes[8].Connects.Add(nodes[9]);

            nodes[9].Connects.Add(nodes[6]);
            nodes[9].Connects.Add(nodes[8]); 

            var dict = new Dictionary<int, Node>();
            foreach (var n in nodes) {
                if (n.State == 1) {
                    dict.Add(n.Number, n);
                    break;
                    }
                }

            int count = dict.Count;
            while (dict.Count > 0) {
                foreach (var k in dict.Keys.ToArray()) { // retains node order
                    var n = dict[k]; // get the first node in number order
                    dict.Remove(k);
                    foreach (var node in n.Connects) { // look over it's connections/children
                        if ((node.State == 1) 
                        &&  (node.Number > n.Number))  {
                            if (dict.ContainsKey(node.Number) == false) {
                                // only add if this is has a greater number than the one
                                // being considered because lower values have already been
                                // processed
                                dict.Add(node.Number, node);
                                count++;
                                }
                            }
                        }
                    }
                }

            Console.WriteLine("Count = {0}", count);
            Console.ReadKey();
            }
        }
}
于 2013-09-30T22:06:49.203 回答