algorithm - 相关网络实现

Question

我一直在研究我的图形/网络问题，我想我终于知道我想做什么了。现在我正在进入实现，我在决定使用哪些库时遇到了问题。该图本身非常简单，每个节点都由一个字符串标记，每个节点都是两个节点（变量）之间的概率/相关系数，并且是无向的。我想在图上执行的操作是：

插入新节点/边（快速）
找到所有对最短（1/概率）路径，并记住路径中的节点 - 可能是约翰逊的算法
为 k 个特定顶点构造最小权重 Steiner 树
- 使用约翰逊算法构建最短路径
- 遍历路径 p 中的当前节点，找到到 k 中剩余节点的最短路径
查看图表的平均度数
评估节点的介数
获取聚类系数
找到图的模块化

对于其中的许多，我想将结果与 Erdos-Renyi 模型进行比较，并将其作为零假设进行测试。此外，能够通过马尔可夫场使用统计力学定义会很有帮助，因为这样我就可以计算两个不相同的节点之间的相关性，并询问关于熵等的图形问题。所以一个好的映射到某种马尔可夫字段库上也会很有用。

目前问题的症结在于我正在尝试找到一个可以使用的 C++ 库。我看过 R，但我想要一些更健壮、更快的东西。我正在考虑的三个库是：

柠檬
- 易于使用和安装
- 简单明了的文档
- 有一些我想要的功能
- 通过读取文本文件动态创建图形，并确保没有重复的节点，这是我无法弄清楚的噩梦
Boost 图形库
- 对象的难以处理、晦涩难懂的定义，以及如何使用它们
- 文档与代码的功能不匹配，必然
- 有很多我想要的算法，以及从文本文件创建图形的非常简单的方法
多线程图形库
- 并行性已经合并
- 比 BGL 更容易阅读
- 没有那么多功能
- 仍然是神秘的

再往前走，我设想图生活在分布式网络上，具有分布式存储（hadoop 或其他东西）。我怀疑整个图不适合内存，所以我必须想出一个缓存方案来查看图的某些部分。

对于我描述的问题，人们会建议什么图书馆？只使用 BGL 并编写我自己的函数会更好吗？多线程版本呢？是否有任何图书馆更适合我想做的工作类型，尤其是我想要计算的数量？

原帖

谢谢！

Edit1 所以我对 BGL 感到非常沮丧。我有一个邻接列表图，我想在图上运行我自己版本的约翰逊（或弗洛伊德，在这一点上，我并不挑剔），并返回距离矩阵供我查看。除了我无法让它工作。到目前为止，这是我的完整代码实现：

using namespace boost;

int main()
{
//Read in the file
std::ifstream datafile("stuff");
if (!datafile)
{
    std::cerr << "No Stuff file" << std::endl;
    return EXIT_FAILURE;
}

//Build the graph
typedef adjacency_list < vecS, vecS, undirectedS, property < vertex_name_t,
    std::string >, property < edge_weight_t, double > > Graph;
Graph g;

//Build the two properties we want, string and double
//Note, you have to nest properties for more
typedef property_map< Graph, vertex_index_t >::type vertex_index_map_t;
vertex_index_map_t vertex_index_map = get(vertex_index, g);
typedef property_map < Graph, vertex_name_t >::type name_map_t;
name_map_t name_map = get(vertex_name, g);
typedef property_map < Graph, edge_weight_t >::type probability_map_t;
probability_map_t probability = get(edge_weight, g);

//Map of of the vertices by string
typedef graph_traits < Graph >::vertex_descriptor Vertex;
typedef std::map < std::string, Vertex > NameVertexMap;
NameVertexMap AllNodes;

//Load the file into the graph
for (std::string line; std::getline(datafile, line);)
{
    char_delimiters_separator < char >sep(false, "", ";");
    tokenizer <> line_toks(line, sep);
    tokenizer <>::iterator i = line_toks.begin();
    std::string conditionA = *i++;
    NameVertexMap::iterator pos;
    bool inserted;
    Vertex u, v;

    boost::tie(pos, inserted) = AllNodes.insert(std::make_pair(conditionA, Vertex()));
    if (inserted)
    {
        u = add_vertex(g);
        name_map[u] = conditionA;
        pos->second = u;
    }
    else
    {
        u = pos->second;
    }

    std::string correlation = *i++;
    std::istringstream incorrelation(correlation);
    double correlate;
    incorrelation >> correlate;

    boost::tie(pos, inserted) = AllNodes.insert(std::make_pair(*i, Vertex()));
    if (inserted) {
        v = add_vertex(g);
        name_map[v] = *i;
        pos->second = v;
    }
    else
    {
        v = pos->second;
    }

    graph_traits < Graph >::edge_descriptor e;
    boost::tie(e, inserted) = add_edge(u, v, g);
    if (inserted)
        probability[e] = 1.0/correlate;
}

typedef boost::graph_traits<Graph>::edge_iterator edge_iter;
std::pair<edge_iter, edge_iter> edgePair;
Vertex u, v;
for(edgePair = edges(g); edgePair.first != edgePair.second; ++edgePair.first)
{
    u = source(*edgePair.first, g);
    v = target(*edgePair.first, g);
    std::cout << "( " << vertex_index_map[u] << ":" << name_map[u] << ", ";
    std::cout << probability[*edgePair.first] << ", ";
    std::cout << vertex_index_map[v] << ":" << name_map[v] << " )" << std::endl;
}  
}

其中输入文件的格式为 NodeA;correlation;NodeB。我在上面粘贴的代码有效，但是当我尝试包含 johnson_all_pairs_shortest_paths 功能时遇到了严重的麻烦。我真正想要的不仅是一个 DistanceMatrix D（我似乎无法正确构造它，我希望它是一个双精度方阵 double D[V][V], V = num_vertices(g)，但它给了我回我没有正确调用该函数），还有沿着该路径采用的节点列表，类似于 wiki 文章中关于Floyd 算法的内容路径重构。我是否应该尝试为这个问题推出自己的算法，因为我无法确定功能是否存在（更不用说如何进行函数调用）？BGL 的文档与实现一样迟钝，所以我真的没有任何现代示例可以继续。

algorithm - 相关网络实现

0 回答 0

Related

Reference