c++ - Finding vector entries that occur in another vector without nested for loops

Question

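I have a piece of code that I'm porting from Fortran to C++, and I'd like to avoid some of the nested for-loop structures I had to create in the original F77 code.

The problem is this: I have a vector of objects called nodes. Each object contains a vector holding (among other important information) the indices of the other node objects it is connected to (a connection graph). Like this: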

struct Node {
    vector<int> conNode;
};
vector<Node> listOfNodes;
vector<int> nodeListA;    // a subset of nodes of interest stored as their vector indices

I need to find the nodes that the nodes in nodeListA are connected to, but only if those nodes are also in nodeListA. Right now, my code looks something like this:

// Loop over the subset of node indices
for (int i=0; i<nodeListA.size(); i++) {
    // Loop over the nodes connected to the node i
    for (int j=0; j<listOfNodes[nodeListA[i]].conNode.size(); j++) {
        // Loop over the subset of node indices again
        for (int k=0; k<nodeListA.size(); k++) {
            // and determine if any of node i's connections are in the subset list
            if (nodeListA[k] == listOfNodes[nodeListA[i]].conNode[j]) {
               // do stuff here
            }
        }
    }
}

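There must be a much simpler way to do this. It seems like I'm making it far too complicated. How can I simplify this code, possibly using the standard algorithm library?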

score 1 · Accepted Answer

If your variable is meant to express a set of values, use std::set instead of std::vector. Then you'll have

typedef std::set<int> SetOfIndices;
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    for (int j = 0; j < node.conNode.size(); ++j)
    {
        if (setOfIndices.find(node.conNode[j]) != setOfIndices.end())
        {
            // do stuff here
        }
    }
}

Edit: As Jerry Coffin suggested, std::set_intersection can be used in the outer loop:

typedef std::set<int> SetOfIndices;
struct Node {
    SetOfIndices conNode;
};
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    std::vector<int> interestingNodes;

    std::set_intersection(setOfIndices.begin(), setOfIndices.end(),
                      node.conNode.begin(), node.conNode.end(),
                      std::back_inserter(interestingNodes));

    for (int j = 0; j < interestingNodes.size(); ++j)
    {
        // do stuff here
    }
}

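Another edit
Regarding efficiency: it depends on which operation dominates. The number of times the part marked "do stuff here" executes will not change. The difference is in the time spent traversing your collections: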

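Your original code - nodeListA.size()^2 * [average conNode size]
My first solution - nodeListA.size() * log(nodeListA.size()) * [average conNode size]
With Jerry Coffin's suggestion - nodeListA.size() * (nodeListA.size() + [average conNode size]), because each std::set_intersection call is linear in the combined size of the two ranges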

So using set_intersection does not seem to help in this case.
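The logarithmic lookup in the first solution does not strictly require switching containers. As a minimal sketch, assuming the subset can be copied and sorted once, the same cost can be reached with a sorted vector and std::binary_search. The function name linksInsideSubset and the choice to return (node, neighbour) pairs are illustrative assumptions, not part of the original code.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the nodes this node connects to
};

// Hypothetical helper: returns every (node, neighbour) pair for which both
// indices appear in `subset`. `subset` is taken by value so that the
// caller's vector is left unsorted.
std::vector<std::pair<int, int>> linksInsideSubset(
    const std::vector<Node>& listOfNodes, std::vector<int> subset)
{
    // std::binary_search requires a sorted range.
    std::sort(subset.begin(), subset.end());

    std::vector<std::pair<int, int>> links;
    for (int nodeId : subset) {
        for (int neighbour : listOfNodes[nodeId].conNode) {
            // Logarithmic membership test on the sorted vector.
            if (std::binary_search(subset.begin(), subset.end(), neighbour)) {
                links.emplace_back(nodeId, neighbour);  // "do stuff here"
            }
        }
    }
    return links;
}
```

The sort costs nodeListA.size() * log(nodeListA.size()) once, after which each lookup is logarithmic, as it is with std::set.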

score 1 · Accepted Answer

Instead of std::set, I would suggest std::unordered_set for nodeListA. Below is a C++11 code example.

#include <unordered_set>
#include <vector>

struct Node {
  std::vector<int> conNode;
};

int main()
{
  std::vector<Node>       listOfNodes;
  std::unordered_set<int> nodeListA;

  for (int node_id : nodeListA)
    for (int connected_id : listOfNodes[node_id].conNode)
      if (nodeListA.find(connected_id) != end(nodeListA))
        /* Do stuff here.. */
          ;

  return 0;
}

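The advantage of using a std::unordered_set is that lookups (i.e. searching for a given node ID) take constant time on average. However, the implementation that ships with your standard library may not be particularly fast. Google's sparse hash and dense hash implementations are alternatives that provide the same interface and are known to work very well for most purposes: http://code.google.com/p/sparsehash/

Depending on what you want to do with the resulting nodes, the inner loop of the std::unordered_set code above can be replaced by an STL algorithm. For example, if you want to put all the nodes identified by the algorithm into a vector, you can code it as follows (use this as a replacement for both loops):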
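Because the node IDs in the question are plain indices into listOfNodes, hashing can be avoided altogether. The following is a sketch of that alternative (a membership mask, not a hash set), assuming every ID in the subset and in conNode is a valid index and that the subset holds no duplicates. The name countLinksInsideSubset is hypothetical, and counting stands in for "do stuff here".

```cpp
#include <cstddef>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the nodes this node connects to
};

// Hypothetical helper: counts the connections whose two endpoints are both
// in `subset`. Assumes all indices are valid for `listOfNodes`.
std::size_t countLinksInsideSubset(const std::vector<Node>& listOfNodes,
                                   const std::vector<int>& subset)
{
    // inSubset[i] is 1 when node i belongs to the subset, 0 otherwise.
    std::vector<char> inSubset(listOfNodes.size(), 0);
    for (int nodeId : subset) {
        inSubset[nodeId] = 1;
    }

    std::size_t count = 0;
    for (int nodeId : subset) {
        for (int neighbour : listOfNodes[nodeId].conNode) {
            if (inSubset[neighbour]) {  // constant-time lookup, no hashing
                ++count;                // "do stuff here"
            }
        }
    }
    return count;
}
```

The trade-off is one byte of extra memory per node in listOfNodes, which may matter when the subset is tiny compared with the whole graph.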

std::vector<int> results;
for (int node_id : nodeListA)
  std::copy_if(begin(listOfNodes[node_id].conNode),
               end(listOfNodes[node_id].conNode),
               back_inserter(results),
               [&nodeListA](int id){return nodeListA.find(id) != end(nodeListA);});

Again, this is C++11 syntax; it uses a lambda as a function argument.
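For reference, the fragment above needs <algorithm> and <iterator> to compile. A self-contained sketch might look like this, where the wrapper function connectedWithinSubset is a hypothetical name introduced only for illustration:

```cpp
#include <algorithm>
#include <iterator>
#include <unordered_set>
#include <vector>

struct Node {
    std::vector<int> conNode;  // indices of the nodes this node connects to
};

// Hypothetical wrapper: collects every neighbour of a subset node that is
// itself a member of the subset.
std::vector<int> connectedWithinSubset(const std::vector<Node>& listOfNodes,
                                       const std::unordered_set<int>& nodeListA)
{
    std::vector<int> results;
    for (int node_id : nodeListA) {
        const std::vector<int>& neighbours = listOfNodes[node_id].conNode;
        // Copy only the neighbours that are also in nodeListA.
        std::copy_if(neighbours.begin(), neighbours.end(),
                     std::back_inserter(results),
                     [&nodeListA](int id) { return nodeListA.count(id) != 0; });
    }
    return results;
}
```

Note that a node ID appears in the result once per connection that reaches it, and that the order of the result follows the unspecified iteration order of std::unordered_set.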
