2

假设我有两组数据,每个条目都包含一个权重。每组按权重升序排列。我将列出几个示例集:

Set 0: {4, 8, 19, 25, 25, 26}    
Set 1: {1, 2, 3, 8, 20, 27}

我想找到这两组之间的每一对,但我想按照它们的总重量从小到大的顺序找到这些对。所以 4+1、4+2、4+3、8+1 等。你可以假设这些集合是 c++ std::multiset,但除此之外,我认为它在任何语言中都是一样的。

现在,我考虑过使用 2 个迭代器并从第一个迭代器开始迭代,然后按顺序每秒钟配对一次,等等,但这不会从最小的总权重开始按顺序计算每一对,因为在这个例子中是Set 0: 4+ Set 1: 8> Set 0: 8+ 。Set 1: 1我总是可以将结果转储到为我进行排序的容器中,但这似乎效率低下。我还有其他的优化取决于能否做到这一点。有没有一种聪明的方法可以在最后没有额外的排序?

编辑:我需要得到每一个可能的配对,所以它不像增加一个迭代器或另一个迭代器来获得最小的和那么简单。这将错过大多数(一半?)对。虽然可能有某种迭代器堆栈......

4

2 回答 2

2

Let us denote the 2 sorted lists A (size m) and B (size n).

The algorithm:

  1. Calculate the sum of A[0] and B[i] for all i = 0 to n - 1. You need to identify which element from list A the sum includes - one way is to attach the index (which is 0 for all sums here). Push all the pairs into a min-heap which is ordered by the sum.
  2. Pop the heap, and take out the sum. Use the index attached to calculate the sum with the next element in list A. If the index is m - 1 already, no need to do anything; otherwise increment the index and push it back into the heap.
  3. Repeat step 2 until the heap is empty (for all m * n values).

Step 1 will be O(n log n).

Step 2 is at most O(log n) since the size of the heap may decrease and never increase. Repeating step 2 by m * n times result in time complexity of O(mn log n).

Overall complexity is O(mn log n).

By using the smaller list as list B in the algorithm above, we can achieve a slightly better time complexity (we only need to manage a small heap instead of a big heap).

An implementation using std::priority_queue (by Stacker):

#include <iostream>
#include <set>
#include <queue>
#include <limits>

std::multiset<int> set1 {4, 8, 19, 25, 25, 26};
std::multiset<int> set2 {1, 2, 3, 8, 20, 27};

struct Node
{
  Node(std::multiset<int>::const_iterator set1Iterator, int set2Value) :
    set1Iterator(set1Iterator),
    set2Value(set2Value),
    sum(*set1Iterator + set2Value)
  {
  }

  bool operator < (const Node& other) const
  {
    return sum > other.sum; // Reverse order as std::priority_queue sorts for the greatest value
  }

  std::multiset<int>::const_iterator set1Iterator;
  int set2Value;
  int sum;
};

int main()
{
  std::priority_queue<Node> heap;
  for (auto i = set2.begin(); i != set2.end(); ++i)
    heap.push(Node(set1.begin(), *i));

  while (!heap.empty())
  {
    Node min(heap.top());
    heap.pop();

    std::cout << *min.set1Iterator << " + " << min.set2Value << " = " <<
      min.sum << std::endl;
    if (++min.set1Iterator != set1.end())
    {
      min.sum = *min.set1Iterator + min.set2Value;
      heap.push(min);
    }
  }
  return 0;
}
于 2012-10-12T19:51:52.613 回答
2
#include <iostream>
#include <set>
#include <vector>
#include <limits>

std::multiset<int> set1 {4, 8, 19, 25, 25, 26};
std::multiset<int> set2 {1, 2, 3, 8, 20, 27};

int main()
{
  std::vector<std::pair<std::multiset<int>::const_iterator,
    std::multiset<int>::const_iterator>> setIterators;
  for (auto i = set1.begin(); i != set1.end(); ++i)
    setIterators.push_back(std::make_pair(i, set2.begin()));

  for (std::size_t count = set1.size() * set2.size(); count != 0; --count)
  {
    int minValue = std::numeric_limits<int>::max();
    auto minElement = setIterators.begin();
    for (auto i = setIterators.begin(); i != setIterators.end(); ++i)
    {
      if (i->second == set2.end())
        continue;
      int sum = *i->first + *i->second;
      if (sum < minValue)
      {
        minValue = sum;
        minElement = i;
      }
    }
    std::cout << *minElement->first << " + " << *minElement->second << " = " <<
      minValue << std::endl;
    ++minElement->second;
  }
  return 0;
}

输出:

$ g++ -std=c++11 main.cpp -o main && ./main
4 + 1 = 5
4 + 2 = 6
4 + 3 = 7
8 + 1 = 9
8 + 2 = 10
8 + 3 = 11
4 + 8 = 12
...
26 + 27 = 53
于 2012-10-12T19:30:24.873 回答