c++ - 两个排序数组的交集

Question

问题> 给定两个排序数组 A 和 B，返回一个数组 C，其中包含 A 和 B 共有的元素。数组 C 不能包含重复项。

这是我的解决方案，但我的直觉是它是错误的。但我找不到反例来反对它。有人可以为我指出吗？或者给我一个反例？

更新：

该算法的工作原理如下：

我们持有一个指向每个数组的指针并将这些指针向前移动，直到找到一个公共元素。然后，如果公共元素不在 C 中，则将找到的元素存储在 C 中。否则，依赖于元素，我们相应地向前移动指针。

#include <iostream>
#include <vector>
#include <random>
#include <iterator>
#include <algorithm>
using namespace std;

vector<int> Intersect(const vector<int>& vecIntsA, const vector<int>& vecIntB)
{
    int indA = 0;
    int indB = 0;
    vector<int> vecIntC;

    while(indA < vecIntsA.size() && indB < vecIntB.size())
    {
        if (vecIntsA[indA] == vecIntB[indB]) {
            if ( vecIntC.empty() || vecIntC.back() != vecIntsA[indA])
                vecIntC.emplace_back(vecIntsA[indA]);
            indA++;
            indB++;
        } else if (vecIntsA[indA] < vecIntB[indB]) 
            indA++;
        else // (vecIntsA[indA] > vecIntB[indB]) 
            indB++;        
    }

    return vecIntC;
}

int main()
{
   default_random_engine dre;
   uniform_int_distribution<int> dist(0, 100);

   vector<int> vecIntA;
   for(int i=0; i < 20; ++i)
    vecIntA.emplace_back(dist(dre));   
   sort(vecIntA.begin(), vecIntA.end());
   copy(vecIntA.cbegin(), vecIntA.cend(), ostream_iterator<int>(cout, ","));
   cout << endl;

   vector<int> vecIntB;
   for(int i=0; i < 24; ++i)
    vecIntB.emplace_back(dist(dre));   
   sort(vecIntB.begin(), vecIntB.end());
   copy(vecIntB.cbegin(), vecIntB.cend(), ostream_iterator<int>(cout, ","));
   cout << endl;

   vector<int> vecIntC = Intersect(vecIntA, vecIntB);
   copy(vecIntC.cbegin(), vecIntC.cend(), ostream_iterator<int>(cout, ","));

   return 0;
}

score 1 · Accepted Answer

1

您总是可以使用 STL 算法 set_intersection 和 unique？

于 2013-10-09T04:49:47.730 回答

score 1 · Accepted Answer

这是一个具有时间复杂度 (p+q) 的快速解决方案，其中 p 和 q 分别是数组 A 和 B 的长度。

#include <iostream>
#include <vector>
#include <set>
#include <algorithm>
using namespace std;

set<int> intersect(vector<int> &A, vector<int> &B) {
    int j = 0;
    vector<int> V;
    for(int i = 0;i<A.size();i++){
        first:
        if(j == B.size()) break;
        if(A[i] == B[j]){
            V.push_back(A[i]);
            j++;
        }
        else if(A[i]>B[j]) { j++;goto first;}
    }
    set<int> S(V.begin(), V.end());
    return S;
}

int main() {
    vector<int> A,B;
    A = {1,2,3,3,4,5,6};
    B = {3,3,5,6};
    set<int> S;
    S = intersect(A,B);     
    set<int>::iterator iter;
    for(iter=S.begin(); iter!=S.end();++iter){
        cout<<(*iter)<<" ";
    }

    return 0;
}

这是一个2-pointer解决方案。当其他循环向前移动时，尝试在其中一个循环中寻找单调性。如果你发现了，你已经找到了你的优化。快乐编码:)

score 0 · Accepted Answer

你的算法看起来很合理。值得一提的是，我最近解决了完全相同的问题，并针对两个数组长度相似的情况提出了类似的算法。

一般来说，如果您想支持您的算法产生好的解决方案的信念，请以您可以自动检查的方式表达好的解决方案的基本属性。然后针对这些属性编写自动化测试。（这是QuickCheck推广的一种很棒的测试方式。）

例如，对于这个问题，我将数组交集的基本属性表示如下：“给定一个交集函数f，对于所有排序的数组A，B，f(A, B) == sorted(set(A) & set(B))。” （在 Python 中，set(xs)从的元素生成一个集合，应用于集合xs的&运算符计算交集。）本质上，我将所需的数组交集语义映射到 Python 的内置语义中，用于排序和集合交集。这样，我可以用廉价、现成的部件为我的实现构建一个正确性预言机。最后一步是构建随机测试用例并检查映射是否适用于所有测试用例（通过咨询 oracle）。

下面是对应的代码：

def check_function(f):
# fundamental property:
# forall sorted arrays A, B. intersect(A, B) == sorted(set(A) & set(B))
from math import factorial
from random import randrange
from nose.tools import assert_equal
for N in xrange(8):
    for _ in xrange(factorial(N)):  # get decent sample of problem space
        m, n = randrange(N + 1), randrange(N + 1)
        A = sorted(randrange(N + 1) for _ in xrange(m))
        B = sorted(randrange(N + 1) for _ in xrange(n))
        got = f(A, B)
        expected = sorted(set(A) & set(B))
        assert_equal(got, expected)

c++ - 两个排序数组的交集

3 回答 3

Related

Reference