java - Finding common elements in two unsorted arrays

Question

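I'm trying to find a solution to this problem: I have two integer arrays A and B (A and B can have different sizes). I have to find the common elements in these two arrays. I have one more condition: the maximum distance between common elements is k. So, this is my solution. I think it is correct: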

for (int i = 0; i<A.length; i++){
    for (int j=jlimit; (j<B.length) && (j <= ks); j++){
        if(A[i]==B[j]){
            System.out.println(B[j]);
            jlimit = j;
            ks = j+k;
        }//end if
    }
}

Is there a way to make a better solution? Any suggestions? Thanks in advance!

score 5 · Accepted Answer

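Implement binary search and quicksort!

This will lead to a lot of code... but the fastest result.

You can sort the elements of the larger array with something like quicksort, which costs O(n log n).

Then iterate over the smaller array and, for each value, binary search for that element in the other array. Add some logic for the distance to the binary search.

I think you can get the complexity down to O(n log n). Worst case O(n^2).

Pseudocode.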

larger array equals a
other array equals b

sort a

iterate through b
       binary search a for the value of b at the iterated index
     // I would throw (last index - index) logic in binary search
     // to exit out of that even faster by returning "NOT FOUND" as soon as that is hit.
       if found && (last index - index) is less than or equal to k
          store last index
          print value

I believe this is the fastest way to solve your problem.
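The sort-then-search idea above can be sketched in Java as follows. This is only a minimal sketch: the class and method names are made up for illustration, and it deliberately ignores the k-distance condition, because sorting discards the original positions that the distance would be measured against.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortAndSearchSketch {

    // Returns the values of the smaller array that also occur in the larger one,
    // in the order they appear in the smaller array (duplicates are kept).
    // Assumption: the k-distance condition is NOT enforced here.
    static List<Integer> findCommon(int[] a, int[] b) {
        int[] larger = a.length >= b.length ? a : b;
        int[] smaller = a.length >= b.length ? b : a;

        // Sort a copy so the caller's array is left untouched: O(n log n).
        int[] sorted = Arrays.copyOf(larger, larger.length);
        Arrays.sort(sorted);

        List<Integer> common = new ArrayList<>();
        for (int value : smaller) {
            // Arrays.binarySearch returns a non-negative index when the key is present.
            if (Arrays.binarySearch(sorted, value) >= 0) {
                common.add(value);
            }
        }
        return common;
    }

    public static void main(String[] args) {
        int[] a = { 1, 4, 5, 2, 7, 3, 9 };
        int[] b = { 5, 2, 4, 9 };
        System.out.println(findCommon(a, b)); // prints [5, 2, 4, 9]
    }
}
```

To honour the distance condition as well, one option would be to sort (value, original index) pairs instead of bare values, so the original index is still available after a hit.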

score 5 · Accepted Answer

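Given your explanation, I think the most direct approach is to read array A and put all of its elements into a Set (setA), do the same for B (setB), and use the retainAll method to find the intersection of the two sets (the items that belong to both).

You'll notice that the k distance isn't used at all, but I can't see a way to use that condition that leads to faster or more maintainable code. The solution I advocate works without enforcing that condition, so it also works when the condition holds (this is called "weakening the precondition").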
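A minimal sketch of the Set approach described above, assuming int[] inputs. The class and method names are made up for illustration; retainAll is the standard java.util.Set method.

```java
import java.util.HashSet;
import java.util.Set;

class RetainAllSketch {

    // Builds one set per array, then keeps only the elements of setA that are also in setB.
    static Set<Integer> intersection(int[] a, int[] b) {
        Set<Integer> setA = new HashSet<>();
        for (int value : a) {
            setA.add(value);
        }
        Set<Integer> setB = new HashSet<>();
        for (int value : b) {
            setB.add(value);
        }
        setA.retainAll(setB); // setA now holds the intersection
        return setA;
    }

    public static void main(String[] args) {
        int[] a = { 1, 4, 5, 2, 7, 3, 9 };
        int[] b = { 5, 2, 4, 9, 5 };
        // Contains 2, 4, 5 and 9; HashSet does not guarantee an iteration order.
        System.out.println(intersection(a, b));
    }
}
```

Each value appears at most once in the result, since a Set drops duplicates.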

score 2 · Accepted Answer

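Although this is a cheat, since it uses HashSets, it is a pretty nice Java implementation of this algorithm. If you need pseudocode for the algorithm, read no further.

Source and author are in the JavaDoc. Cheers.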

import java.util.HashSet;

/**
 * @author Crunchify.com
 */
public class CrunchifyIntersection {

    public static void main(String[] args) {
         Integer[ ] arrayOne = { 1, 4, 5, 2, 7, 3, 9 };
         Integer[ ] arrayTwo = { 5, 2, 4, 9, 5 };

         // findCommon is the static method defined below in this class
         Integer[ ] common = findCommon( arrayOne, arrayTwo );

         System.out.print( "Common Elements Between Two Arrays: " );       
         for( Integer entry : common ) {
              System.out.print( entry + " " );
         }
   }

   public static Integer[ ] findCommon( Integer[ ] arrayOne, Integer[ ] arrayTwo ) {

        Integer[ ] arrayToHash;
        Integer[ ] arrayToSearch;

        if( arrayOne.length < arrayTwo.length ) {
            arrayToHash = arrayOne;
            arrayToSearch = arrayTwo;
        } else {
            arrayToHash = arrayTwo;
            arrayToSearch = arrayOne;
        }

        HashSet<Integer> intersection = new HashSet<Integer>( );

        HashSet<Integer> hashedArray = new HashSet<Integer>( );
        for( Integer entry : arrayToHash ) {
            hashedArray.add( entry );
        }

        for( Integer entry : arrayToSearch ) {
            if( hashedArray.contains( entry ) ) {
                intersection.add( entry );
            }
        }

        return intersection.toArray( new Integer[ 0 ] );
    }
 }

score 2 · Accepted Answer

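Your implementation is roughly O(A.length*2k).

If you want to keep the "no more than k away" logic, that seems to be about the best you can do, since the condition rules out sorting and using sets. I would make a few changes to make your code easier to understand.

First, I would make sure you iterate over the smaller of the two arrays. That makes the complexity O(min(A.length, B.length)*2k).

To see the point of this, consider the case where A has 1 element and B has 100. In that case we perform only one iteration of the outer loop and k iterations of the inner loop.

Now consider the case where A has 100 elements and B has 1. In that case we perform 100 iterations of the outer loop and 1 iteration of the inner loop each time.

If k is smaller than the length of the longer array, it is more efficient to iterate over the shorter array in the outer loop.
Then, for readability, I would change the way you compute the k distance. The code I wrote demonstrates this.

Here is what I would do: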

//not sure what type of array we're dealing with here, so I'll assume int.
int[] toIterate;
int[] toSearch;

if (A.length > B.length)
{
    toIterate = B;
    toSearch = A;
}
else
{
    toIterate = A;
    toSearch = B;
}

for (int i = 0; i < toIterate.length; i++)
{
    // set j to k away in the negative direction
    int j = i - k;

    if (j < 0) 
        j = 0;

    // only iterate until j is k past i
    for (; (j < toSearch.length) && (j <= i + k); j++)
    {
        if(toIterate[i] == toSearch[j])
        {
            System.out.println(toSearch[j]);
        }
    }
}

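Your use of jlimit and ks would probably work, but handling the k distance like this is easier for the average programmer to understand (and slightly more efficient).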
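For anyone who wants to try this out, here is the same windowed search wrapped in a method with sample data. It is a sketch under the assumption that "distance" means the difference between the two array indices, as in the snippet above. The class and method names are made up, and it collects matches in a list instead of printing them.

```java
import java.util.ArrayList;
import java.util.List;

class WindowedSearchSketch {

    // Collects every value of the shorter array that also appears in the longer
    // array at an index no more than k away (assumption: distance = |i - j|).
    static List<Integer> commonWithinK(int[] a, int[] b, int k) {
        int[] toIterate = a.length > b.length ? b : a;
        int[] toSearch = a.length > b.length ? a : b;

        List<Integer> found = new ArrayList<>();
        for (int i = 0; i < toIterate.length; i++) {
            // Clamp the window [i - k, i + k] to the bounds of toSearch.
            int from = Math.max(0, i - k);
            int to = Math.min(toSearch.length - 1, i + k);
            for (int j = from; j <= to; j++) {
                if (toIterate[i] == toSearch[j]) {
                    found.add(toSearch[j]);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[] a = { 1, 2, 3, 4, 5 };
        int[] b = { 5, 9, 1, 9, 9, 9 };
        System.out.println(commonWithinK(a, b, 2)); // prints [1]
        System.out.println(commonWithinK(a, b, 4)); // prints [1, 5]
    }
}
```

With k = 2 the value 5 is missed because it sits at index 4 in one array and index 0 in the other; widening the window to k = 4 picks it up.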

score 1 · Accepted Answer

O(N) solution (BloomFilters):

Here is a solution using a Bloom filter (the implementation comes from the Guava library)

public static <T> T findCommon_BloomFilterImpl(T[] A, T[] B, Funnel<T> funnel) {
    BloomFilter<T> filter = BloomFilter.create(funnel, A.length + B.length);
    for (T t : A) {
        filter.put(t);
    }
    for (T t : B) {
        // mightContain can report false positives, so a hit is only a probable match
        if (filter.mightContain(t)) {
            return t;
        }
    }
    return null;
}

Use it like this:

    Integer j = Masking.findCommon_BloomFilterImpl(new Integer[]{12, 2, 3, 4, 5222, 622, 71, 81, 91, 10}, new Integer[]{11, 100, 15, 18, 79, 10}, Funnels.integerFunnel());
    Assert.assertNotNull(j);
    Assert.assertEquals(10, j.intValue());

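This runs in O(N), since computing the hash of an Integer is trivial. If you can reduce the hash computation for an element to O(1), or to a small O(K) where K is the size of each element, it is still O(N).

O(N.LogN) solution (sort and iterate):

Sorting and then traversing the array leads to an O(N*log(N)) solution: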

public static <T extends Comparable<T>> T findCommon(T[] A, T[] B, Class<T> clazz) {
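    // Note: a value repeated inside A alone (or B alone) is also reported as common,
    // so this assumes neither input contains duplicates of its own.
    T[] array = concatArrays(A, B, clazz);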
    Arrays.sort(array);
    for (int i = 1; i < array.length; i++) {
        if (array[i - 1].equals(array[i])) {     //put your own equality check here
            return array[i];
        }
    }
    return null;
}

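concatArrays(~) is of course O(N). Arrays.sort(~) is O(N.logN) (for object arrays such as T[] it is a stable merge sort; the dual-pivot QuickSort implementation is used for primitive arrays), and traversing the array again is O(N).

So we have O(N + N.logN + N) ~> O(N.logN).

As a solution for the general case (without the "within k" condition of your problem) it is better than yours. In your precise case, it should be considered when k is "close to" N.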
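The findCommon method above calls concatArrays without showing it. One possible version of that helper is sketched below. The signature is inferred from the call site and the body is an assumption, not the answerer's code.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

class ConcatArraysSketch {

    // Hypothetical helper: copies a followed by b into a new array of type clazz[].
    @SuppressWarnings("unchecked")
    static <T> T[] concatArrays(T[] a, T[] b, Class<T> clazz) {
        T[] result = (T[]) Array.newInstance(clazz, a.length + b.length);
        System.arraycopy(a, 0, result, 0, a.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    public static void main(String[] args) {
        Integer[] joined = concatArrays(new Integer[] { 3, 1 }, new Integer[] { 2 }, Integer.class);
        System.out.println(Arrays.toString(joined)); // prints [3, 1, 2]
    }
}
```

Both System.arraycopy calls are linear, which matches the O(N) cost claimed for the concatenation step.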

score 1 · Accepted Answer

A simple solution if the arrays are already sorted

 public static void get_common_courses(Integer[] courses1, Integer[] courses2) {
        // Sort both arrays if input is not sorted 
        //Arrays.sort(courses1);
        //Arrays.sort(courses2);
        int i=0, j=0;
        while(i<courses1.length && j<courses2.length) {
            if(courses1[i] > courses2[j]) {
                j++;
            } else if(courses1[i] < courses2[j]){
                i++;
            } else {
                System.out.println(courses1[i]);
                i++;j++;
            }
        }
}

The Apache Commons Collections API does this efficiently, without sorting

    public static Collection intersection(final Collection a, final Collection b) {
    ArrayList list = new ArrayList();
    Map mapa = getCardinalityMap(a);
    Map mapb = getCardinalityMap(b);
    Set elts = new HashSet(a);
    elts.addAll(b);
    Iterator it = elts.iterator();
    while(it.hasNext()) {
        Object obj = it.next();
        for(int i=0,m=Math.min(getFreq(obj,mapa),getFreq(obj,mapb));i<m;i++) {
            list.add(obj);
        }
    }
    return list;
}
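The snippet above relies on the helpers getCardinalityMap and getFreq, which are not shown. The same counting idea can be sketched with only java.util, as below. All names here are made up for illustration and this is not the Commons Collections source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CardinalitySketch {

    // Counts how many times each element occurs.
    static <T> Map<T, Integer> countOccurrences(Iterable<T> items) {
        Map<T, Integer> counts = new HashMap<>();
        for (T item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // Each common element appears min(count in a, count in b) times in the result.
    static <T> List<T> intersection(List<T> a, List<T> b) {
        Map<T, Integer> countsA = countOccurrences(a);
        Map<T, Integer> countsB = countOccurrences(b);
        List<T> result = new ArrayList<>();
        for (Map.Entry<T, Integer> entry : countsA.entrySet()) {
            int times = Math.min(entry.getValue(), countsB.getOrDefault(entry.getKey(), 0));
            for (int i = 0; i < times; i++) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 5 occurs twice in the first list and three times in the second, so it is kept twice.
        System.out.println(intersection(Arrays.asList(5, 5, 2, 7), Arrays.asList(5, 5, 5, 2)));
    }
}
```

Unlike the Set-based answers, this keeps repeated elements, which matters if the arrays are treated as multisets.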

score 1 · Accepted Answer

A solution using Java 8

static <T> Collection<T> intersection(Collection<T> c1, Collection<T> c2) {
    if (c1.size() < c2.size())
        return intersection(c2, c1);
    Set<T> c2set = new HashSet<>(c2);
    return c1.stream().filter(c2set::contains).distinct().collect(Collectors.toSet());
}

Use it with Arrays::asList and boxed values of primitives:

Integer[] a =...    
Collection<Integer> res = intersection(Arrays.asList(a),Arrays.asList(b));
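Note that Arrays.asList on an int[] yields a List<int[]> with a single element, not a List<Integer>, which is why the call above needs Integer[]. If the data really is int[], a stream-based variant along these lines may be more convenient. This is a sketch, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

class PrimitiveIntersectionSketch {

    // Boxes both int[] arrays through IntStream and keeps the values of a found in b.
    static Set<Integer> intersection(int[] a, int[] b) {
        Set<Integer> bSet = Arrays.stream(b).boxed().collect(Collectors.toSet());
        return Arrays.stream(a)
                .boxed()
                .filter(bSet::contains)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        int[] a = { 1, 3, 4, 2 };
        int[] b = { 1, 19, 4, 5 };
        // Contains 1 and 4; Collectors.toSet() makes no ordering guarantee.
        System.out.println(intersection(a, b));
    }
}
```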

score 0 · Accepted Answer

Generic solution

public static void main(String[] args) {
    String[] a = { "a", "b" };
    String[] b = { "c", "b" };
    String[] intersection = intersection(a, b, a[0].getClass());
    System.out.println(Arrays.toString(intersection));
    Integer[] aa = { 1, 3, 4, 2 };
    Integer[] bb = { 1, 19, 4, 5 };
    Integer[] intersectionaabb = intersection(aa, bb, aa[0].getClass());
    System.out.println(Arrays.toString(intersectionaabb));
}

@SuppressWarnings("unchecked")
private static <T> T[] intersection(T[] a, T[] b, Class<? extends T> c) {
    HashSet<T> s = new HashSet<>(Arrays.asList(a));
    s.retainAll(Arrays.asList(b));
    return s.toArray((T[]) Array.newInstance(c, s.size()));
}

Output

[b]
[1, 4]
