c - Find the numbers that occur more than N/3 times

Question

Given a random array of size N, find the numbers that occur more than N/3 times. For example:

{1,2,14,12,12,15,12,12,8} the result is 12

Does anyone have a more efficient algorithm? This is how I do it:

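#include <cstdlib>   // rand
#include <utility>   // std::swap

// Seed the generator once in the caller (e.g. srand(time(0)) in main),
// not on every recursive call.
int getNum(int *arr, int left, int right, const int size)
{
    if(left > right)
    {
        return -1;   // empty range: nothing to search
    }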
    int index = rand()%(right - left + 1) + left;
    std::swap(arr[left], arr[index]);
    int flag = arr[left];
    int small = left;
    int big = right;
    int equal = left;
    while(equal <= big)
    {
        if(arr[equal] == flag)
        {
            equal++;
        }
        else if(arr[equal] < flag)
        {
            std::swap(arr[equal++], arr[small++]);
        }
        else
        {
            while(big > equal && arr[big] > flag)
            {
                big--;
            }
            std::swap(arr[big], arr[equal]);
            big--;
        }
    }
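    if(equal - small > size / 3)
    {
        return arr[small];
    }
    if(small - left > size / 3)
    {
        // Try the left part first, but fall through to the right part
        // if no qualifying number is found there.
        int found = getNum(arr, left, small - 1, size);
        if(found != -1)
        {
            return found;
        }
    }
    if(right - equal + 1 > size / 3)
    {
        return getNum(arr, equal, right, size);
    }
    return -1;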
}

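First I define three markers, small, equal and big. I pick a random number as the pivot (flag) and partition the range so that all occurrences of the pivot end up in their final position. When equal - small > size / 3, the pivot is the number we are looking for; otherwise I recurse into whichever side holds more than size / 3 elements.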

score 5 · Accepted Answer

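There is actually an algorithm by Karp-Papadimitriou-Shenker that finds, in a single pass, the items that make up more than a 1/k fraction of the data. It can of course be applied with k=3.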

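However, the algorithm can give false positives (it may report an item as frequent although it is not). These are easily eliminated with a second pass over the data that counts the candidates it reports (at most k - 1 of them, so two for k = 3).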

The algorithm is as follows:

pf = {}
for each element e:
  if pf.containsKey(e): 
     pf.put(e, pf.get(e)+1) //increase the value by 1
  else:
     pf.put(e,1)
     if pf.size() == k:
         for each key in pf:
              pf.put(key, pf.get(key)-1) //decrease all elements by 1
              if pf.get(key) == 0: //remove elements with value 0
                 pf.remove(key)
output pf

More information about the algorithm above, along with a proof, can be found on this page, slides 8-12.

Even with the second pass, the algorithm runs in O(n) time and uses O(k) extra space (in your case k == 3).
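The pseudocode above, together with the second verification pass, could be sketched in C++ roughly as follows. This is a minimal sketch, not a reference implementation: the function name frequentItems and the choice of std::unordered_map are assumptions made for illustration.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns every value that occurs more than n/k times.
std::vector<int> frequentItems(const std::vector<int>& data, std::size_t k) {
    std::vector<int> result;
    if (k < 2) return result;  // no value can occur more than n times

    // Pass 1: keep at most k - 1 counters, as in the pseudocode.
    std::unordered_map<int, std::size_t> pf;
    for (int e : data) {
        auto it = pf.find(e);
        if (it != pf.end()) {
            ++it->second;
        } else {
            pf.emplace(e, 1);
            if (pf.size() == k) {
                // k distinct keys are present: decrement all, drop zeros.
                for (auto cur = pf.begin(); cur != pf.end();) {
                    if (--cur->second == 0) {
                        cur = pf.erase(cur);
                    } else {
                        ++cur;
                    }
                }
            }
        }
    }

    // Pass 2: recount the surviving candidates to remove false positives.
    for (auto& entry : pf) entry.second = 0;
    for (int e : data) {
        auto it = pf.find(e);
        if (it != pf.end()) ++it->second;
    }
    for (const auto& entry : pf) {
        // count > n / k, written without division to avoid rounding issues
        if (entry.second * k > data.size()) result.push_back(entry.first);
    }
    return result;
}
```

Calling frequentItems on the array from the question with k = 3 yields a vector containing only 12. The order of the results is unspecified when more than one value qualifies, because the candidates come out of a hash map.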

score 1 · Accepted Answer

Another (probabilistic) algorithm: pick 50 random values from the array.

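Take the value that occurs most often in this sample (an O(1) operation, because 50 is a constant) and check whether it meets the condition in the original array, which takes one O(n) pass. This will work on the first try with a probability of about 99%. If it fails, take the second most frequent value from the small (50-element) array and try that, and so on. The overall complexity is O(n), but the method needs to be modified if it is possible that no value in the original array meets the criterion.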
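The sampling idea could look roughly like this in C++. It is a sketch under stated assumptions: the name findBySampling, the use of std::mt19937, sampling with replacement, and the -1 return value for "nothing found" are all choices made here for illustration, not part of the answer above.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns one value occurring more than n/3 times,
// or -1 if no sampled candidate qualifies (ambiguous if -1 is valid data).
int findBySampling(const std::vector<int>& data, std::mt19937& rng,
                   std::size_t sampleSize = 50) {
    if (data.empty()) return -1;

    // Draw sampleSize random elements (with replacement) and tally them.
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    std::unordered_map<int, int> tally;
    for (std::size_t i = 0; i < sampleSize; ++i) ++tally[data[pick(rng)]];

    // Order the sampled values from most to least frequent in the sample.
    std::vector<std::pair<int, int>> ranked(tally.begin(), tally.end());
    std::sort(ranked.begin(), ranked.end(),
              [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                  return a.second > b.second;
              });

    // Verify each candidate against the full array, one O(n) pass each.
    for (const auto& candidate : ranked) {
        auto occurrences = std::count(data.begin(), data.end(), candidate.first);
        if (static_cast<std::size_t>(occurrences) * 3 > data.size()) {
            return candidate.first;
        }
    }
    return -1;  // the sample contained no qualifying value
}
```

Because every candidate is verified before it is returned, the function never reports a wrong value; the randomness only affects whether a qualifying value is found. At most sampleSize verification passes are made, so the total work stays O(n) for a constant sample size.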

score 0 · Accepted Answer

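My solution is to sort the elements; if the element at index i+N/3 equals the element at index i, then that element appears more than N/3 times.

#include <stdio.h>
#include <stdlib.h>

/* Three-way comparison; unlike plain subtraction it cannot overflow. */
int compar(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

int main() {
    int N = 9;
    int N3 = N / 3;
    int tab[] = {1,2,14,12,12,15,12,12,8};

    qsort(tab, N, sizeof(int), compar);

    int i = 0;
    while (i + N3 < N) {
        if (tab[i] == tab[i + N3]) {
            /* tab[i..i+N3] are all equal: more than N/3 occurrences. */
            int v = tab[i];
            printf("%d\n", v);
            /* Skip the rest of this run so the value is printed once. */
            while (i < N && tab[i] == v) {
                i++;
            }
        } else {
            i++;
        }
    }

    return 0;
}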

The complexity is O(n log n) because of the sort. If the table is already sorted, it is linear.

score 0 · Accepted Answer

My first attempt at this problem used a HashMap. The code is as follows:

public int O_N_Memory_Solution(final List<Integer> a){
    int repetationCount = a.size()/3;
    HashMap<Integer,Integer> map = new HashMap<Integer,Integer>();
    for(int i = 0 ; i < a.size() ; i++){
        if(map.containsKey(a.get(i)))   map.put(a.get(i),map.get(a.get(i))+1);
        else    map.put(a.get(i),1);
        if(map.get(a.get(i))>repetationCount) return a.get(i);
    }
    return -1;
}

The code is short, but it uses more memory (O(n) for the map) and more time.

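I did not know there was an algorithm for this problem; here is my implementation of the Karp-Papadimitriou-Shenker Algorithm.

The main idea of the algorithm is the observation that removing K distinct elements from the array does not change the answer.

Here K equals 3, and we are trying to find any element that occurs more than n/3 times in the array.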
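A short sketch of why discarding three distinct elements is safe (the symbols x, c and n are introduced here for illustration and are not from the original answer). Suppose x occurs c times among n elements with c > n/3. Three distinct elements contain at most one copy of x, so after the removal x occurs at least c - 1 times among n - 3 elements, and:

```latex
c > \frac{n}{3}
\;\Longrightarrow\;
c - 1 > \frac{n}{3} - 1 = \frac{n - 3}{3}
```

So x still exceeds one third of what remains. The converse does not hold: a value that survives the removals need not be frequent, which is why the candidates are checked against the full array afterwards.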

// Karp-Papadimitriou-Shenker Algorithm
public int O_1_Memory_Solution(final List<Integer> a){
    if(a.size() == 0) return -1;

    int firstInt = 0, secondInt = 0;
    int firstCount = 0;
    int secondCount = 0;
    int current;

    for(int i = 0; i < a.size(); i++){

        current = a.get(i);

        // You should check 1st before setting so that, if one of the two integers is empty, 
        // you increment the non empty integer if the current matches it, not adding the current to
        // the empty one.

        if(current == firstInt && firstCount!=0) {
            firstCount++;
        } else if(current == secondInt && secondCount!=0) {
            secondCount++;
        } else if(firstCount == 0) {
            firstInt = current;
            firstCount = 1;
        } else if(secondCount == 0) {
            secondInt = current;
            secondCount = 1;
        } else {
            firstCount--;
            secondCount--;
        }

    }

    int repetationCount = a.size()/3;
    int[] candidates = {firstInt,secondInt};
    int ac;
    /* Check actual counts of potential candidates */
    for (int i = 0; i < candidates.length; i++) {
        // Calculate actual count of elements 
        ac = 0;  // actual count
        for (int j = 0; j < a.size(); j++)
            if (a.get(j) == candidates[i])
                ac++;

        // If actual count is more than n/k, then return it
        if (ac > repetationCount) return candidates[i];
    }

    return -1;
}
