algorithm - 查找数组中的多数元素

Question

我想在这里讨论一个我在数据结构书中找到的算法。本书介绍了在大小为 N 的数组中查找多数元素（出现大于 N/2 ）的算法草图。算法示意图如下：

首先，找到候选多数元素（这是较难的部分）。这个候选人是唯一可能成为多数人的元素。第二步确定这个候选人是否真的是多数人。这只是对数组的顺序搜索。要在数组 A 中找到候选者，请形成第二个数组 B。然后比较 A1,A2。如果它们相等，则将其中之一添加到 B；否则什么也不做。然后比较 A3 和 A4。同样，如果它们相等，则将其中之一添加到 B；否则什么也不做。以这种方式继续，直到读取整个数组。然后递归寻找B的候选者；这是A的候选人。

我发现如果 N 是偶数，算法工作正常。但是如果 N 是奇数呢？我们该如何处理这种情况？

score 4 · Accepted Answer

多数元素：

大小为 n 的数组 A[] 中的多数元素是出现超过 n/2 次的元素（因此最多有一个这样的元素）。

寻找候选人：

在 O(n) 中工作的第一阶段算法称为摩尔投票算法。该算法的基本思想是，如果我们用与 e 不同的所有其他元素来抵消元素 e 的每次出现，那么如果 e 是多数元素，则 e 将一直存在到结束。

findCandidate(a[], size)
1.  Initialize index and count of majority element
     maj_index = 0, count = 1
2.  Loop for i = 1 to size – 1
    (a)If a[maj_index] == a[i]
        count++
    (b)Else
        count--;
    (c)If count == 0
        maj_index = i;
        count = 1
3.  Return a[maj_index]

上面的算法循环遍历每个元素并维护a[maj_index]的计数，如果下一个元素相同则增加计数，如果下一个元素不相同则减少计数，如果计数达到0则将maj_index更改为当前元素并将计数设置为 1。第一阶段算法为我们提供了一个候选元素。在第二阶段，我们需要检查候选人是否真的是多数派。

第二阶段很简单，可以在 O(n) 中轻松完成。我们只需要检查候选元素的计数是否大于 n/2。

score 1 · Accepted Answer

大小为 n 的数组 A[] 中的多数元素是出现超过 n/2 次的元素

 int find(int[] arr, int size) { 
  int count = 0, i, mElement;
 for (i = 0; i < size; i++) {
    if (count == 0)
        mElement = arr[i];
    if (arr[i] == mElement) 
        count++;
    else
        count--;
  }
 count = 0;
 for (i = 0; i < size; i++)
    if (arr[i] == mElement)
         count++;
  if (count > size/2)
        return mElement;
  return -1;
 }

score 0 · Accepted Answer

Here are three algorithms that solve the problem:

1) Scan the elements of the array, accumulating a frequency of each distinct element using some sort of dictionary (hash table, balanced tree). When the scan is complete, pick the only dictionary item with a count greater than n/2, or report that no element has a majority.

2) The majority element must be the median of the array if it exists. Compute the median using the median-of-five technique or any other algorithm, and confirm that it is greater than n/2.

3) An algorithm of Boyer and Moore (the same guys that did the string-matching algorithm) picks the first array element as the candidate item and assigns a count of 1, then scans the array, adding 1 to the count if the next item is the same as the current candidate, subtracts 1 from the count if the next item differs from the current candidate, and resets the candidate to the next array element, with a count of 1, whenever the count reaches 0. At the end of the scan the current candidate is a member of the majority if a majority exists, so a second scan is required to confirm that the candidate does form a majority.

I discuss and implement all three algorithms at my blog.

score 0 · Accepted Answer

您解释的算法适用于以下想法：

如果两个连续元素相等，则这是潜在的多数元素并将其包含在结果集中（用于进一步处理）
如果两个元素是不同的，那么这对特定的元素不会决定获胜者。

第 2 点至关重要 - 假设 (A1, A2) 对包含多数元素 (A1) 和另一个少数元素 (A2)。删除这对。A1 在阵列的其余部分中仍然占多数。

如果你理解了这个解释，那么你就会明白为什么当元素不同时什么都没有添加到 B 中。

对于包含奇数个元素的数组，该算法的推论是：向 B 添加一个奇数元素。原因与 #1 相同：它可能是多数元素，它应该包含在输出中只是为了确保其计数不会在此过程中丢失。

无论如何，整个事情可以在没有另一个数组 B 的情况下完成。在这里你可以找到更简单的算法：Finding a Majority Element in an Array

score -1 · Accepted Answer

我认为当 N 也是如此时，您的算法将不起作用。我可以用一个例子来证明：

考虑 8 个元素的数组，例如 2,2,1,1,3,2,2,2。现在，根据您的声明，您的 B 数组将是 2,1,2。如果再次重复上述步骤。不会有任何候选人。但这里的实际答案是“2”。所以这个方法肯定是错误的。

这个问题在 Geeks for Geeks 上讨论过。我认为这会帮助你：

http://www.geeksforgeeks.org/majority-element/

algorithm - 查找数组中的多数元素

5 回答 5

Related

Reference