java - Faster algorithm for counting active calls

Question

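We are implementing a density report for a call center. The result must be shown as a table with one row per day, showing the maximum number of calls that were active simultaneously on that day.

We are building the library behind the UI. The contract specifies that we receive the number of calls for the day and two integer arrays, one with the start time and one with the end time of each call. For example:

On a given day only two calls are received: one from time 20 to 30 and another from 10 to 20. The maximum number of simultaneous calls is 1.

On another day two calls are also received, one from 10 to 45 and another from 15 to 40. The maximum number of simultaneous calls is then 2.

The contract for the web service is this: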

public static int GetMaxDensity(int N, int[] X, int[] Y)

The data looks like this (assuming 3 calls were received that day). The first runs from 10 to 25, the second from 12 to 30, and the third from 20 to 23.

N = 3, 
X = {10, 12, 20}
Y = {25, 30, 23}

The return value must be: 3.

I have implemented this solution:

public static int GetMaxDensity(int N, int[] X, int[] Y) 
{
  int result = 0;
  for (int i = 0; i < N; i++) 
  {
    int count = 0, t = X[i];
    for (int j = 0; j < N; j++) 
    {
      if (X[j] <= t && t < Y[j])
        count++;
    }
    result = Math.max(count, result);
  }
  return result;
}

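It works well when the number of calls is up to 1000 (weekends), but on weekdays the number is much larger and the computation takes a long time (> 5 minutes). I know the reason is probably that my solution uses two nested loops, but I don't have much experience with complex algorithms, so my question is:

Given that I only need the maximum number of simultaneous calls (not the times, and not the callers), what would be a faster way to perform this computation, if there is one?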

score 5 · Accepted Answer

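Your running time grows rapidly (N*N) as N grows. A simple solution (if your times are in minutes after midnight) is to create an array of 1440 ints that holds the change in call count for each minute of the day. Then you can loop once from 0 to N-1 and, for each call, adjust the call count delta at the relevant points in time by incrementing the value at the call's start and decrementing it at the call's end. After that, just walk through the array, keeping a running count, to get the maximum. For larger values of N this should be much faster.

Since 1440 is a constant (for the last step) and the input does not need to be sorted, this should have linear time complexity. The running time of this algorithm is not affected by the average call length.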

public static int GetMaxDensity(int N, int[] X, int[] Y) {
    int rangeStart = Integer.MAX_VALUE;
    int rangeEnd = Integer.MIN_VALUE;
    for(int i=0; i<N; i++) {
        if (X[i] < rangeStart) rangeStart = X[i];
        if (Y[i] > rangeEnd) rangeEnd = Y[i];
    } 
    int rangeSize = rangeEnd - rangeStart + 1;
    int[] histogram = new int[rangeSize];
    for (int t = 0; t < rangeSize; t++) histogram[t] = 0;
    for (int i = 0; i < N; i++) {
        histogram[X[i]-rangeStart]++;
        histogram[Y[i]-rangeStart]--;
    }
    int maxCount = 0;
    int count = 0;
    for (int t = 0; t < rangeSize; t++) {
        count += histogram[t];
        if (count > maxCount) maxCount = count;
    }
    return maxCount;        
}

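For comparison, with N=50,000 and random call lengths between 1 and 40 minutes, the algorithm in the question took 29,043 ms and this algorithm took 8 ms. I ran these tests in C#, but they should be comparable to what Java would produce.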
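
The fixed-size variant described at the start of this answer could look roughly like the sketch below. It assumes every time is a whole minute after midnight with 0 <= X[i] <= Y[i] <= 1440; the class name `MinuteDensity`, the method name and the constant are illustrative and not part of the original contract.

```java
final class MinuteDensity {
    static final int MINUTES_PER_DAY = 24 * 60; // 1440

    // Assumption: 0 <= x[i] <= y[i] <= 1440, i.e. times are minutes after midnight.
    static int maxDensity(int n, int[] x, int[] y) {
        // One extra slot so a call ending exactly at minute 1440 has somewhere to land.
        int[] delta = new int[MINUTES_PER_DAY + 1];
        for (int i = 0; i < n; i++) {
            delta[x[i]]++; // call becomes active at its start minute
            delta[y[i]]--; // call stops being active at its end minute
        }
        int count = 0;
        int max = 0;
        for (int t = 0; t <= MINUTES_PER_DAY; t++) {
            count += delta[t]; // running total = calls active during minute t
            max = Math.max(max, count);
        }
        return max;
    }
}
```

Because the array size no longer depends on the data, the first pass that searches for the smallest start and largest end is not needed, at the cost of rejecting inputs outside the assumed range.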

score 2 · Accepted Answer

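Allow me to propose a different algorithm. Given that there are at most 24*60 = 1440 minutes per day, why not make a histogram array that counts the number of simultaneous calls for each minute?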

public static int GetMaxDensity(int N, int[] X, int[] Y) 
{
  int[] h = new int[1440];
  // loop through all calls
  for (int i=0; i<N ; i++){
    addIt(X[i], Y[i], h);
  }

  // find max
  int m = 0;
  for(int i =0 ; i<1440; i++){
    if (h[i]>m)
      m = h[i];
  }
  return m;
}

// counting for one call
public static void addIt(int x, int y, int[] h){
  for ( int i=x;i<y;i++){
    h[i]++;
  }
}

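The complexity is O(m*n), where m is the average call length. Since the number of calls may be far greater than 1000, with some luck this algorithm will be faster in practice.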

score 1 · Accepted Answer

Sort all calls by start time. Walk through the list and keep a list of "active calls" sorted by end time. It should look something like this:

import java.util.Comparator;
import java.util.TreeSet;

public class DensityReport {

  static class Call {
    public Call(int x, int y, int id) {
      start = x;
      end = y;
      this.id = id;
    }
    int start;
    int end;
    int id; // input index; breaks ties so calls with equal times are not dropped by the sets
  }

  public static int getMaxDensity(int n, int[] x, int[] y) {
    // Calls sorted by start time
    TreeSet<Call> calls = new TreeSet<Call>(new Comparator<Call>() {
      public int compare(Call c1, Call c2) {
        return c1.start != c2.start ? Integer.compare(c1.start, c2.start) : Integer.compare(c1.id, c2.id);
      }
    });

    // Add all calls to the sorted set.
    for (int i = 0; i < n; i++) {
      calls.add(new Call(x[i], y[i], i));
    }

    int max = 0;
    // Active calls sorted by end time
    TreeSet<Call> activeCalls = new TreeSet<Call>(new Comparator<Call>() {
      public int compare(Call c1, Call c2) {
        return c1.end != c2.end ? Integer.compare(c1.end, c2.end) : Integer.compare(c1.id, c2.id);
      }
    });

    for (Call call: calls) {
      // Remove all calls that have ended by the time the current call starts.
      while(activeCalls.size() > 0 && activeCalls.first().end <= call.start) {
        activeCalls.pollFirst();
      }
      activeCalls.add(call);
      if (activeCalls.size() > max) {
        max = activeCalls.size();
      }
    }
    return max;
  }
}

The running time should be O(n log n).

PS: If we can assume the calls are already sorted by start time, it should be possible to simplify this.
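
As a rough sketch of that simplification: if the calls are assumed to arrive already sorted by start time, the first sorted set is unnecessary and a min-heap of end times is enough. The class name `SortedDensity` and the method name are illustrative. The sketch treats a call that ends at time t as no longer active for a call that starts at t, which matches the first example in the question.

```java
import java.util.PriorityQueue;

final class SortedDensity {
    // Assumption: x is sorted ascending and y[i] is the end of the call that starts at x[i].
    static int maxDensity(int n, int[] x, int[] y) {
        PriorityQueue<Integer> activeEnds = new PriorityQueue<>(); // min-heap of end times
        int max = 0;
        for (int i = 0; i < n; i++) {
            // Drop every call that has ended by the time call i starts.
            while (!activeEnds.isEmpty() && activeEnds.peek() <= x[i]) {
                activeEnds.poll();
            }
            activeEnds.add(y[i]);
            max = Math.max(max, activeEnds.size());
        }
        return max;
    }
}
```

Each end time is added and removed at most once, so the loop stays at O(n log n) overall, with the log factor coming only from the heap.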

score 1 · Accepted Answer

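Your algorithm is very slow because it effectively tests all possible cases, which is O(n^2).

Assuming your calls are ordered as you receive them, here is an O(n) algorithm: [Edit: the second array should be sorted as well]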

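  // Assumes x (start times) and y (end times) are each sorted in ascending order.
  static int getMaxDensity(int n, int[] x, int[] y) {
    int max = 0;
    int i=0,j=0,count=0;
    while(i<n && j<n){
        if(x[i]<y[j]){ //new call received
            count++;
            max = count>max? count:max;
            i++;
        }else if(x[i]==y[j]){ //new call received at the same time another call ends
            i++;
            j++;
        }else { //call ended
            count--;
            j++;
        }
    }
    return max;
  }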

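[Note: this code may well throw an array index out of bounds error, but it should be enough to demonstrate the idea so that you can implement the rest]

If the calls are not sorted, the algorithm is O(n lg n):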

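array_of_calldata a = x union y   // tag each entry as a call start or a call end
a.sort();                         // by time; at equal times, ends before starts
count = 0; max = 0;
foreach(calldata d in a){
    if (d is new call) count++;
    else count--;
    if (count > max) max = count;
}
return max;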
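
One way to get there in Java, sketched under the assumption that only the maximum is needed and not which call is which: sort copies of the two arrays independently and then run the same kind of linear merge as in the sorted case. The class name `SweepDensity` and the method name are illustrative.

```java
import java.util.Arrays;

final class SweepDensity {
    static int maxDensity(int n, int[] x, int[] y) {
        // Sort copies so the caller's arrays are left untouched.
        int[] starts = Arrays.copyOf(x, n);
        int[] ends = Arrays.copyOf(y, n);
        Arrays.sort(starts);
        Arrays.sort(ends);

        int count = 0;
        int max = 0;
        int j = 0; // index of the next end time that has not been processed yet
        for (int i = 0; i < n; i++) {
            // Close every call that has ended by the time the next call starts.
            while (j < n && ends[j] <= starts[i]) {
                count--;
                j++;
            }
            count++; // open the call that starts at starts[i]
            max = Math.max(max, count);
        }
        return max;
    }
}
```

Sorting the ends separately loses the pairing between a start and its end, which is fine here because the count only depends on how many starts and ends have been seen so far.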

score 0 · Accepted Answer

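Use two lists and add the X[i], Y[i] pairs to both of them. The first list is sorted by call start time, the second by end time. Walk through the lists, stepping only through the list with the lowest time.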

class Call {
    int start;
    int end;
}

Call callsSortedOnStart[];   // all calls, sorted by start time
Call callsSortedOnEnd[];     // the same calls, sorted by end time

int indexStart = 0;  // Position in the array
int indexEnd = 0;

int nCalls = 0;      // Current density of calls
int maxCalls = 0;    // Maximum density of calls

// Once every start has been consumed the density can only drop, so the scan can stop there.
while(indexStart < callsSortedOnStart.length) {
    // Close every call that has ended by the time the next call starts.
    while(indexEnd < callsSortedOnEnd.length
            && callsSortedOnEnd[indexEnd].end <= callsSortedOnStart[indexStart].start) {
        indexEnd++;
        nCalls--;
    }
    // Open the next call.
    indexStart++;
    nCalls++;
    maxCalls = Math.max(maxCalls, nCalls);
}

score 0 · Accepted Answer

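Build an array of call events. A call event is just a structure with a time field and a StartEnd field whose value is +1 for a call start and -1 for a call end. Sort this array by the time field (if the times are equal, use the second field, with end events before start events). Initialize CurrentCalls = 0. Iterate over the array, adding the StartEnd field to CurrentCalls. The maximum value of CurrentCalls during the scan is what you need.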
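
The steps above can be sketched in Java as follows. Representing each event as a two-element `int[]` of {time, delta} is a choice made for brevity, and the class name `EventDensity` and the method name are illustrative.

```java
import java.util.Arrays;

final class EventDensity {
    static int maxDensity(int n, int[] x, int[] y) {
        // Each event is {time, delta}: delta is +1 for a call start, -1 for a call end.
        int[][] events = new int[2 * n][];
        for (int i = 0; i < n; i++) {
            events[2 * i] = new int[] {x[i], +1};
            events[2 * i + 1] = new int[] {y[i], -1};
        }
        // Sort by time; at equal times the -1 (end) events sort before the +1 (start) events.
        Arrays.sort(events, (a, b) -> a[0] != b[0]
                ? Integer.compare(a[0], b[0])
                : Integer.compare(a[1], b[1]));

        int currentCalls = 0;
        int max = 0;
        for (int[] e : events) {
            currentCalls += e[1];
            max = Math.max(max, currentCalls);
        }
        return max;
    }
}
```

Putting end events first at equal times is what makes a call from 10 to 20 and a call from 20 to 30 count as one simultaneous call rather than two.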

score 0 · Accepted Answer

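Sort the calls by start time. That way, you can break out of the inner loop as soon as the start time in the inner loop falls outside the range given by the outer loop.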
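
A minimal sketch of this idea, keeping the nested loops from the question: the calls are paired up and sorted by start time, and the inner loop stops at the first call that starts after the time being tested. The class name `EarlyExitDensity` and the method name are illustrative. The worst case is still quadratic (for example when many calls overlap), so this only trims the work rather than changing the complexity class.

```java
import java.util.Arrays;

final class EarlyExitDensity {
    static int maxDensity(int n, int[] x, int[] y) {
        // Pair each start with its end, then sort the pairs by start time.
        int[][] calls = new int[n][];
        for (int i = 0; i < n; i++) {
            calls[i] = new int[] {x[i], y[i]};
        }
        Arrays.sort(calls, (a, b) -> Integer.compare(a[0], b[0]));

        int result = 0;
        for (int i = 0; i < n; i++) {
            int t = calls[i][0]; // test the moment call i starts
            int count = 0;
            for (int j = 0; j < n; j++) {
                if (calls[j][0] > t) {
                    break; // every later call starts after t, so none of them is active at t
                }
                if (t < calls[j][1]) {
                    count++; // call j started at or before t and has not ended yet
                }
            }
            result = Math.max(result, count);
        }
        return result;
    }
}
```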
