algorithm - Maximize the number of elements contributing to a sum over a sorted array, bounded by a value

Question

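I have a sorted array of integers of size n. The values are not unique. What I need to do is: given a B, find an i<A[n] such that the sum |A[j:1 to n]-i| is less than B and the largest possible number of A[j]s contribute to that particular sum. I have some ideas, but I can't seem to find anything better than the naive n*B and n*n algorithms. Any ideas for O(nlogn) or O(n)? For example, imagine:

A[n] = 1 2 10 10 12 14 and the sum must be < 7 (B = 7). Then the best i is 12, because I get 4 A[j]s contributing to my sum. 10 and 11 are equally good: with i=10 I get 10 - 10 + 10 - 10 + 12 - 10 + 14 - 10 = 6 < 7.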
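To make the problem statement concrete, here is a brute-force reference sketch in Python. It assumes one reading of the question: for each candidate i, the elements closest to i are added greedily while the running sum of |A[j]-i| stays strictly below B. The function name `brute_force` and the candidate range A[0]..A[-1] are assumptions made for illustration, not part of the question.

```python
def brute_force(A, B):
    """Return (best_count, best_i), trying every integer i in [A[0], A[-1]].

    Assumed interpretation: for a fixed i, take the elements nearest to i
    for as long as the total of |A[j] - i| stays strictly below B.
    """
    best_count, best_i = 0, None
    for i in range(A[0], A[-1] + 1):
        total = 0
        count = 0
        # Cheapest contributions first, so the count is maximal for this i.
        for dist in sorted(abs(a - i) for a in A):
            if total + dist >= B:
                break
            total += dist
            count += 1
        if count > best_count:
            best_count, best_i = count, i
    return best_count, best_i


print(brute_force([1, 2, 10, 10, 12, 14], 7))  # prints (4, 10)
```

It reports the first i that reaches the best count; i=11 and i=12 reach the same count of 4, as the question says.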

score 1 · Accepted Answer

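A solution in O(n): start from the end and compute a[n]-a[n-1]: let d=14-12 => d=2 and r=B-d => r=5, then repeat the operation but multiply d by 2: d=12-10 => d=2 and r=r-2*d => r=1. The next step costs 3*0, and after that the algorithm ends with r=1, because the sum must be less than B: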

With the array indexed 0..n-1:

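i=1                        // number of elements taken from the end so far
r=B                        // remaining budget
while(n-i>=1) {
  d=a[n-i]-a[n-i-1];       // gap down to the next smaller element
  if(r-i*d<=0) break;      // taking it would make the sum reach B or more
  r-=i*d;                  // all i taken elements move d further away
  i++;
}
return a[n-i];             // lowest element still within the budget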
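The same scan can be sketched in Python. This is a minimal sketch under stated assumptions, not the answer's own code: it assumes a non-empty ascending array and B > 0, and it reads the answer as choosing i equal to the lowest element taken, so every taken element contributes a[j]-i. The name `suffix_scan` is hypothetical.

```python
def suffix_scan(a, B):
    """Return (i, count): the smallest value i = a[k] such that
    sum(a[j] - i for j >= k) stays strictly below B, and the number
    of elements a[k:] covered. Assumes a is sorted ascending, B > 0."""
    n = len(a)
    count = 1   # a[n-1] alone contributes 0
    r = B       # remaining budget
    while count < n:
        d = a[n - count] - a[n - count - 1]
        cost = count * d      # every element already taken moves d further from i
        if r - cost <= 0:     # the total would reach B or more: stop
            break
        r -= cost
        count += 1
    return a[n - count], count


print(suffix_scan([1, 2, 10, 10, 12, 14], 7))  # prints (10, 4)
```

Each element is visited at most once, so the scan is O(n).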

Maybe a picture explains it better:

14       x
13       x  -> 2
12      xx
11      xx  -> 2*2
10    xxxx    -> 3*0
 9    xxxx   
 8    xxxx
 7    xxxx
 6    xxxx
 5    xxxx
 4   xxxxx
 3   xxxxx
 2  xxxxxx
 1 xxxxxxx

score 0 · Accepted Answer

I think you can do this in O(n) using the following three tricks:

Cumulative sum

Precompute an array C[k] that stores sum(A[0:k]).
This can be done recursively in O(n) time via C[k]=C[k-1]+A[k]. The nice thing about this array is that you can then compute sum(A[a:b]) as C[b]-C[a-1].
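As a small illustration of the cumulative-sum trick, the sketch below builds C with `itertools.accumulate` and answers an inclusive range-sum query in O(1). The helper name `range_sum` is an assumption for this example, and ranges are taken to include both ends.

```python
from itertools import accumulate

A = [1, 2, 10, 10, 12, 14]
C = list(accumulate(A))  # C[k] = A[0] + ... + A[k]


def range_sum(a, b):
    """Sum of A[a..b], both ends inclusive, in O(1)."""
    return C[b] - (C[a - 1] if a > 0 else 0)


print(C)                # prints [1, 3, 13, 23, 35, 49]
print(range_sum(2, 5))  # prints 46 (10 + 10 + 12 + 14)
```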

Best midpoint

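Because your elements are sorted, it is easy to compute the i that minimizes the sum of absolute values. In fact, the best i is always given by the middle entry. If the list has even length, every i between the two central elements gives the same minimal sum of absolute values.

For example, for your list 10, 10, 12, 14 the central elements are 10 and 12, so any i between 10 and 12 minimizes the sum.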
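A quick numeric check of the midpoint claim, using the list from the example. The helper `abs_sum` is a hypothetical name introduced only for this sketch.

```python
def abs_sum(values, i):
    """Sum of |v - i| over all values."""
    return sum(abs(v - i) for v in values)


values = [10, 10, 12, 14]
for i in range(9, 15):
    print(i, abs_sum(values, i))
# prints:
# 9 10
# 10 6
# 11 6
# 12 6
# 13 8
# 14 10
```

The minimum, 6, is reached for every i from 10 to 12, the two central elements and everything between them.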

Iterative search

You can now scan through the elements once to find the best value.

1. Init s=0,e=0
2. if the score for A[s:e] is less than B increase e by 1
3. else increase s by 1
4. if e<n return to step 2

Keep track of the largest value of e-s seen with a score < B; that is your answer.

The loop runs at most 2n times, so it is O(n).

The score for A[s:e] is given by sum |A[s:e]-A[(s+e)/2]|.

Let m=(s+e)/2.

score = sum |A[s:e]-A[(s+e)/2]| 
= sum |A[s:e]-A[m]|
= sum (A[m]-A[s:m]) + sum (A[m+1:e]-A[m])
= (m-s+1)*A[m]-sum(A[s:m]) + sum(A[m+1:e])-(e-m)*A[m]

We can use the precomputed array C[k] to compute the sums in this expression.
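The closed form can be checked against the direct sum with a short sketch. It uses a prefix array with a leading zero (so C[k] is the sum of the first k elements), which is a small variation on the C[k] described above chosen to avoid the s=0 special case. The name `make_score` is an assumption for illustration.

```python
from itertools import accumulate


def make_score(A):
    """Return score(s, e) = sum of |A[j] - A[m]| for j in s..e, m = (s+e)//2."""
    C = [0] + list(accumulate(A))  # C[k] = sum of the first k elements

    def score(s, e):
        m = (s + e) // 2
        left = (m - s + 1) * A[m] - (C[m + 1] - C[s])        # sum (A[m] - A[s..m])
        right = (C[e + 1] - C[m + 1]) - (e - m) * A[m]       # sum (A[m+1..e] - A[m])
        return left + right

    return score


A = [1, 2, 10, 10, 12, 14]
score = make_score(A)

# Compare the O(1) closed form with the direct O(n) sum on every window.
for s in range(len(A)):
    for e in range(s, len(A)):
        m = (s + e) // 2
        assert score(s, e) == sum(abs(A[j] - A[m]) for j in range(s, e + 1))

print(score(2, 5))  # prints 6
```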

Edit

If the end point must always be n, then you can use this alternative algorithm:

1. Init s=0,e=n
2. while the score for A[s:e] is greater than B, increase s by 1
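The two steps above can be sketched as follows. This is a sketch under assumptions: indices are 0-based with both ends inclusive (so the fixed end point is the last index, n-1), and the loop stops as soon as the score is strictly below B, to match the "less than B" requirement in the question. The name `longest_suffix` is hypothetical.

```python
from itertools import accumulate


def longest_suffix(A, B):
    """Length of the longest suffix A[s:] whose median-centred score is < B."""
    n = len(A)
    C = [0] + list(accumulate(A))  # C[k] = sum of the first k elements

    def score(s, e):
        # sum of |A[j] - A[m]| over s..e, with m the middle index
        m = (s + e) // 2
        left = (m - s + 1) * A[m] - (C[m + 1] - C[s])
        right = (C[e + 1] - C[m + 1]) - (e - m) * A[m]
        return left + right

    s, e = 0, n - 1
    # Dropping the smallest element never increases the score,
    # so the first s that fits gives the longest suffix.
    while s <= e and score(s, e) >= B:
        s += 1
    return n - s


print(longest_suffix([1, 2, 10, 10, 12, 14], 7))  # prints 4
```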

Python code

Here is a Python implementation of this algorithm:

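def fast(A,B):
    # C[k] holds the cumulative sum A[0]+...+A[k]
    C=[]
    t=0
    for a in A:
        t+=a
        C.append(t)

    def fastsum(s,e):
        # sum of A[s..e] inclusive, via the cumulative sums
        if s==0:
            return C[e]
        else:
            return C[e]-C[s-1]

    def fastscore(s,e):
        # sum of |A[j]-A[m]| for j in s..e, with m the middle index
        m=(s+e)//2
        return (m-s+1)*A[m]-fastsum(s,m)+fastsum(m+1,e)-(e-m)*A[m]

    s=0
    e=0
    best=-1
    while e<len(A):
        if fastscore(s,e)<B:
            # window fits within the bound: record it and grow to the right
            best=max(best,e-s+1)
            e+=1
        elif s==e:
            # a single element does not fit (only when B<=0): move on
            e+=1
        else:
            # window too expensive: shrink from the left
            s+=1
    return best

print(fast([1,2,10,10,12,14],7))
# this prints 4, as the 4 elements 10,10,12,14 can be chosen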

score 0 · Accepted Answer

Try this O(N) approach, with N the size of the array:

minpos = position of closest value to B in array (binary search, O(log(N)))
min = array[minpos]

if (min >= B) EXIT, no solution

// now, we just add the smallest elements from the left or the right
// until we are greater than B

leftindex = minpos - 1
rightindex = minpos + 1

while we have a valid leftindex or valid rightindex:
    add = min(abs(array[leftindex (if valid)]-B), abs(array[rightindex (if valid)]-B))
    if (min + add >= B)
        break
    min += add
    decrease leftindex or increase rightindex according to the usage

min is now our sum, rightindex the requested i (leftindex the start)

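(Some of the indices may be off; this is just the idea, not an implementation.)

I would guess that the average case for small B is O(log(N)). The linear case only occurs when we can use the whole array.

I'm not sure, but maybe this can also be done in O(log(N)*k), with N the size of the array and k < N. We would have to use binary search cleverly to find leftindex and rightindex in each iteration, so that the range of possible results shrinks with every iteration. That should be easy to do, but we have to watch out for duplicates, because they could break our binary-search reduction.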
