algorithm - 找到非重叠序列的最大覆盖范围的算法。（即，加权间隔调度问题。）

Question

与链接问题的唯一区别是，我不需要找到代表最长序列的非重叠元组集，而是需要找到代表最大覆盖率的非重叠元组集，我的意思是总和元组长度最大（元组长度在下一句中last - first + 1给出元组的定义）。

我表示我的元组与链接问题不同。而不是(starting index, length)，我将我的元组表示为(first, last); 例如，元组(3,7)表示数字的集合[3, 4, 5, 6, 7]。（即使端点匹配，一个元组也会与另一个元组重叠；即重叠(2,6)，因此不能同时出现在解决方案中。）(6,8)

例如，考虑以下一组元组，按排序first：

[(0,100), (2,50), (30,150), (60,95), (110,190), (120,150), (191,200)]

此处的最大值将[(0,100), (110,190), (191,200)]覆盖101 + 81 + 10 = 192. （请注意，此解决方案中的元组是不重叠的。）

解决此问题的最简单算法的示例是什么，该算法的复杂性是多少？（如果这个问题可以解决就好了O(N)，但我目前不知道是否可以。）

附录：回想起来，事实证明我在这里提出的问题等同于加权间隔调度问题。这是间隔调度问题的一个特例。

@j_random_hacker 下面的答案实际上是加权间隔调度问题的已知解决方案，其时间复杂度为O(N log(N)).

score 4 · Accepted Answer

这是一个O(nlog n)-时间，O(n)-空间算法。首先，如果元组数组尚未按此顺序排列，则按它们的起始位置对它们进行排序。我将假设从零开始的数组索引。

让我们称元组的开始位置为 ib(i) 和结束位置为 e(i)，因此它的总长度为 e(i) - b(i) + 1。另外让我们定义一个函数 next(i)，它返回可以出现在元组 i 右侧的第一个元组的元组列表中的位置。请注意，next(i) 可以用二分查找在 O(log n) 时间内计算出来：只需将所有元组起始位置 b(i) 保留在数组 b[] 中，然后在子数组 b[ 中搜索第一个 j i+1 .. n-1] 具有 b[j] > e(i)。

让我们将 f(i) 定义为在元组 i处或之后开始的任何非重叠元组集合的最大覆盖率。由于元组 i 本身是否在这个最优集合中，我们有：

f(i) = max(e(i) - b(i) + 1 + f(next(i)), f(i+1)) for 0 <= i < n

我们也有边界条件f(n) = 0。

显然，最大可能的覆盖范围由 f(0) 给出。这很容易计算。在伪 C++ 中：

int b[] = /* Tuple beginning positions, in nondecreasing order */;
int e[] = /* Tuple end positions */;
int n = /* Number of tuples */;

// Find the array position of the leftmost tuple that begins to the right of
// where tuple i ends.
int next(int i) {
    return upper_bound(b + i + 1, b + n, e[i]);
}

int maxCov[n + 1];    // In practice you should dynamically allocate this

// After running this, maxCov[i] will contain the maximum coverage of any
// nonoverlapping subset of the set of n - i tuples whose beginning positions
// are given by b[i .. n-1] and whose ending points are given by e[i .. n-1].
// In particular, maxCov[0] will be the maximum coverage of the entire set.
void calc() {
    maxCov[n] = 0;
    for (int i = n - 1; i >= 0; --i) {
        maxCov[i] = max(e[i] - b[i] + 1 + maxCov[next(i)], maxCov[i + 1]);
    }
}

循环calc()运行 n 次，每次迭代对二分搜索函数进行一次 O(log n) 调用upper_bound()。

我们可以通过计算 f(0) 的 max() 的两个输入来重构这个大小的实际集合，查看哪个实际产生了最大值，记录它是否意味着元组 0 的存在，然后递归处理余数（对应于 f(next(0)) 或 f(1)）。（如果两个输入相等，则有多个最优解，我们可以遵循其中一个。）

score 2 · Accepted Answer

下面的算法通过递归检索每个元素是最左边成员的最大非重叠集合，然后返回覆盖范围最大的集合来工作。请参阅代码中的注释。

在 PHP 中实现。你可以在这里测试它http://viper-7.com/xowTRF

我认为这个算法的复杂性是缓存O(2^N)或O(N^2)缓存，如果您不同意，请随时发表评论。

$set = [[0,100], [2,50], [30,150], [60,95], [110,190], [120,150], [191,200]];
$GLOBALS['cache'] = array(); //cache for overlapping sub problems

function maximumSet($set) {

    if(count($set) === 1) {
        return $set;
    }

    $cache_key = [];

    foreach($set as $k) {
        $cache_key[] = implode($k,'.');
    }

    $cache_key = implode('_',$cache_key);

    if(isset($GLOBALS['cache'][$cache_key])) {
        return $GLOBALS['cache'][$cache_key];
    }

    $maximumResult = null;

    //for each element in the set,
    //get the largest non-overlapping set that the element is a member of
    //once all N sets have been computed, return the largest one
    foreach($set as $k => $element) {

        unset($set[$k]);

        //create a new set $copy, which contains the remaining elements that
        //do not overlap with $element            
        $copy = $set;

        foreach($set as $k2 => $element2) {
            if($element2[0] <= $element[1]) { 
                //element is considered non overlapping if its start 
                //is less than or equal to current element's end
                unset($copy[$k2]);
            }
            else break; //because the list is sorted we can break the 1st time
            //see a non-overlapping element
        }

        if(!empty($copy)) {
            //if there is at least one non-overlapping element
            //recursively get the maximum set
            $maximumSubset = maximumSet($copy);
            //prepend the current element to it
            array_unshift($maximumSubset,$element);
        }
        else {
            //otherwise the maximum non-overlapping set which contains this element
            //is the element itself                
            $maximumSubset = [$element];
        }

        //store the set in the results by coverage
        $coverage = getCoverage($maximumSubset);
        if(is_null($maximumResult) || $maximumResult['coverage'] < $coverage) {
            $maximumResult = [
                'coverage' => $coverage,
                'set' => $maximumSubset
            ];
        }
    }

    $GLOBALS['cache'][$cache_key] = $maximumResult['set'];
    return $maximumResult['set'];
}

function getCoverage($set) {
    $range = 0;
    foreach($set as $v) {
        $range += ($v[1] - $v[0]);
    }
    return $range;
}

$x = maximumSet($set);
print "<pre>";
print_r($x);
print "</pre>";

algorithm - 找到非重叠序列的最大覆盖范围的算法。（即，加权间隔调度问题。）

2 回答 2

Related

Reference