arrays - Shortest length subset

Question

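I have an array and a number N.

The array can be filled with the numbers 0, 1, 2, 3, ..., N.

For example, arr={1,0,2,3,1,0,2,4,3,1,0,2,4,3,0,0,0} // given N=4

I have to find the minimum-length subarray that contains all the numbers 1, 2, ..., N.

For example, the answer for the array above should be {1,0,2,3,1,0, 2,4,3,1 ,0,2,4,3,0,0,0} // length=4, with indices start=6, end=9 (0-based)

Another candidate for the problem above is {1,0,2, 3,1,0,2,4 ,3,1,0,2,4,3,0,0,0}, but it is rejected because its length is 5. If there is more than one subarray of the shortest length, the answer should be the first occurrence. If the array is missing one or more of the numbers 1, 2, ..., N, the answer is "no subarray found".

Here is my Python code. It gives wrong answers in some cases (I don't know which ones). I would appreciate it if someone could tell me what I am doing wrong.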

shortlen = 2000001  # initialise to "infinity"
shortstart = 0
matchln = len(match)  # match is the array containing the integers
i = 0

while i < matchln:
    if match[i] > 0:
        pos = [0] * n  # status flags for the integers found so far
        j = i
        start = i
        sums = 0
        while j < matchln and sums != n:
            if match[j] > 0:
                if pos[match[j] - 1] == 0:  # only update the status if the integer was not marked previously
                    pos[match[j] - 1] = 1  # (match[j] - 1) because array indexing starts at 0
                    sums += 1
            j += 1

        leng = j - i

        if j == matchln and sums != n:  # the loop ended without marking all integers, so we shouldn't proceed
            break

        if leng < shortlen:  # if the computed length is smaller than the existing one, update it
            shortlen = leng
            shortstart = start

    i += 1

score 1 · Accepted Answer

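One possibility is to keep track of the shortest length for each starting position. You can do this with two passes over the array:

Suppose that for the indices 1..k you maintain, for each position, the set of numbers (within 1..N) found from that position onward (the set differs per position). When you advance to position k+1, you need to update all sets (*) with the number at position k+1 (provided that number is within 1..N). As soon as a set contains N elements, you have found the shortest sequence for that starting position; record the length for that position.

(*) Note that you no longer need to iterate over positions whose set is complete. Moreover, once the set for a position is full, the sets for all earlier positions must be full as well, so you can keep a sliding "start position" from which to check the sets.

You can now make one more pass to pick the shortest recorded sequence over all positions (the end position follows from the start and the sequence length).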

status = new array[arr.length] of Status // for score keeping
// initialize each Status with: set <- empty, length <- arr.length+1 (sentinel: nothing found)
startPos = 1 // sliding start position
// first pass
for i = 1..arr.length
  if arr[i] > 0 // within 1..N
    for j = startPos..i
      status[j].set.add(arr[i])
      if status[j].set.size == N // we have all numbers
        status[j].length = i-j+1
        startPos = j+1

min = arr.length+1 // for the shortest length
startPos = 1
// second pass
for i = 1..status.length
  if status[i].length < min
    min = status[i].length
    startPos = i

if min < arr.length+1
  // found a winner
  print("start: " + startPos + ", end: " + (startPos + min - 1))

Note: indices in the code above start at 1 (not 0).
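The two-pass idea above can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the function name `shortest_by_start` is made up, indices are 0-based (unlike the pseudocode), and `None` stands for "no subarray found". In the worst case it takes quadratic time and keeps one set per position.

```python
def shortest_by_start(arr, n):
    """Hypothetical helper: (start, end) of the first shortest subarray holding 1..n, or None."""
    sets = [set() for _ in arr]    # numbers in 1..n seen from each start position onward
    lengths = [None] * len(arr)    # shortest complete length per start position, if any
    start_pos = 0                  # first start position whose set is not yet full

    # First pass: feed each in-range value to every start position that is still open.
    for i, value in enumerate(arr):
        if 1 <= value <= n:
            for j in range(start_pos, i + 1):
                sets[j].add(value)
                if len(sets[j]) == n:      # start j is now complete
                    lengths[j] = i - j + 1
                    start_pos = j + 1      # earlier starts never need revisiting

    # Second pass: pick the shortest recorded length; strict "<" keeps the first one on ties.
    best_start, best_len = None, None
    for start, length in enumerate(lengths):
        if length is not None and (best_len is None or length < best_len):
            best_start, best_len = start, length

    if best_start is None:
        return None
    return best_start, best_start + best_len - 1


arr = [1, 0, 2, 3, 1, 0, 2, 4, 3, 1, 0, 2, 4, 3, 0, 0, 0]
print(shortest_by_start(arr, 4))
```

For the array from the question this should report the window from index 6 to index 9.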

score 0 · Accepted Answer

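If an extra hash table is allowed, this can be done in a single pass.

Basically, you maintain two pointers into the array, left and right, both initially pointing at the first element.

In each round, we first advance right. After the first move, we also advance left for as long as right points at the same value as left. We skip 0s, of course.

In each round, we maintain the hash table to see which of the values 1 to N lie within the interval [left, right]; if all of them do, we take the interval length. We keep track of the minimum interval length throughout the process.

The time complexity is O(Nn)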
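As a rough illustration of the two-pointer idea, here is a minimal Python sketch. It is a common variant rather than this answer's exact procedure: instead of rechecking all N values each round, it keeps a per-value count plus a running count of distinct values, which makes the pass linear in the array length. The name `shortest_window` and the `None` return value are assumptions made for the example.

```python
from collections import defaultdict


def shortest_window(arr, n):
    """Hypothetical helper: (start, end) of the first shortest subarray holding 1..n, or None."""
    counts = defaultdict(int)  # value -> occurrences inside arr[left..right]
    distinct = 0               # how many of 1..n are currently inside the window
    best = None
    left = 0

    for right, value in enumerate(arr):
        if 1 <= value <= n:
            counts[value] += 1
            if counts[value] == 1:
                distinct += 1

        # While the window is complete, record it and shrink it from the left.
        while distinct == n:
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)   # strict "<" keeps the first window on ties
            dropped = arr[left]
            if 1 <= dropped <= n:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    distinct -= 1
            left += 1

    return best


arr = [1, 0, 2, 3, 1, 0, 2, 4, 3, 1, 0, 2, 4, 3, 0, 0, 0]
print(shortest_window(arr, 4))
```

Each element enters and leaves the window at most once, so the work is proportional to the array length plus the hash table operations.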

score 0 · Accepted Answer

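I think this may help you. The goal in your problem is to convert your values into a linearly independent sequence. I wrote a small piece of code to solve your problem; it finds the start of the sequence you want: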

#include <stdio.h>
int main(void){
/*By Volnei Klehm,
 Manaus-AM, Brazil
    2012
*/
long long arr[]={1,0,2,3,1,0,2,4,3,1,0,2,4,3,0,0,0};
long long power2arr[17]; /*same size as or larger than arr*/
long long powerSum=0;
long long partialSum=0;
int count,count1,N;

int size_arr;
size_arr=sizeof(arr)/sizeof(long long);

/*the goal here is to find a way to represent your values as linearly independent ones;
      here a sequence of powers of 2 is used; you can also use other schemes whose values do not grow so dramatically.
      I use powers of 2 because they are easily handled by computers; you can also use a sequence of sines,
      cosines or any other mathematical construction that produces independent values.
      For more information I suggest you take a look at linear algebra and/or digital signal processing books*/


/*Now it computes an independent basis*/

for(count=0 ; count<size_arr ; ++count){
    power2arr[count] = 1LL << arr[count]; /*calculates 2^arr, generating a set of independent numbers; 1LL keeps the shift in 64 bits.*/

}

N=4; /*put here your N, for n=4 it will look for*/
/*Notice that depending on your system, N cannot be too large;
      at a certain point N makes 2^N too large to be handled
      by standard C/C++ types. With a signed 64-bit long long, N up to 62 is safe here.*/

/*now it gets the sum 2^0 + 2^1 + 2^2 ... + 2^N*/

++N; /* in C positions start at 0*/

for(count = 0 ; count < N ; count++)
    powerSum |= 1LL << count;

/*every window of N consecutive elements is compared against powerSum*/
for(count = 0 ; count<size_arr ; ++count){
    partialSum=0;
    for(count1 = count ; count1 < (count + N) ; count1++){
        if((count + N) > size_arr){
            printf("No occurrences found!\n");
            return 0;
        }
        partialSum |= power2arr[count1];
    }
    if(partialSum==powerSum){
        printf("Occurrence found at: %d\n", count);
        return 0;
    }
}

return 0;
}
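The bitmask representation can also be combined with a window of variable length. The Python sketch below is a variant written for illustration, not a translation of the C program above: it maps each value v in 1..N to bit v-1, ignores 0s, and grows a window from every start position until the mask is complete. The name `shortest_by_bitmask` is hypothetical, N is assumed to be at least 1, and because Python integers have arbitrary precision, N is not limited by a 64-bit type here.

```python
def shortest_by_bitmask(arr, n):
    """Hypothetical helper: (start, end) of the first shortest subarray holding 1..n, or None."""
    target = (1 << n) - 1   # bits 0..n-1 stand for the values 1..n (assumes n >= 1)
    best = None

    for start in range(len(arr)):
        mask = 0
        for end in range(start, len(arr)):
            value = arr[end]
            if 1 <= value <= n:
                mask |= 1 << (value - 1)   # mark this value as present
            if mask == target:
                # Shortest complete window for this start; strict "<" keeps the first on ties.
                if best is None or end - start < best[1] - best[0]:
                    best = (start, end)
                break

    return best


arr = [1, 0, 2, 3, 1, 0, 2, 4, 3, 1, 0, 2, 4, 3, 0, 0, 0]
print(shortest_by_bitmask(arr, 4))
```

This brute-force form is quadratic in the array length; the bitmask only replaces the per-value bookkeeping with a single integer comparison.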
