matlab - Finding islands of zeros in a sequence

Question

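Imagine you have a very long sequence. What is the most efficient way to find the intervals where the sequence is all zeros (or, more precisely, where the sequence drops to near-zero values, abs(X)<eps)?

For simplicity, let's assume the following sequence: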

```
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
```

I am trying to get the following information:

startIndex   EndIndex    Duration
3            6           4
12           12          1
14           16          3
25           26          2
30           30          1

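Then, using this information, we find the intervals whose duration is >= some specified value (say 3) and return the indices of the values in all these intervals combined: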

```
indices = [3 4 5 6 14 15 16];
```

That last part is related to a previous question:

MATLAB: vectorized array creation from a list of start/end indices

This is what I have so far:

```
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
len = length(sig);
thresh = 3;

%# align the signal with itself successively shifted by one
%# v will thus contain 1 in the starting locations of the zero interval
v = true(1,len-thresh+1);
for i=1:thresh
    v = v & ( sig(i:len-thresh+i) == 0 );
end

%# extend the 1's till the end of the intervals
for i=1:thresh-1
    v(find(v)+1) = true;
end

%# get the final indices
v = find(v);
```
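I'm looking to vectorize/optimize the code, but I'm open to other solutions. I have to stress that space and time efficiency are very important, since I'm processing a large number of long bio-signals.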

score 33 · Accepted Answer

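These are the steps I would take to solve your problem in a vectorized way, starting with a given vector sig:

First, threshold the vector to get a vector tsig of zeros and ones (zeros where the absolute value of the signal drops close enough to zero, ones elsewhere):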
```
tsig = (abs(sig) >= eps);  %# Using eps as the threshold
```

Next, find the starting index, ending index, and duration of each string of zeros using the functions DIFF and FIND:

```
dsig = diff([1 tsig 1]);
startIndex = find(dsig < 0);
endIndex = find(dsig > 0)-1;
duration = endIndex-startIndex+1;
```
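
As a quick check, here is a small sketch that applies the two steps so far to the sample sequence from the question and collects the results in one table-like matrix. The variable name `runs` is introduced here only for illustration.

```matlab
% Sample sequence from the question
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];

tsig = (abs(sig) >= eps);   % 1 where the signal is nonzero, 0 where it is (near) zero
dsig = diff([1 tsig 1]);    % -1 at the first sample of a zero run, +1 just past its last sample

startIndex = find(dsig < 0);
endIndex   = find(dsig > 0) - 1;
duration   = endIndex - startIndex + 1;

% One row per zero run: [startIndex endIndex duration]
runs = [startIndex(:) endIndex(:) duration(:)];
disp(runs)
```

Padding with ones on both sides is what lets a run that touches either end of the vector (such as the single zero at index 30) produce both a start and an end marker.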

Then, find the strings of zeros with a duration greater than or equal to some value (such as 3, from your example):

```
stringIndex = (duration >= 3);
startIndex = startIndex(stringIndex);
endIndex = endIndex(stringIndex);
```

Finally, use the method from my answer to the linked question to generate the final set of indices:

```
indices = zeros(1,max(endIndex)+1);
indices(startIndex) = 1;
indices(endIndex+1) = indices(endIndex+1)-1;
indices = find(cumsum(indices));
```
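
The four steps can also be bundled into one function. The following is only a sketch: the function name `zeroIslandIndices`, its argument names, and the explicit empty-result branch are assumptions added for illustration and are not part of the answer.

```matlab
function indices = zeroIslandIndices(sig, tol, minLen)
% zeroIslandIndices  Hypothetical helper (name and signature are illustrative).
%   Returns, as a row vector, the indices of all samples that belong to
%   runs of at least minLen consecutive samples with abs(sig) < tol.

    tsig = (abs(sig(:).') >= tol);      % force a row vector; 0 marks near-zero samples
    dsig = diff([1 tsig 1]);            % -1 at run starts, +1 just past run ends
    startIndex = find(dsig < 0);
    endIndex   = find(dsig > 0) - 1;

    keep = (endIndex - startIndex + 1) >= minLen;   % keep only long enough runs
    startIndex = startIndex(keep);
    endIndex   = endIndex(keep);

    if isempty(startIndex)              % assumption: return an empty row if nothing qualifies
        indices = zeros(1, 0);
        return
    end

    % CUMSUM/FIND expansion of the [start, end] pairs into a flat index list
    marks = zeros(1, numel(tsig) + 1);
    marks(startIndex) = 1;
    marks(endIndex + 1) = marks(endIndex + 1) - 1;
    indices = find(cumsum(marks));
end
```

For the sample sequence, `zeroIslandIndices(sig, eps, 3)` should give the indices listed in the question.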

score 10 · Accepted Answer

You can solve this as a string search task by finding strings of zeros of length thresh (the STRFIND function is very fast):

```
startIndex = strfind(sig, zeros(1,thresh));
```

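Note that longer substrings will be marked at multiple locations, but they will eventually be joined once we add the in-between locations, from each interval's start at startIndex to its end at startIndex+thresh-1: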

```
indices = unique( bsxfun(@plus, startIndex', 0:thresh-1) )';
```

Note that you can always swap this last step for the CUMSUM/FIND solution by @gnovice from the linked question.
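
The search above matches exact zeros only. Here is a sketch of the same idea for near-zero values, where the signal is first turned into a 0/1 mask and STRFIND then looks for runs of ones. The tolerance `tol` and the names `isZero` and `offsets` are assumptions added for illustration; the use of STRFIND on a numeric row vector follows the answer.

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
thresh = 3;
tol = eps;                            % assumed tolerance for "near zero"

isZero = double(abs(sig) < tol);      % row vector, 1 marks a near-zero sample

% Every position where thresh consecutive near-zero samples begin.
% A run of length 4 is reported twice (here at 3 and at 4).
startIndex = strfind(isZero, ones(1, thresh));

% Expand each hit to startIndex:startIndex+thresh-1 and merge the overlaps
offsets = 0:thresh-1;
indices = unique(bsxfun(@plus, startIndex', offsets))';
```

Thresholding first keeps the pattern search exact, since the mask contains only 0 and 1.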

score 1 · Accepted Answer

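```
function indice=sigvec(sig,thresh)
    % pad sig with a nonzero value at both ends so that zero runs
    % touching the head or tail are bounded
    exsig=[1,sig,1];
    % moving sum over thresh samples: convolve the padded signal with a box of ones
    cvexsig=conv(exsig,ones(1,thresh));
    % 1 wherever a full window of thresh samples sums to zero
    tempsig=double(cvexsig==0);

    % spread each hit back over its window and undo the index offset
    indice=find(conv(tempsig,ones(1,thresh)))-thresh;
end
```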
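
The moving-sum test above relies on the samples not cancelling each other, so it fits signals that are non-negative and contain exact zeros. The sketch below runs the same two convolutions on a thresholded 0/1 mask instead, which avoids that assumption. The tolerance `tol` and the variable names are introduced here for illustration only.

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
thresh = 3;
tol = eps;                                  % assumed tolerance for "near zero"

nz = double(abs(sig) >= tol);               % 1 where nonzero, so window sums cannot cancel
exsig = [1, nz, 1];                         % pad so runs at either edge are bounded

winSum  = conv(exsig, ones(1, thresh));     % winSum(k) = sum of exsig(k-thresh+1:k)
fullWin = double(winSum == 0);              % 1 where an entire window is zero

% The second convolution marks thresh positions per hit; subtracting thresh
% maps them back to indices into sig.
indices = find(conv(fullWin, ones(1, thresh))) - thresh;
```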

score 1 · Accepted Answer

The answer above by gnovice can be modified to find the indices of the non-zero elements in a vector:

```
tsig = (abs(sig) >= eps);
dsig = diff([0 tsig 0]);
startIndex = find(dsig > 0);
endIndex = find(dsig < 0)-1;
duration = endIndex-startIndex+1;
```
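
For illustration, here is that variant applied to the sample sequence from the question. This is only a sketch; nothing beyond the answer's own variable names is assumed.

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];

tsig = (abs(sig) >= eps);    % 1 where the signal is nonzero
dsig = diff([0 tsig 0]);     % zero padding: +1 at the start of a nonzero run, -1 just past its end

startIndex = find(dsig > 0);
endIndex   = find(dsig < 0) - 1;
duration   = endIndex - startIndex + 1;
```

The only changes from the zero-run version are the padding value (0 instead of 1) and the swapped comparison signs.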

score 0 · Accepted Answer

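I think the most MATLAB/"vectorized" way of doing it is to compute the convolution of your signal with a filter like [-1 1]. You should look at the documentation of the function conv. Then use find on the output of conv to get the relevant indices.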
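
One way this suggestion could be fleshed out is sketched below. Thresholding first, padding with ones, and the 'valid' shape option are assumptions added here, since the answer gives no code. With this setup the convolution equals the negated first difference of the padded mask.

```matlab
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];

tsig = double(abs(sig) >= eps);      % 1 where nonzero, 0 where (near) zero
padded = [1 tsig 1];                 % pad so runs at either edge are bounded

% With the kernel [-1 1] and shape 'valid':
%   edges(k) = padded(k) - padded(k+1), that is, -diff(padded)
edges = conv(padded, [-1 1], 'valid');

startIndex = find(edges > 0);        % transition 1 -> 0: a zero run starts here
endIndex   = find(edges < 0) - 1;    % transition 0 -> 1: the run ended one sample earlier
duration   = endIndex - startIndex + 1;
```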

score 0 · Accepted Answer

As gnovice showed, we'll do a threshold test to make "near zero" really zero:

```
logcl = abs(sig(:)) >= zero_tolerance;
```

Then find the regions where the cumulative sum isn't increasing:

```
cs = cumsum([0; logcl]);  %# leading 0 so that islands(k) marks a run of thresh zeros starting at index k
islands = cs(1+thresh:end) == cs(1:end-thresh);
```

Recall gnovice's neat way of filling in index ranges:

```
v = zeros(1,max(endInd)+1);   %# An array of zeroes
v(startInd) = 1;              %# Place 1 at the starts of the intervals
v(endInd+1) = v(endInd+1)-1;  %# Add -1 one index after the ends of the intervals
indices = find(cumsum(v));  %# Perform a cumulative sum and find the nonzero entries
```

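Note that our islands vector already has ones at the startInd locations, and for our purposes the endInd+1 position always comes thresh spots later (longer runs show up as runs of ones in islands):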

```
endcap = zeros(thresh,1);
indices = find(cumsum([islands ; endcap] - [endcap ; islands]))
```

Test

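```
sig = [1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0];
thresh = 3;                   %# minimum run length
logcl = abs(sig(:)) >= .1;
cs = cumsum([0; logcl]);      %# leading 0 so that islands(k) marks a run starting at index k
islands = cs(1+thresh:end) == cs(1:end-thresh);
endcap = zeros(thresh,1);
indices = find(cumsum([islands ; endcap] - [endcap ; islands]))  %# column vector [3 4 5 6 14 15 16]'
```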
