matlab - Number of values greater than a threshold

Question

I have a matrix A. Now I want to find the number of elements greater than 5 and their corresponding indices. How can I solve this in matlab without using a for loop?

For example, if A = [1 4 6 8 9 5 6 8 9]':

Number of elements > 5: 6
Indices: [3 4 5 7 8 9]

score 14 · Accepted Answer

You use find:

index = find(A>5);
numberOfElements = length(index);
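
As a quick check, here is a minimal sketch that applies the same approach to the vector from the question. The matrix M and the names row and col are illustrative assumptions, not part of the original answer; they only show that find can also return subscripts when A is a true 2-D matrix.

```matlab
% Example vector from the question (a column vector)
A = [1 4 6 8 9 5 6 8 9]';

% Linear indices of the elements greater than 5
index = find(A > 5);

% numel counts the indices; for a vector this equals length(index)
numberOfElements = numel(index);

disp(index')             % 3 4 5 7 8 9
disp(numberOfElements)   % 6

% Hypothetical 2-D case: find can also return row and column subscripts
M = [1 7; 9 2];
[row, col] = find(M > 5);   % row = [2; 1], col = [1; 2]
```

For a vector, length and numel give the same count; numel is simply the less surprising choice if the index array ever ends up empty or multi-dimensional.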

answered 2012-10-05T00:29:26.930

score 4 · Accepted Answer

You can use sum, which lets you get the number of elements with a single command:

numberOfElements = sum(A>5);

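Do you really need explicit indices? The logical matrix A>5 can also be used as an index (usually a bit more efficient than indexing with find):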

index = (A>5);
numberOfElements = sum(index);

For completeness: indexing with logicals works the same way as indexing with regular indices:

>> A(A>5)
ans = 
     6  8  9  6  8  9
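
One caveat, assuming A may be a genuine 2-D matrix rather than the vector from the example: sum operates along the first non-singleton dimension, so on a matrix it returns per-column counts rather than a single total. The sketch below uses a made-up matrix M (an assumption for illustration) to show two ways of getting the overall count.

```matlab
% Hypothetical 2-D example (not from the question)
M = [1 7 6; 9 2 8];

mask = M > 5;               % logical matrix, same size as M

perColumn = sum(mask);      % sums down each column: [1 1 2]
total1 = sum(mask(:));      % flatten first, then sum: 4
total2 = nnz(mask);         % number of nonzero (true) entries: 4

values = M(mask);           % selected values in column-major order: [9; 7; 6; 8]
```

For the column vector in the question, sum(A>5) and nnz(A>5) both give the same result, so this only matters once A has more than one column.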

score 2 · Accepted Answer

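Motivated by the above discussion with Rody, here is a simple benchmark that tests the speed of integer vs. logical array indexing in MATLAB. Quite an important thing, I would say, since 'vectorized' MATLAB is mostly about indexing. So: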

% random data
a = rand(10^7, 1);

% threshold - controls how much data meets the a>threshold criterion.
% This determines the total indexing time - the more data we extract from a,
% the longer it takes.
% In this example, a small threshold means most of the data in a
% will meet the criterion.
threshold = 0.08;

% prepare logical and integer indices (note the uint32 cast)
index_logical = a>threshold;
index_integer = uint32(find(index_logical));

% logical indexing of a
tic
for i=1:10
    b = a(index_logical);
end
toc

% integer indexing of a
tic
for i=1:10
    b = a(index_integer);
end
toc

On my computer the results are

Elapsed time is 0.755399 seconds.
Elapsed time is 0.728462 seconds.

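This means that the two methods perform almost the same, which is how I chose the threshold for this example. That is interesting, because the index_integer array is almost 4 times larger!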

index_integer       9198678x1              36794712  uint32               
index_logical      10000000x1              10000000  logical
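
The sizes above follow from the element sizes: a logical mask costs 1 byte per element of a, while a uint32 index costs 4 bytes per selected element. A rough sketch of how to reproduce the comparison, assuming a smaller array than in the benchmark so it runs quickly (the names info_logical, info_integer and ratio are illustrative):

```matlab
% Smaller array than in the benchmark; same threshold
a = rand(1e6, 1);
threshold = 0.08;

index_logical = a > threshold;
index_integer = uint32(find(index_logical));

% whos returns a struct with a bytes field for the named variable
info_logical = whos('index_logical');
info_integer = whos('index_integer');

% Approximately 4 * (fraction of elements above the threshold)
ratio = info_integer.bytes / info_logical.bytes;
fprintf('integer index is %.2f times larger than the logical mask\n', ratio);
```

With most elements selected the ratio approaches 4; with a threshold near 0.75 the two index arrays would take roughly the same amount of memory.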

For larger values of the threshold, integer indexing is faster. Results for threshold=0.5:

Elapsed time is 0.687044 seconds. (logical)
Elapsed time is 0.296044 seconds. (integer)

Unless I am doing something wrong here, integer indexing seems to be the fastest most of the time.

Including the creation of the indices in the test yields very different results, however:

a = rand(1e7, 1);    
threshold = 0.5;

% logical 
tic
for i=1:10
    inds = a>threshold;
    b = a(inds);
end
toc

% double
tic
for i=1:10
    inds = find(a>threshold);
    b = a(inds);
end
toc

% integer 
tic
for i=1:10
    inds = uint32(find(a>threshold));
    b = a(inds);
end
toc

Results (Rody):

Elapsed time is 1.945478 seconds. (logical)
Elapsed time is 3.233831 seconds. (double)
Elapsed time is 3.508009 seconds. (integer)

Results (angainor):

Elapsed time is 1.440018 seconds. (logical)
Elapsed time is 1.851225 seconds. (double)
Elapsed time is 1.726806 seconds. (integer)

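So it would seem that the actual indexing is faster when indexing with integers, but front-to-back (including creation of the indices), logical indexing performs much better.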
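
Hand-rolled tic/toc loops can be sensitive to warm-up and background load. As a hedged alternative, the front-to-back comparison could be sketched with timeit, assuming a MATLAB release that provides it. The array here is smaller than in the tests above so the sketch runs quickly, and the handle names f_logical, f_double and f_integer are illustrative; absolute timings will differ by machine and version.

```matlab
% Smaller array than in the benchmark above so the sketch runs quickly
a = rand(1e6, 1);
threshold = 0.5;

% Each handle includes index creation, as in the front-to-back test
f_logical = @() a(a > threshold);
f_double  = @() a(find(a > threshold));
f_integer = @() a(uint32(find(a > threshold)));

% timeit calls each handle several times and returns a typical run time
t_logical = timeit(f_logical);
t_double  = timeit(f_double);
t_integer = timeit(f_integer);

fprintf('logical: %.4f s\ndouble:  %.4f s\ninteger: %.4f s\n', ...
        t_logical, t_double, t_integer);

% All three approaches must select the same elements
assert(isequal(f_logical(), f_double(), f_integer()));
```

The final assert is only a sanity check that the three variants are interchangeable, so any timing difference comes from how the index is built and consumed.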

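The runtime difference between the last two methods is unexpected, though. It seems Matlab's internals either do not cast the doubles to integers, or perform error checking on each element before doing the actual indexing. Otherwise, we would have seen virtually no difference between the double and integer methods.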

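Edit: There are two options as I see it:

matlab converts double indices to uint32 indices explicitly before the indexing call (much like we do in the integer test)
matlab passes doubles and performs the double->int cast on the fly during the indexing call

The second option should be faster, because we only have to read the double indices once. In our explicit conversion test we have to read the double indices, write the integer indices, and then read the integer indices again during the actual indexing. So matlab should be faster... Why is it not?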
