2

我有两个矩阵,AB。(B是连续的像1:n

我需要找到 in 的每一行的所有出现BA并将这些行索引相应地存储在单元格数组C中。请参阅下面的示例。

A = [3,4,5;1,3,5;1,4,3;4,2,1]
B = [1;2;3;4;5]

因此,

C = {[2,3,4];[4];[1,2,3];[1,3,4];[1,2]}

注意C不需要在我的应用程序的单元格数组中。我只建议它,因为 的行向量C长度不等。如果您可以提出解决方法,这也很好。

我尝试对 的每一行使用运行 ismember 的循环,但是当矩阵很大B时,这太慢了A,大约有一百万个条目。B向量化代码表示赞赏。

(为了给你上下文,这样做的目的是在网格中识别那些附加到单个顶点的面。注意我不能使用函数 edgeattachments 因为我的数据在三角剖分表示中不是“TR”形式。我所拥有的只是一个面列表和顶点列表。)

4

2 回答 2

2

好吧,对此的最佳答案需要了解如何填充 A。如果 A 是稀疏的,也就是说,如果它的列值很少而 B 很大,那么我认为节省内存的最佳方法可能是使用稀疏矩阵而不是单元格。

% No fancy stuff, just fast and furious 
bMax = numel(B);
nRows = size(A,1);

cLogical = sparse(nRows,bMax);

for curRow = 1:nRows
  curIdx = A(curRow,:);
  cLogical(curRow,curIdx) = 1;
end

回答:

cLogical =

   (2,1)        1
   (3,1)        1
   (4,1)        1
   (4,2)        1
   (1,3)        1
   (2,3)        1
   (3,3)        1
   (1,4)        1
   (3,4)        1
   (4,4)        1
   (1,5)        1
   (2,5)        1

如何阅读答案。对于每一列,行显示列索引出现在 A 中的索引。即1出现在行中[2 3 4]2出现在行中[4]3行中[1 2 3]4行中[1 3 4]5行中[1 2]

然后,您cLogical将来可以根据需要使用代替单元格作为索引矩阵。

另一种方法是为 C 分配索引应在 C 中出现多少次的预期值。

% Fancier solution using some assumed knowledge of A
bMax = numel(B);
nRows = size(A,1);
nColumns = size(A,2);

% Pre-allocating with the expected value, an attempt to reduce re-allocations.
% tic; for rep=1:10000; C = mat2cell(zeros(bMax,nColumns),ones(1,bMax),nColumns); end; toc 
% Elapsed time is 1.364558 seconds.
% tic; for rep=1:10000; C = repmat({zeros(1,nColumns)},bMax,1); end; toc
% Elapsed time is 0.606266 seconds.
% So we keep the not fancy repmat solution
C = repmat({zeros(1,nColumns)},bMax,1);
for curRow = 1:nRows
  curIdxMsk = A(curRow,:);
  for curCol = 1:nColumns
    curIdx = curIdxMsk(curCol);
    fillIdx = ~C{curIdx};
    if any(fillIdx) 
      fillIdx = find(fillIdx,1);
    else
      fillIdx = numel(fillIdx)+1;
    end
    C{curIdx}(fillIdx) = curRow;
  end
end

% Squeeze empty indexes:
for curRow = 1:bMax
  C{curRow}(~C{curRow}) = [];
end

回答:

>> C{:}

ans =

     2     3     4


ans =

     4


ans =

     1     2     3


ans =

     1     3     4


ans =

     1     2

哪种解决方案性能最好?您在代码中进行性能测试,因为它取决于 A、bMax 的大小、计算机的内存大小等等。然而,我仍然对其他人可以为这个 x) 做的解决方案感到好奇。我喜欢 chappjc 的解决方案,尽管它有他指出的缺点。

对于给定的示例(10k 次):

Solution 1: Elapsed time is 0.516647 seconds. 
Solution 2: Elapsed time is 4.201409 seconds (seems that solution 2 is a bad idea hahaha, but since it was created to the specific issue of A having many rows it has to be tested in those conditions).
chappjc' solution: Elapsed time is 2.405341 seconds.
于 2013-10-04T14:45:19.567 回答
1

We can do it without making any assumptions about B. Try this use of bsxfun and mat2cell:

M = squeeze(any(bsxfun(@eq,A,permute(B,[3 2 1])),2)); % 4x3x1 @eq 1x1x5 => 4x3x5
R = sum(M); % 4x5 -> 1x5
[ii,jj] = find(M);
C = mat2cell(ii,R)

The cells in C above will be column vectors rather than rows as in your example. To make the cells contain row vectors, use C = mat2cell(ii',1,R)' instead.

My only concern is that mat2cell could be slow for millions of values of R, but if you want your output in a cell, I'm not sure how much better you can do. EDIT: If you can deal with a sparse matrix like in Werner's first solution with the loop, replace the last line of the above with the following:

>> Cs = sparse(ii,jj,1)
Cs =
   (2,1)        1
   (3,1)        1
   (4,1)        1
   (4,2)        1
   (1,3)        1
   (2,3)        1
   (3,3)        1
   (1,4)        1
   (3,4)        1
   (4,4)        1
   (1,5)        1
   (2,5)        1

Unfortunately, bsxfun will probably run out of memory if both size(A,1) and numel(B) are large! You may have to loop over the elements of A or B if memory becomes an issue. Here's one way to do it by looping over your vertexes in B:

for i=1:numel(B), C{i} = find(any(A==B(i),2)); end

Yup, that easy. Cell array growing is extremely fast in MATLAB as it similar to a sequence container that stores contiguous references to the data, rather than keeping the data itself contiguous. Perhaps ismember was the bottleneck in your test.

于 2013-10-04T05:09:41.307 回答