matlab - For large sparse matrices in MATLAB, compute the cumulative sum across the columns for non-zero entries?

Question

In MATLAB have a large matrix with transition probabilities transition_probs, and an adjacency matrix adj_mat. I want to compute the cumulative sum of the transition matrix along the columns and then element wise multiply it against the adjacency matrix which acts as a mask in this way:

cumsumTransitionMat = cumsum(transition_probs,2) .* adj_mat;

I get a MEMORY error because with the cumsum all the entries of the matrix are then non-zero.

I would like to avoid this problem by only having the cumulative sum entries where there are non zero entries in the first place. How can this be done without the use of a for loop?

score 2 · Accepted Answer

当 CUMSUM 应用于行时，对于每一行，它将填充从它找到的第一个非零列开始直到最后一列的值，这就是它的定义。

就存储而言，最坏的情况是稀疏矩阵在第一列包含值，最好的情况是所有非零值都出现在最后一列。例子：

% worst case
>> M = sparse([ones(5,1) zeros(5,4)]);
>> MM = cumsum(M,2);       % completely dense matrix
>> nnz(MM)
ans =
    25

% best case
>> MM = cumsum(fliplr(M),2);

如果生成的矩阵不适合内存，我看不出你还能做什么，除了可能在行上使用 for 循环，并且处理矩阵是较小的批次......

请注意，您不能在计算累积和之前应用屏蔽操作，因为这会改变结果。所以你不能说cumsum(transition_probs .* adj_mat, 2)。

score 2 · Accepted Answer

您只能应用于cumsum非零元素。这是一些代码：

A = sparse(round(rand(100,1)));        %some sparse data
A_cum = A;                             %instantiate A_cum by copy A

idx_A = find(A);                       %find non-zeros
A_cum(idx_A) = cumsum(A(idx_A)); %cumsum on non-zeros elements only

您可以检查输出

 B = cumsum(A);


   A_cum   B
     1     1
     0     1
     0     1
     2     2
     3     3
     4     4
     5     5
     0     5
     0     5
     6     6

并isequal(A_cum(find(A_cum)), B(find(A_cum)))给出1。

matlab - For large sparse matrices in MATLAB, compute the cumulative sum across the columns for non-zero entries?

2 回答 2

Related

Reference