I am using scipy to do a sparse matrix SVD on some big data. The matrix is about 200,000 * 8,000,000 in size, with 1.19% non-zero entries. The machine I am using has 160 GB of RAM, so I assumed memory should not be a problem.

Here is the code I am using:

from scipy import *
from scipy.sparse import *
import scipy.sparse.linalg as slin
from numpy import *

K = 1500                                      # number of singular values/vectors to compute
# value, row, col, M and N are built earlier from my data (not shown here)
coom = coo_matrix((value, (row, col)), shape=(M, N))
coom = coom.astype('float32')                 # cast the stored data to float32
u, s, v = slin.svds(coom, K, ncv=8*K)

The error message is as follows:

Traceback (most recent call last):
  File "sparse_svd.py", line 35, in <module>
    u,s,v=slin.svds(coom,K,ncv=2*K+1)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 731, in svds
    eigvals, eigvec = eigensolver(XH_X, k=k, tol=tol**2)
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 680, in eigsh
    params.iterate()
  File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 278, in iterate
    raise ArpackError(self.info)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error 3: No shifts could be applied during a cycle of the Implicitly restarted Arnoldi iteration. One possibility is to increase the size of NCV relative to NEV.

Everything works fine when K=1000 (i.e. #eigenvalues = 1000). The error starts to appear once I try K >= 1250. I have also tried various ncv values and still get the same error message...
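
For reference, the ncv sweep I mention is essentially the following (a minimal sketch; the ncv values listed are illustrative, not an exact record of what I ran, and it assumes ArpackError is reachable as slin.ArpackError):

for ncv in (2*K + 1, 3*K, 4*K, 8*K):          # illustrative ncv values; ARPACK requires K < ncv
    try:
        u, s, v = slin.svds(coom, K, ncv=ncv)
        print("succeeded with ncv=%d" % ncv)
        break
    except slin.ArpackError as e:             # I get the same ARPACK error 3 each time
        print("ncv=%d failed: %s" % (ncv, e))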

Any suggestions and help are appreciated. Many thanks :)
