matrix - Loading and saving the ijv (row, column, value) format with NumPy

Question

I want to read a file into a NumPy matrix. Each line of the file has the structure "row;column;value", so a matrix like

m = numpy.matrix([[1,2,3],[4,5,6]])

would be stored in a file that begins with the following lines:

0;0;1
0;1;2
0;2;3
1;0;4
...

I haven't found a built-in way to load and save such files in NumPy, and doing it by hand could be very slow. Which approach would you suggest?

score 2 · Accepted Answer

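There is no built-in way. As noted in the comments, it is usually much faster to save the shape and then a flat 1D version of the matrix. However, unless your matrix is large, doing it by hand is not a huge bottleneck. There are many ways to do this; here is an example using numpy.nditer: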

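import numpy as np

m = np.matrix([[1,2,3],[4,5,6]])
it = np.nditer(m, flags=['multi_index'])
# The with block closes the file, so every line is flushed to disk
with open('output.txt', 'w') as f:
    while not it.finished:
        # multi_index holds the (row, column) of the current element
        f.write('%i;%i;%i\n' % (it.multi_index[0], it.multi_index[1], it[0]))
        it.iternext()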

This gives:

0;0;1
0;1;2
0;2;3
1;0;4
1;1;5
1;2;6
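The shape-plus-flat-values idea mentioned in this answer could look roughly like the sketch below. The helper names save_flat and load_flat, and the file layout (shape on the first line, then one value per line), are assumptions made for illustration; they are not part of NumPy or of the original comments.

```python
import io
import numpy as np

def save_flat(file_, a):
    """Hypothetical helper: write 'rows;cols' first, then one value per line."""
    a = np.asarray(a)
    file_.write(';'.join(str(n) for n in a.shape) + '\n')
    # A 1D array is written by savetxt as one value per row
    np.savetxt(file_, a.ravel(), fmt='%d')

def load_flat(file_):
    """Hypothetical helper: read the shape line, then reshape the flat values."""
    shape = tuple(int(n) for n in file_.readline().split(';'))
    return np.loadtxt(file_, dtype=int, ndmin=1).reshape(shape)

m = np.array([[1, 2, 3], [4, 5, 6]])
buf = io.StringIO()
save_flat(buf, m)
buf.seek(0)
restored = load_flat(buf)
```

This sketch assumes integer data (fmt='%d' and dtype=int). If a text format is not required, np.save and np.load store the shape and dtype for you in a binary file.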

score 1 · Accepted Answer

You can build a few simple functions to do these conversions:

import numpy as np

def to_ijv(a):
    # One record per element: row index, column index, value
    rows, cols = a.shape
    ijv = np.empty((a.size,), dtype=[('i', np.intp),
                                     ('j', np.intp),
                                     ('v', a.dtype)])
    # Indices follow the row-major order of a.ravel()
    ijv['i'] = np.repeat(np.arange(rows), cols)
    ijv['j'] = np.tile(np.arange(cols), rows)
    ijv['v'] = a.ravel()
    return ijv

def from_ijv(ijv):
    # The shape is inferred from the largest row and column index
    rows, cols = np.max(ijv['i']) + 1, np.max(ijv['j']) + 1
    a = np.empty((rows, cols), dtype=ijv['v'].dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a
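One caveat: from_ijv starts from np.empty, so any cell that is not listed in the records keeps whatever happened to be in memory. If your files may omit entries, a variant that starts from a fill value is safer. The following is a minimal sketch; from_ijv_filled and its fill parameter are hypothetical names, not part of the functions above.

```python
import numpy as np

def from_ijv_filled(ijv, fill=0):
    """Hypothetical variant of from_ijv: unlisted cells get `fill`."""
    rows = int(ijv['i'].max()) + 1
    cols = int(ijv['j'].max()) + 1
    # np.full initialises every cell, unlike np.empty
    a = np.full((rows, cols), fill, dtype=ijv['v'].dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a

# Only two of the six cells of a 2x3 matrix are present
ijv = np.array([(0, 0, 1.5), (1, 2, 2.5)],
               dtype=[('i', np.intp), ('j', np.intp), ('v', np.float64)])
dense = from_ijv_filled(ijv)
```

Note that the shape is still inferred from the largest indices present, so trailing rows or columns that are entirely missing cannot be recovered this way.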

If your matrices are large, you can use the built-in loadtxt and savetxt to read and write to disk:

def save_ijv(file_, a):
    ijv = to_ijv(a)
    np.savetxt(file_, ijv, delimiter=';', fmt=('%d', '%d', '%f'))

def read_ijv(file_):
    # np.float was removed in NumPy 1.24; np.float64 is the equivalent dtype
    ijv = np.loadtxt(file_, delimiter=';',
                     dtype=[('i', np.intp),('j', np.intp),
                            ('v', np.float64)])
    return from_ijv(ijv)

These functions assume floating point values, so you will have to edit the format and dtype explicitly if you want e.g. integers. Other than that it works nicely:

>>> a = np.arange(1, 7).reshape(3, 2)
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> to_ijv(a)
array([(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4), (2, 0, 5),
       (2, 1, 6)],
      dtype=[('i', '<i8'), ('j', '<i8'), ('v', '<i4')])
>>> import io
>>> file_ = io.StringIO()
>>> save_ijv(file_, a)
>>> print(file_.getvalue())
0;0;1.000000
0;1;2.000000
1;0;3.000000
1;1;4.000000
2;0;5.000000
2;1;6.000000

>>> file_.seek(0)
0
>>> b = read_ijv(file_)
>>> b
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.]])
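To keep integers as integers, one option is to choose the value format from the array's dtype when saving and to pass the wanted dtype when reading. The following is a minimal sketch under that assumption; save_ijv_typed and read_ijv_typed are hypothetical names, and the '%.17g' float format is an arbitrary choice rather than something from the functions above.

```python
import io
import numpy as np

def save_ijv_typed(file_, a):
    """Hypothetical variant of save_ijv: integer arrays are written as integers."""
    a = np.asarray(a)
    i, j = np.indices(a.shape)
    ijv = np.empty(a.size, dtype=[('i', np.intp), ('j', np.intp), ('v', a.dtype)])
    ijv['i'] = i.ravel()
    ijv['j'] = j.ravel()
    ijv['v'] = a.ravel()
    # Pick the value format from the dtype instead of hard-coding '%f'
    value_fmt = '%d' if np.issubdtype(a.dtype, np.integer) else '%.17g'
    np.savetxt(file_, ijv, delimiter=';', fmt=('%d', '%d', value_fmt))

def read_ijv_typed(file_, dtype=float):
    """Hypothetical variant of read_ijv: the caller chooses the value dtype."""
    ijv = np.loadtxt(file_, delimiter=';', ndmin=1,
                     dtype=[('i', np.intp), ('j', np.intp), ('v', dtype)])
    a = np.zeros((ijv['i'].max() + 1, ijv['j'].max() + 1), dtype=dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a

a = np.arange(1, 7).reshape(3, 2)
buf = io.StringIO()
save_ijv_typed(buf, a)
buf.seek(0)
b = read_ijv_typed(buf, dtype=int)
```

The reader cannot tell from the text alone which dtype was intended, which is why the sketch takes it as an argument.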
