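I want to read a file into a numpy matrix. Each line of the file has the structure "row;column;value", so a matrix like
m = numpy.matrix([[1,2,3],[4,5,6]])
would be stored in a file that starts with the following lines: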
0;0;1
0;1;2
0;2;3
1;0;4
...
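I did not find a built-in way in Numpy to load and save this kind of file, and doing it manually can be very slow. Which approach would you suggest?
There is no built-in method. As pointed out in the comments, it is generally much faster to save the shape and then save a flat 1D version of the matrix. However, unless your matrix is large, doing it manually is not a huge bottleneck. There are many ways to do this; here is an example using numpy.nditer: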
import numpy as np

m = np.matrix([[1,2,3],[4,5,6]])
# Visit every element while tracking its (row, column) index
with open('output.txt', 'w') as f:
    it = np.nditer(m, flags=['multi_index'])
    while not it.finished:
        f.write('%i;%i;%i\n' % (it.multi_index[0], it.multi_index[1], it[0]))
        it.iternext()
This gives:
0;0;1
0;1;2
0;2;3
1;0;4
1;1;5
1;2;6
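The faster alternative mentioned above (save the shape, then the flat 1D values) could look roughly like the sketch below. The helper names `save_flat` and `load_flat` and the file layout (shape on the first line, one value per line after it) are assumptions made for illustration, not part of the original answer, and the sketch assumes integer values.

```python
import io
import numpy as np

def save_flat(file_, a):
    """Hypothetical helper: write the shape first, then the flattened values."""
    a = np.asarray(a)
    file_.write(';'.join(str(n) for n in a.shape) + '\n')
    np.savetxt(file_, a.ravel(), fmt='%d')  # one integer value per line

def load_flat(file_):
    """Hypothetical helper: read the shape line, then rebuild the array."""
    shape = tuple(int(n) for n in file_.readline().split(';'))
    return np.loadtxt(file_, dtype=int).reshape(shape)

# An in-memory buffer stands in for a real file here
buf = io.StringIO()
m = np.array([[1, 2, 3], [4, 5, 6]])
save_flat(buf, m)
buf.seek(0)
restored = load_flat(buf)
```

This stores no row or column indices at all, so the file is smaller and the round trip is a single reshape, at the cost of no longer matching the "row;column;value" layout from the question.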
You can build a couple of simple functions to do these conversions:
def to_ijv(a):
    # Flatten a 2D array into a structured array of (row, column, value) records
    rows, cols = a.shape
    ijv = np.empty((a.size,), dtype=[('i', np.intp),
                                     ('j', np.intp),
                                     ('v', a.dtype)])
    ijv['i'] = np.repeat(np.arange(rows), cols)
    ijv['j'] = np.tile(np.arange(cols), rows)
    ijv['v'] = a.ravel()
    return ijv

def from_ijv(ijv):
    # Rebuild the 2D array; its shape is inferred from the largest indices
    rows, cols = np.max(ijv['i']) + 1, np.max(ijv['j']) + 1
    a = np.empty((rows, cols), dtype=ijv['v'].dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a
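One thing to keep in mind: because from_ijv allocates with np.empty, any position that never appears in the records is left uninitialized. If a file might omit entries, a variant along the following lines could be used. The name `from_ijv_filled` and the `fill_value` parameter are hypothetical and introduced only for this sketch.

```python
import numpy as np

def from_ijv_filled(ijv, fill_value=0):
    """Hypothetical variant of from_ijv that pre-fills missing positions."""
    rows = int(ijv['i'].max()) + 1
    cols = int(ijv['j'].max()) + 1
    # np.full gives every cell a defined value before the records are scattered in
    a = np.full((rows, cols), fill_value, dtype=ijv['v'].dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a

# Only two of the six positions of a 2x3 matrix are present
records = np.array([(0, 0, 1.0), (1, 2, 5.0)],
                   dtype=[('i', np.intp), ('j', np.intp), ('v', float)])
dense = from_ijv_filled(records)
```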
If your matrices are large, you can use the built-in loadtxt and savetxt to read and write to disk:
def save_ijv(file_, a):
    ijv = to_ijv(a)
    # Indices are written as integers, values as floats
    np.savetxt(file_, ijv, delimiter=';', fmt=('%d', '%d', '%f'))

def read_ijv(file_):
    ijv = np.loadtxt(file_, delimiter=';',
                     dtype=[('i', np.intp), ('j', np.intp),
                            ('v', float)])
    return from_ijv(ijv)
As written, these functions assume floating-point values, so you will have to edit the format string and dtype explicitly if you want, e.g., integers. Other than that, it works nicely:
>>> a = np.arange(1, 7).reshape(3, 2)
>>> a
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> to_ijv(a)
array([(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4), (2, 0, 5), (2, 1, 6)],
      dtype=[('i', '<i8'), ('j', '<i8'), ('v', '<i4')])
>>> import io
>>> file_ = io.StringIO()
>>> save_ijv(file_, a)
>>> print(file_.getvalue())
0;0;1.000000
0;1;2.000000
1;0;3.000000
1;1;4.000000
2;0;5.000000
2;1;6.000000
>>> file_.seek(0)
0
>>> b = read_ijv(file_)
>>> b
array([[1., 2.],
       [3., 4.],
       [5., 6.]])
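For the integer case mentioned above, one possible adjustment is to make the value format and dtype parameters instead of hard-coding floats. This is only a sketch: the names `save_ijv_typed` and `read_ijv_typed` and the parameters `value_fmt` and `value_dtype` are hypothetical, and the index arrays are built with np.indices rather than the repeat/tile approach above.

```python
import io
import numpy as np

def save_ijv_typed(file_, a, value_fmt='%d'):
    """Hypothetical variant of save_ijv with a configurable value format."""
    a = np.asarray(a)
    i, j = np.indices(a.shape)  # row and column index of every element
    ijv = np.empty(a.size, dtype=[('i', np.intp), ('j', np.intp), ('v', a.dtype)])
    ijv['i'], ijv['j'], ijv['v'] = i.ravel(), j.ravel(), a.ravel()
    np.savetxt(file_, ijv, delimiter=';', fmt=('%d', '%d', value_fmt))

def read_ijv_typed(file_, value_dtype=int):
    """Hypothetical variant of read_ijv with a configurable value dtype."""
    ijv = np.atleast_1d(np.loadtxt(
        file_, delimiter=';',
        dtype=[('i', np.intp), ('j', np.intp), ('v', value_dtype)]))
    a = np.zeros((ijv['i'].max() + 1, ijv['j'].max() + 1), dtype=value_dtype)
    a[ijv['i'], ijv['j']] = ijv['v']
    return a

a = np.arange(1, 7).reshape(3, 2)
buf = io.StringIO()
save_ijv_typed(buf, a)            # values are written as plain integers
text = buf.getvalue()
buf.seek(0)
restored = read_ijv_typed(buf)    # values are read back as integers
```

With the defaults, the first line of the file is `0;0;1` rather than `0;0;1.000000`. Passing `value_fmt='%f'` and `value_dtype=float` should reproduce the floating-point behavior shown above.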