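Does anyone know how to work with huge matrices in Python? I have to work with adjacency matrices of shape (10^6, 10^6) and perform operations including addition, scaling, and dot products. With NumPy arrays I run into RAM problems.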

1 Answer

How about something like this...

import numpy as np

# Create large arrays x and y.
# Note they are 1e4 not 1e6 b/c of memory issues creating random numpy matrices (CookieOfFortune) 
# However, the same principles apply to larger arrays
x = np.random.randn(10000, 10000)
y = np.random.randn(10000, 10000)

# Create memory maps for x and y arrays
xmap = np.memmap('xfile.dat', dtype='float32', mode='w+', shape=x.shape)
ymap = np.memmap('yfile.dat', dtype='float32', mode='w+', shape=y.shape)

# Fill memory maps with data
xmap[:] = x[:]
ymap[:] = y[:]

# Create memory map for out of core dot product result
prodmap = np.memmap('prodfile.dat', dtype='float32', mode='w+', shape=x.shape)

# Do the dot product and write the result to disk.
# Note: np.dot builds the full result in RAM before it is copied into prodmap.
prodmap[:] = np.dot(xmap, ymap)

# Create memory map for out of core addition result
addmap = np.memmap('addfile.dat', dtype='float32', mode='w+', shape=x.shape)

# Do out of core addition, writing directly into the memory map
np.add(xmap, ymap, out=addmap)

# Create memory map for out of core scaling result
scalemap = np.memmap('scalefile.dat', dtype='float32', mode='w+', shape=x.shape)

# Define scaling constant
scale = 1.3

# Do out of core scaling, writing directly into the memory map
np.multiply(xmap, scale, out=scalemap)

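This code creates the files xfile.dat, yfile.dat, etc., which hold the arrays in raw binary format. To access them later, call np.memmap(filename, dtype='float32', mode='r', shape=...). The dtype and shape arguments are technically optional, but the file stores no metadata: without them np.memmap returns a flat uint8 array, so pass the same dtype and shape that were used when writing.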
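Here is a minimal sketch of reopening the files and computing the product one block of rows at a time, so that only a slice of the result has to sit in RAM at once. The sizes `n` and `block`, the temporary directory, and the row-blocking scheme are my own assumptions for illustration and are not part of the code above. Note that `ymap` is still read in full for every block, so this is a starting point rather than a tuned out-of-core multiply.

```python
import os
import tempfile

import numpy as np

# Hypothetical small sizes so the sketch runs quickly; real arrays would be far larger.
n, block = 64, 16
workdir = tempfile.mkdtemp()
xpath = os.path.join(workdir, "xfile.dat")
ypath = os.path.join(workdir, "yfile.dat")
ppath = os.path.join(workdir, "prodfile.dat")

# Write two float32 arrays to disk through memory maps.
rng = np.random.default_rng(0)
for path in (xpath, ypath):
    m = np.memmap(path, dtype="float32", mode="w+", shape=(n, n))
    m[:] = rng.standard_normal((n, n))
    m.flush()  # push pending writes to the file
    del m      # drop the mapping

# Reopen read-only; dtype and shape must match what was written.
xmap = np.memmap(xpath, dtype="float32", mode="r", shape=(n, n))
ymap = np.memmap(ypath, dtype="float32", mode="r", shape=(n, n))

# Without dtype and shape the same file comes back as a flat uint8 array.
raw = np.memmap(xpath, mode="r")

# Compute the product block by block, writing each block of rows to disk.
prodmap = np.memmap(ppath, dtype="float32", mode="w+", shape=(n, n))
for start in range(0, n, block):
    stop = min(start + block, n)
    prodmap[start:stop] = xmap[start:stop] @ ymap
prodmap.flush()
```

A larger `block` means fewer passes over `ymap` but more RAM per step, so it is worth picking the largest value that fits comfortably in memory.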

answered 2013-03-26T21:22:19.727