Try using memory-mapped files.
Moderate memory use, tolerably fast
If you can afford to hold one of the arrays in memory, this is tolerably fast:
import numpy as np
from scipy.ndimage import gaussian_filter
# create some fake data, save it to disk, and free up its memory
shape = (10000,10000)
orig = np.random.random_sample(shape)
orig.tofile('orig.dat')
print('saved original')
del orig
# allocate memory for the smoothed data
smoothed = np.zeros(shape)
# memory-map the original data, so it isn't read into memory all at once
orig = np.memmap('orig.dat', np.float64, 'r', shape=shape)
print('memmapped')
sigma = 10 # I have no idea what a reasonable value is here
gaussian_filter(orig, sigma, output = smoothed)
# save the smoothed data to disk
smoothed.tofile('smoothed.dat')
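To check the result later, the raw file can be read back. A minimal sketch (the small shape and the 'smoothed_demo.dat' file name are chosen here just so the sketch runs quickly; the idea applies unchanged to 'smoothed.dat' above):

```python
import numpy as np

shape = (100, 100)  # much smaller than above, just for illustration
data = np.random.random_sample(shape)
data.tofile('smoothed_demo.dat')

# tofile() writes raw bytes with no header, so the dtype and the shape
# must be supplied again when the file is read back
restored = np.fromfile('smoothed_demo.dat', dtype=np.float64).reshape(shape)
print(np.allclose(restored, data))  # the raw round trip preserves the values
```

The same `np.memmap(..., 'r', shape=shape)` call used for `orig` also works here if the file is too large to load at once.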
Low memory use, very slow
If you can't fit either array in memory at once, you can memory-map both the original and the smoothed array. At least on my machine, this has very low memory use, but it is very slow.
Ignore the first part of this code: it cheats by creating the original array in memory all at once and then saving it to disk. You would replace that part with code that loads the data you built up incrementally on disk.
import numpy as np
from scipy.ndimage import gaussian_filter
# create some fake data, save it to disk, and free up its memory
shape = (10000,10000)
orig = np.random.random_sample(shape)
orig.tofile('orig.dat')
print('saved original')
del orig
# memory-map the original data, so it isn't read into memory all at once
orig = np.memmap('orig.dat', np.float64, 'r', shape=shape)
# create a memory mapped array for the smoothed data
smoothed = np.memmap('smoothed.dat', np.float64, 'w+', shape = shape)
print('memmapped')
sigma = 10 # I have no idea what a reasonable value is here
gaussian_filter(orig, sigma, output = smoothed)
# flush the memmap so the smoothed data is actually written to disk
smoothed.flush()
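The cheating fake-data section can be replaced by building the file incrementally. A minimal sketch of that idea (the small shape, the chunk size, and the 'orig_demo.dat' file name are illustrative assumptions), writing one block of rows at a time through a writable memmap so the whole array never sits in memory:

```python
import numpy as np

shape = (1000, 1000)   # much smaller than above, just for illustration
rows_per_chunk = 100   # arbitrary chunk size for this sketch

# mode 'w+' creates the file at full size on disk without allocating
# the whole array in memory
orig = np.memmap('orig_demo.dat', np.float64, 'w+', shape=shape)
for start in range(0, shape[0], rows_per_chunk):
    stop = min(start + rows_per_chunk, shape[0])
    # each assignment writes only this block of rows
    orig[start:stop] = np.random.random_sample((stop - start, shape[1]))
orig.flush()  # make sure every chunk reaches the disk
del orig      # close the writable memmap before reopening it read-only
```

The finished file can then be reopened with `np.memmap('orig_demo.dat', np.float64, 'r', shape=shape)` and fed to `gaussian_filter` exactly as above.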