python - Memory usage increases when reading a TIFFArray

Question

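I have a list of "TIFFFiles", each of which contains a "TIFFArray" holding 60 TIFF images of 2776x2080 pixels each. The images are read as numpy.memmap objects. I want to access all intensities of the images (shape of imgs: (60, 2776, 2080)). I use the following code: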

for i in range(18):
    # get an instance of type TIFFArray from tiff_list
    tiffs = get_tiff_arrays(smp_ppx, type_subfile, tiff_list[i])

    # access all intensities from tiffs
    imgs = tiffs[:, :, :]

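Even though "tiffs" and "imgs" are overwritten in every iteration step, my memory usage grows by 2.6 GByte per step. How can I avoid copying the data in every iteration step? Is there a way to reuse the 2.6 GByte of memory?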

score 0 · Accepted Answer

I know this is probably not an answer, but it might help anyway, and it was too long for a comment.

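Some time ago I ran into memory problems when reading large (>1 GB) ASCII files with numpy: basically, reading the file with numpy.loadtxt used up all the memory (8 GB) plus some swap.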

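As far as I understand, if you know in advance the size of the array to be filled, you can allocate it up front and fill it yourself instead of letting the reader build it. Note that numpy.loadtxt has no parameter for passing in a preallocated array, so this means parsing the file in pieces into the array. This should keep numpy from allocating large temporary objects and may be better memory-wise.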
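A minimal sketch of the preallocation idea, assuming the number of rows and columns is known before reading. The helper name load_into and the sample text are made up for illustration; this is not a numpy API, just one way to fill an existing array row by row.

```python
import io

import numpy as np


def load_into(fileobj, out):
    """Parse whitespace-separated numbers row by row into the existing array `out`.

    Hypothetical helper: it only ever holds one parsed line in memory
    besides `out` itself.
    """
    n_rows = 0
    for line in fileobj:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        out[n_rows, :] = [float(v) for v in line.split()]
        n_rows += 1
    return n_rows


# Small in-memory stand-in for a large ASCII file.
text = "1.0 2.0 3.0\n4.0 5.0 6.0\n# comment\n7.0 8.0 9.0\n"

# The shape is assumed to be known in advance.
out = np.empty((3, 3), dtype=np.float64)
rows = load_into(io.StringIO(text), out)
print(rows)
print(out)
```

The same array can be passed to load_into again for the next file of the same shape, so the big allocation happens only once.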

mmap, or a similar approach, could help with memory usage, but I have never used it.
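For the image stacks in the question, here is a hedged sketch of combining numpy.memmap with a single reusable buffer. It uses small raw files written on the fly as a stand-in for the TIFF stacks, and assumes 8 bytes per pixel (the question does not state the dtype; with 8 bytes, 60 x 2776 x 2080 pixels come to about 2.77e9 bytes, roughly 2.58 GiB, which is close to the reported 2.6 GByte). The file names and write_stack are hypothetical.

```python
import os
import shutil
import tempfile

import numpy as np

# Scaled-down stand-in for one stack (the question uses 60 x 2776 x 2080).
SHAPE = (6, 40, 30)
DTYPE = np.float64  # assumption: 8 bytes per pixel


def write_stack(path, fill):
    """Write a hypothetical raw image stack to disk for the demo."""
    mm = np.memmap(path, dtype=DTYPE, mode="w+", shape=SHAPE)
    mm[:] = fill
    mm.flush()
    del mm


tmpdir = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(tmpdir, "stack_%d.raw" % i)
    write_stack(path, fill=float(i))
    paths.append(path)

# Allocate the working buffer once, outside the loop.
imgs = np.empty(SHAPE, dtype=DTYPE)
buffer_address = imgs.ctypes.data

means = []
for path in paths:
    stack = np.memmap(path, dtype=DTYPE, mode="r", shape=SHAPE)
    # Copy into the existing buffer instead of creating a new array.
    np.copyto(imgs, stack)
    means.append(float(imgs.mean()))
    del stack  # drop the mapping before the next iteration

# Size of one full stack from the question under the 8-byte assumption.
full_size_bytes = 60 * 2776 * 2080 * np.dtype(DTYPE).itemsize

shutil.rmtree(tmpdir, ignore_errors=True)
print(means, full_size_bytes)
```

The point of np.copyto(imgs, stack) is that the destination already exists, so each iteration overwrites the same block of memory rather than asking for a fresh one, which is what a plain slice assignment to a new name would do.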

Edit

The question of memory usage and release puzzled me when I was trying to solve my large-file problem. Basically, I had:

import numpy as np

def read_f(fname):
    arr = np.loadtxt(fname)  # this uses a lot of memory
    # do operations on arr
    something = arr.sum()  # placeholder for the real result
    return something

for f in ["verylargefile", "smallerfile", "evensmallerfile"]:
    result = read_f(f)

From the memory profiling I did, no memory was released when loadtxt returned, nor when read_f returned and was called again with a smaller file.
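One way to look at this kind of behaviour is the standard-library tracemalloc module. The sketch below is only an illustration with in-memory text instead of real files, and traced_read is a made-up helper. Keep in mind that tracemalloc reports what Python's allocators have handed out, not the process size the operating system shows; memory that has been freed may still be held by the process, which could explain a profile that never appears to shrink.

```python
import gc
import io
import tracemalloc

import numpy as np


def read_f(fileobj):
    arr = np.loadtxt(fileobj)  # the large temporary lives only inside this call
    return float(arr.sum())    # return a small result, not the array itself


def traced_read(text):
    """Return (result, bytes still traced afterwards, peak bytes traced)."""
    tracemalloc.start()
    result = read_f(io.StringIO(text))
    gc.collect()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, current, peak


# In-memory stand-ins for a large and a small file.
inputs = {"large": "1 2 3 4\n" * 20000, "small": "1 2 3 4\n" * 20}
report = {name: traced_read(text) for name, text in inputs.items()}
for name, (result, current, peak) in report.items():
    print(name, result, current, peak)
```

If the traced "current" value drops well below the peak after read_f returns, the allocations were released from Python's point of view, and any remaining growth would have to be looked for elsewhere (for example in references that are still alive, or in how the allocator returns memory to the system).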
