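But reading 1024 structFmt structures from a file is a problem (for me, at least). I think reading the struct 1024 times and appending each result to a list is overhead. I don't want to use something like numpy.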


I'd look at mmap-ing the file and then calling the ctypes class method from_buffer(). That maps a ctypes-defined array of structs onto the file: http://docs.python.org/library/ctypes#ctypes-arrays

This maps the structs over the mmap-ed file without having to explicitly read, convert, and copy anything.


Just for fun, here is a quick example using mmap. (I created the file with dd: dd if=/dev/zero of=./test.dat bs=96 count=10240)

from ctypes import Structure
from ctypes import c_char, c_long, c_double
import mmap
from timeit import Timer

class StructFMT(Structure):
    # char[64], long[2], double[3]
    _fields_ = [('ch', c_char * 64), ('lo', c_long * 2), ('db', c_double * 3)]

d_array = StructFMT * 1024   # array type holding 1024 structs

def doit():
    with open('test.dat', 'r+b') as f:     # the file is closed when the block exits
        m = mmap.mmap(f.fileno(), 0)       # length 0 maps the whole file
        data = d_array.from_buffer(m)      # view the mapping as structs, no copy

        for i in data:
            # just access each row and each field of the struct and do something with the data
            i.ch, i.lo[0] * 10, i.db[2] * 1.0


if __name__ == '__main__':
    t = Timer("doit()", "from __main__ import doit")
    print(t.timeit(number=10))
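
One thing worth checking with this approach: c_long is 8 bytes on most 64-bit Unix platforms, so sizeof(StructFMT) can come out as 104 rather than the 96 bytes per record that dd wrote above. Below is a minimal sketch, assuming the records use 4-byte integers (as in the standard-size struct format "=64s 2L 3d"). FixedFMT is a hypothetical name, and c_uint32 is swapped in for c_long so the layout does not depend on the platform.

```python
import ctypes
import struct

class FixedFMT(ctypes.Structure):
    # Same fields as StructFMT, but with a fixed-width integer type
    # (assumption: the file stores 4-byte unsigned integers).
    _fields_ = [('ch', ctypes.c_char * 64),
                ('lo', ctypes.c_uint32 * 2),
                ('db', ctypes.c_double * 3)]

# Size of one record in the standard-size, unpadded struct layout.
record_size = struct.calcsize("=64s 2L 3d")

# Build one record with struct and reinterpret the bytes as a FixedFMT.
raw = struct.pack("=64s 2L 3d", b"abc", 1, 2, 0.5, 1.5, 2.5)
rec = FixedFMT.from_buffer_copy(raw)

print(ctypes.sizeof(FixedFMT), record_size)
print(rec.ch, rec.lo[1], rec.db[2])
```

If the two sizes printed on the first line differ, the ctypes definition and the file layout disagree, and every record after the first would be read at the wrong offset.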

Alas, there is no analog for array that holds complex structs.

The usual technique is to make many calls to struct.unpack and append the results to a list.

import struct

structFmt = "=64s 2L 3d"    # char[ 64 ] long[ 2 ] double [ 3 ]
structLen = struct.calcsize(structFmt)
results = []
with open("path/to/file", "rb") as f:
    while True:
        structBytes = f.read(structLen)
        if len(structBytes) < structLen:
            break    # end of file (or a trailing partial record)
        results.append(struct.unpack(structFmt, structBytes))

If you're concerned about being efficient, know that struct.unpack caches the parsed structure between successive calls.
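If the per-record read and append is the real concern, one option is to read all the records in a single call and unpack them in one pass. This is a sketch, assuming Python 3.4 or later (where struct.Struct.iter_unpack is available) and a file of whole records in the "=64s 2L 3d" layout. read_records and RECORD are hypothetical names.

```python
import struct

RECORD = struct.Struct("=64s 2L 3d")   # compiled once; 96 bytes per record

def read_records(path, count=1024):
    """Read up to `count` records from `path` with a single read() call."""
    with open(path, "rb") as f:
        buf = f.read(RECORD.size * count)
    # iter_unpack needs a buffer whose length is a multiple of RECORD.size,
    # so drop any trailing partial record.
    usable = len(buf) - len(buf) % RECORD.size
    return list(RECORD.iter_unpack(buf[:usable]))
```

Each element of the returned list is a 6-tuple: the 64-byte string, the two integers, and the three doubles. Calling read_records("path/to/file") would stand in for the loop above.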

