python - Reading Fortran direct-access data and writing formatted data - is Python faster than Fortran?

Question

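Good evening,

I have a simulation written in Fortran that produces large files of unformatted (direct-access) data. From some of these files I want to generate human-readable ASCII files.

For some reason, this (in Python):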

import struct

f = open(filename, 'rb')
for i in range(N):
    pos = i * 64                              # each record is 64 bytes long
    f.seek(pos)
    name = struct.unpack('ffff', f.read(16))  # first four 4-byte floats of the record
    print(name[0], name[1], name[2], name[3])

takes only about 4 seconds (with the output piped to a file in the shell), whereas this (in Fortran):

 open (1,file=inputfile,access='direct',recl=64, action='read',status="OLD")
 open (2, file=outputfile, access="sequential", action="write",status="REPLACE")
 do i=1,(N)
     read(1, rec = i ) a,b,c,d
     write(2,*) a,b,c,d
 enddo

takes about 20 seconds. What am I doing wrong? Is there a faster way to do this in Fortran?

Best regards!

score 4 · Accepted Answer

DISCLAIMER: I don't know whether this solves the problem, but I do know that it can produce time differences of up to a factor of 20. I also only tested writing the data, not reading it.

I was investigating how Fortran interacts with Python, and so I wanted to know how Fortran's binary files are built. While doing this, I noticed that both ifort and gfortran have an option to switch buffered I/O on or off.

ifort: You can specify the keyword BUFFERED=['YES'|'NO'] while opening a file.

gfortran: You can set the environment variable GFORTRAN_UNBUFFERED_ALL to y|Y|1 for unbuffered I/O or to n|N|0 for buffered I/O.

Please note that gfortran buffers I/O by default, while ifort does not.
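To get a feel for why buffering matters with many small records, here is a minimal Python sketch. It is only an analogy: it uses Python's own `buffering` argument to `open`, not the Fortran runtime, and the function name `write_records` and the record count are made up for illustration. The record layout (a 4-byte integer followed by an 8-byte real) is assumed to match the 12-byte records used in the Fortran test in this answer.

```python
import os
import struct
import tempfile
import time

# '=' selects standard sizes without padding: 4-byte int + 8-byte float = 12 bytes.
RECORD = struct.Struct("=id")


def write_records(path, n, buffering):
    """Write n 12-byte records one at a time and return the elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 passes every write() straight to the operating system;
    # buffering=-1 selects Python's default buffered writer.
    with open(path, "wb", buffering=buffering) as f:
        for i in range(1, n + 1):
            f.write(RECORD.pack(i, float(i)))
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 200_000  # hypothetical count, far fewer than the 10M records in the Fortran test
    with tempfile.TemporaryDirectory() as tmp:
        for label, buffering in (("buffered", -1), ("unbuffered", 0)):
            path = os.path.join(tmp, label + ".bin")
            seconds = write_records(path, n, buffering)
            print(f"{label:>10}: {seconds:.3f} s, {os.path.getsize(path)} bytes")
```

Both runs produce identical files; only the number of calls into the operating system differs, which is the likely source of the timing gap. Actual numbers depend on the machine and file system.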

My sample code at the bottom results in the following times:

        |buffered|unbuffered
--------+--------+----------
ifort   |   1.9s |  18.2s
gfortran|   2.4s |  37.5s

This sample code writes a direct-access binary file with 10 million records of 12 bytes each (a 4-byte integer followed by an 8-byte real).

PROGRAM btest
IMPLICIT NONE

INTEGER :: i

! IFORT
OPEN(11,FILE="test_d.bin",ACCESS="DIRECT",FORM="UNFORMATTED",RECL=3, &
& STATUS="REPLACE",BUFFERED="NO") ! ifort defines RECL as words
! GFORTRAN
!OPEN(11,FILE="test_d.bin",ACCESS="DIRECT",FORM="UNFORMATTED",RECL=12, &
!& STATUS="REPLACE") ! gfortran defines RECL as bytes

DO i = 1, 10000000
    WRITE(11,REC=i) i,i*1._8
END DO

CLOSE(11)

END PROGRAM
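For the Python side, the following is a hedged sketch of how a file like test_d.bin could be read back. It assumes that the default Fortran INTEGER is 4 bytes, that the file contains nothing but back-to-back 12-byte records (no record markers), and that it is read on a machine with the same byte order as the one that wrote it. The helper names `read_record` and `read_all` are invented for this example.

```python
import struct

# '=' means native byte order with standard sizes and no padding, so
# 'i' + 'd' is exactly 4 + 8 = 12 bytes. Plain 'id' uses native alignment
# and is typically padded to 16 bytes, which would not match the file.
RECORD = struct.Struct("=id")


def read_record(f, rec):
    """Return (integer, real) for the 1-based record number rec, like REC=rec."""
    f.seek((rec - 1) * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))


def read_all(path):
    """Read the whole file in one call and decode every record.

    iter_unpack raises struct.error if the file size is not a multiple of 12.
    """
    with open(path, "rb") as f:
        data = f.read()
    return list(RECORD.iter_unpack(data))
```

Calling `read_record(f, 3)` on a file object opened with `open("test_d.bin", "rb")` fetches a single record by seeking, while `read_all("test_d.bin")` trades memory for fewer read calls.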

score 0 · Accepted Answer

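Try using stream I/O; see http://www.star.le.ac.uk/~cgp/streamIO.html. This should allow random access without a fixed record size, and it may end up using the same underlying operating-system calls, which will hopefully give the same performance.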
