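I am developing a simple parallel application with MPI that loads a file into memory. The file is exported over NFS to the nodes of a cluster. I noticed that in some cases NFS performance degrades significantly, with thousands of additional TCP packets sent from the server to the clients, and I have traced the problem to the use of fseek() in my code: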

//Seek to data and load them to array
fseek ( fp, ( unsigned int ) dec_number + start, SEEK_SET );

for ( i = 0; i < n * mpi_n; i++ ) {
    if ( ! feof ( fp ) )
        text[i] = fgetc ( fp );

    if ( i > 0 && n > mpi_n && i % mpi_n == 0 )
        fseek ( fp, n - mpi_n, SEEK_CUR );
}
fclose ( fp );

Since the same code without fseek() works fine, is it possible that the server actually re-sends part of the file after each fseek()? How can I improve the performance?

Time with cold NFS cache, without fseek(): ~4 sec
Time with hot NFS cache, without fseek(): ~3 sec
Time with cold NFS cache, with fseek(): ~12 sec
Time with hot NFS cache, with fseek(): ~3 sec

nfswatch snapshot for a cluster of 10 nodes, cold NFS cache, with fseek(), 300 MB file:

Total packets:
1903459 (network)   544803 (to host)        0 (dropped)

Packet counters:
NFS3 Read:                  116290      21%
NFS3 Write:                     10       0%
NFS Read:                        0       0%
NFS Write:                       0       0%
NFS Mount:                       0       0%
Port Mapper:                     0       0%
RPC Authorization:              29       0%
Other RPC Packets:               0       0%

TCP Packets:                544386     100%
UDP Packets:                    17       0%
ICMP Packets:                    0       0%
Routing Control:                 0       0%
Address Resolution:              0       0%
Reverse Addr Resol:              0       0%
Ethernet Broadcast:              0       0%
Other Packets:                  49       0%

nfswatch snapshot for a cluster of 10 nodes, cold NFS cache, without fseek(), 2 GB file:

Total packets:
251804 (network)   102650 (to host)        0 (dropped)

Packet counters:
NFS3 Read:                   37039      36%
NFS3 Write:                      1       0%
NFS Read:                        0       0%
NFS Write:                       0       0%
NFS Mount:                       0       0%
Port Mapper:                     0       0%
RPC Authorization:               2       0%
Other RPC Packets:               0       0%

TCP Packets:                102543     100%
UDP Packets:                    30       0%
ICMP Packets:                    1       0%
Routing Control:                 0       0%
Address Resolution:              0       0%
Reverse Addr Resol:              0       0%
Ethernet Broadcast:              0       0%
Other Packets:                  41       0%

The clients mount the export with the following options:

/nfs on /nfs type nfs (rw,rsize=8192,wsize=8192,timeo=14,intr)
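One experiment that might help narrow this down is to take stdio buffering out of the picture. The sketch below is only an assumption-laden illustration, not a confirmed fix: it guesses that each fseek() makes the stdio layer drop its buffer and refill it with more data than the loop then consumes, which nothing in the measurements above proves. It reads the same kind of strided pattern with POSIX pread(), which takes an explicit offset and transfers only the bytes requested. The helper name read_strided and its parameters are hypothetical; for the loop above, chunk would roughly correspond to mpi_n, stride to n, and offset to dec_number + start.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original code): copy `count` chunks of
 * `chunk` bytes into `out`, where chunk k starts at byte
 * `offset + k * stride` of the file behind `fp`.
 * Returns the number of bytes stored, which is short if the file ends early
 * or a read fails. */
size_t read_strided(FILE *fp, off_t offset, size_t chunk, size_t stride,
                    size_t count, char *out)
{
    assert(fp != NULL && out != NULL);
    assert(chunk > 0 && stride >= chunk);

    int fd = fileno(fp);   /* bypass the stdio buffer for the reads */
    size_t total = 0;

    for (size_t k = 0; k < count; k++) {
        off_t pos = offset + (off_t)(k * stride);
        size_t done = 0;

        /* pread() may return fewer bytes than requested, so loop. */
        while (done < chunk) {
            ssize_t got = pread(fd, out + total + done, chunk - done,
                                pos + (off_t)done);
            if (got <= 0)              /* end of file or error */
                return total + done;
            done += (size_t)got;
        }
        total += chunk;
    }
    return total;
}
```

If the packet count with this variant drops to roughly the level of the run without fseek(), that would point at stdio buffer refills rather than at the NFS server itself. If it does not, the cause is more likely the read size or the NFS client's read-ahead, and rsize would be the next thing to vary.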
