I am developing a simple parallel application with MPI that involves loading a file into memory. The file is exported via NFS to the nodes of a compute cluster. I have noticed that in some cases NFS performance degrades significantly, with thousands of extra TCP packets travelling from the server to the clients, and I have narrowed the problem down to the use of fseek() in my code:
//Seek to the data and load it into the array byte by byte
fseek ( fp, ( unsigned int ) dec_number + start, SEEK_SET );
for ( i = 0; i < n * mpi_n; i++ ) {
    if ( ! feof ( fp ) )
        text[i] = fgetc ( fp );
    //Every mpi_n bytes, skip ahead n - mpi_n bytes to the next chunk
    if ( i > 0 && n > mpi_n && i % mpi_n == 0 )
        fseek ( fp, n - mpi_n, SEEK_CUR );
}
fclose ( fp );
Since the same code without fseek() works fine, is it possible that the server is actually re-sending parts of the file after every fseek()? How can I improve this performance?
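For reference, the fseek()-free variant I am comparing against is roughly the following simplified reconstruction (same variables as above, reading sequentially without seeking inside the loop):

//Sequential variant for comparison: no seeking inside the loop
fseek ( fp, ( unsigned int ) dec_number + start, SEEK_SET );
for ( i = 0; i < n * mpi_n; i++ ) {
    if ( ! feof ( fp ) )
        text[i] = fgetc ( fp );
}
fclose ( fp );

The timings I measured for both variants: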
Time with cold NFS cache, without fseek(): ~4 sec
Time with hot NFS cache, without fseek(): ~3 sec
Time with cold NFS cache, with fseek(): ~12 sec
Time with hot NFS cache, with fseek(): ~3 sec
nfswatch snapshot for a 300MB file on a 10-node cluster, with a cold NFS cache and with fseek():
Total packets:
1903459 (network) 544803 (to host) 0 (dropped)
Packet counters:
NFS3 Read: 116290 21%
NFS3 Write: 10 0%
NFS Read: 0 0%
NFS Write: 0 0%
NFS Mount: 0 0%
Port Mapper: 0 0%
RPC Authorization: 29 0%
Other RPC Packets: 0 0%
TCP Packets: 544386 100%
UDP Packets: 17 0%
ICMP Packets: 0 0%
Routing Control: 0 0%
Address Resolution: 0 0%
Reverse Addr Resol: 0 0%
Ethernet Broadcast: 0 0%
Other Packets: 49 0%
nfswatch snapshot for a 2GB file on a 10-node cluster, with a cold NFS cache and without fseek():
Total packets:
251804 (network) 102650 (to host) 0 (dropped)
Packet counters:
NFS3 Read: 37039 36%
NFS3 Write: 1 0%
NFS Read: 0 0%
NFS Write: 0 0%
NFS Mount: 0 0%
Port Mapper: 0 0%
RPC Authorization: 2 0%
Other RPC Packets: 0 0%
TCP Packets: 102543 100%
UDP Packets: 30 0%
ICMP Packets: 1 0%
Routing Control: 0 0%
Address Resolution: 0 0%
Reverse Addr Resol: 0 0%
Ethernet Broadcast: 0 0%
Other Packets: 41 0%
The clients mount the NFS share as follows:
/nfs on /nfs type nfs (rw,rsize=8192,wsize=8192,timeo=14,intr)
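One change I am considering, shown only as a sketch (not yet tested over NFS), is to read each mpi_n-byte chunk with a single fread() instead of byte-by-byte fgetc(), so that every chunk becomes one contiguous request:

//Sketch: one fread() per mpi_n-byte chunk instead of per-byte fgetc()
fseek ( fp, ( unsigned int ) dec_number + start, SEEK_SET );
for ( i = 0; i < n * mpi_n; i += mpi_n ) {
    if ( fread ( &text[i], 1, mpi_n, fp ) != ( size_t ) mpi_n )
        break;                               /* short read: EOF or error */
    if ( n > mpi_n )
        fseek ( fp, n - mpi_n, SEEK_CUR );   /* skip to this rank's next chunk */
}
fclose ( fp );

Would such a change avoid the apparent re-reads, or is the strided access pattern itself the problem regardless of how each chunk is read?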