I ran the memory bandwidth test code published by NVIDIA and got some surprising results.
The program I used is here: https://developer.nvidia.com/content/how-optimize-data-transfers-cuda-cc
On my desktop (running macOS):
Device: GeForce GT 650M
Transfer size (MB): 16
Pageable transfers
Host to Device bandwidth (GB/s): 4.053219
Device to Host bandwidth (GB/s): 5.707841
Pinned transfers
Host to Device bandwidth (GB/s): 6.346621
Device to Host bandwidth (GB/s): 6.493052
On the Linux server:
Device: Tesla K20c
Transfer size (MB): 16
Pageable transfers
Host to Device bandwidth (GB/s): 1.482011
Device to Host bandwidth (GB/s): 1.621912
Pinned transfers
Host to Device bandwidth (GB/s): 1.480442
Device to Host bandwidth (GB/s): 1.667752
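
To put the two runs side by side, here is a small sketch that only does arithmetic on the figures printed above. The dictionary layout and the helper names (`pinned_speedup`, `device_ratio`) are my own, made up for this comparison; they are not part of the NVIDIA sample.

```python
# Bandwidth figures (GB/s) copied verbatim from the two runs above.
# The structure and function names are hypothetical, for illustration only.
results = {
    "GeForce GT 650M": {
        "pageable": {"h2d": 4.053219, "d2h": 5.707841},
        "pinned": {"h2d": 6.346621, "d2h": 6.493052},
    },
    "Tesla K20c": {
        "pageable": {"h2d": 1.482011, "d2h": 1.621912},
        "pinned": {"h2d": 1.480442, "d2h": 1.667752},
    },
}


def pinned_speedup(device, direction):
    """Ratio of pinned to pageable bandwidth for one device and direction."""
    run = results[device]
    return run["pinned"][direction] / run["pageable"][direction]


def device_ratio(device_a, device_b, kind, direction):
    """Ratio of device_a's bandwidth to device_b's for the same transfer type."""
    return results[device_a][kind][direction] / results[device_b][kind][direction]


if __name__ == "__main__":
    for device in results:
        for direction in ("h2d", "d2h"):
            ratio = pinned_speedup(device, direction)
            print(f"{device} {direction}: pinned/pageable = {ratio:.2f}")
    gap = device_ratio("GeForce GT 650M", "Tesla K20c", "pinned", "h2d")
    print(f"GT 650M vs K20c, pinned h2d: {gap:.2f}x")
```

On these numbers, pinned memory helps noticeably on the GT 650M but makes almost no difference on the K20c, and the K20c's pinned host-to-device rate is more than four times lower than the GT 650M's.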
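By the way, I don't have root access.
I don't understand why the bandwidth is lower on the Tesla device. Can anyone point out what the cause might be?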