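I'm working on a project that uses TensorRT with Python. Don't worry, this question probably doesn't involve any GPU computation. I think for this purpose we can simply treat TensorRT like NumPy, since both packages release the GIL when execution leaves Python. I've already done a lot of work to get the TensorRT part working, so hopefully this question has little to do with TensorRT itself.

First, multithreading does work with TensorRT + Python, because TensorRT releases the GIL while running its main execution function. But I still find cases where it doesn't help. The work still runs in parallel, but each thread carries a large overhead. It only pays off when the execution time is long enough for that overhead to be negligible.

Switching to C++ may well be the right choice, but I'd really like to optimize this further and stick with Python. I set up an experiment to see how much overhead there is when there is only one child thread, and I used yappi to profile it.

Here is some code. It isn't my real code, but it should be enough. I haven't shown the run_trt_execution function because I don't think it's necessary; if anyone feels they need to know more about TensorRT to answer my question, I'm happy to share what little I know.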

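import threading
import yappi

# Baseline: run the inference directly on the main thread.
# args is the tuple of positional arguments, unpacked to match Thread(args=args).
yappi.start()
run_trt_execution(*args)
yappi.stop()

# Same call, executed in a single child thread.
yappi.start()
cur_thread = threading.Thread(target=run_trt_execution, args=args)
cur_thread.start()
cur_thread.join()
yappi.stop()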
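
For reference, the same direct-versus-threaded comparison can be sketched without a profiler by timing many repetitions and comparing medians, which makes a single noisy run less misleading. This is only a minimal sketch: `fake_inference`, `time_direct`, and `time_threaded` are hypothetical names, and `fake_inference` is a small pure-Python loop standing in for run_trt_execution, so the absolute numbers will not match the TensorRT case.

```python
import statistics
import threading
import time


def fake_inference(n=20_000):
    """Hypothetical stand-in for run_trt_execution: a small CPU-bound loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_direct(fn, repeats=50):
    """Wall time of calling fn() on the current thread, once per repeat."""
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return samples


def time_threaded(fn, repeats=50):
    """Wall time of start() + join() for one child thread running fn()."""
    samples = []
    for _ in range(repeats):
        worker = threading.Thread(target=fn)
        t0 = time.perf_counter()
        worker.start()
        worker.join()
        samples.append(time.perf_counter() - t0)
    return samples


direct = time_direct(fake_inference)
threaded = time_threaded(fake_inference)
print(f"direct   median: {statistics.median(direct) * 1e3:.3f} ms")
print(f"threaded median: {statistics.median(threaded) * 1e3:.3f} ms")
```

The difference between the two medians is a rough estimate of the per-thread start and join cost for this stand-in workload.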

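Here are the profiles. yappi can record either CPU time or wall time, so I ran the experiment twice for each of the two setups, once per clock type. The logs below may be rather long: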

CPU multithread profile
name                                  ncall  tsub      ttot      tavg
..hon3.8/threading.py:859 Thread.run  1      0.000005  0.008511  0.008511
..tc_multithread.py:44 run_inference  1      0.006554  0.008506  0.008506
..da/cuda.py:244 DeviceArray.copy_to  1      0.000026  0.001400  0.001400
..rter.py:159 LazyModule.__getattr__  5      0.000014  0.001319  0.000264
..aphy/mod/importer.py:90 import_mod  5      0.000035  0.001301  0.000260
..r/logger.py:375 Logger.module_info  5      0.000024  0.000893  0.000179
..y:360 Logger._str_from_module_info  5      0.000018  0.000819  0.000164
..hy/logger/logger.py:363 try_append  15     0.000023  0.000802  0.000053
..aphy/logger/logger.py:372 <lambda>  5      0.000017  0.000753  0.000151
..ython3.8/posixpath.py:387 realpath  5      0.000020  0.000726  0.000145
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000013  0.000616  0.000308
..3.8/posixpath.py:396 _joinrealpath  5      0.000093  0.000564  0.000113
../cuda.py:368 DeviceArray.copy_from  1      0.000022  0.000521  0.000521
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000507  0.000512  0.000256
..tlib/__init__.py:109 import_module  5      0.000016  0.000326  0.000065
..rtlib._bootstrap>:1002 _gcd_import  5      0.000013  0.000307  0.000061
..lib._bootstrap>:986 _find_and_load  5      0.000045  0.000283  0.000057
..uda/cuda.py:352 DeviceArray.resize  1      0.000006  0.000281  0.000281
..lib/python3.8/posixpath.py:71 join  30     0.000115  0.000211  0.000007
../python3.8/posixpath.py:164 islink  30     0.000066  0.000209  0.000007
..python3.8/posixpath.py:372 abspath  5      0.000015  0.000140  0.000028
..ython3.8/posixpath.py:334 normpath  5      0.000057  0.000097  0.000019
..b._bootstrap>:157 _get_module_lock  10     0.000045  0.000096  0.000010
..bootstrap>:194 _lock_unlock_module  5      0.000015  0.000083  0.000017
..>:147 _ModuleLockManager.__enter__  5      0.000013  0.000082  0.000016
..._bootstrap>:1017 _handle_fromlist  15     0.000048  0.000074  0.000005
..python3.8/posixpath.py:41 _get_sep  40     0.000043  0.000064  0.000002
..ib/python3.8/posixpath.py:60 isabs  10     0.000026  0.000053  0.000005
..hy/logger/logger.py:207 Logger.log  5      0.000018  0.000049  0.000010
<frozen importlib._bootstrap>:176 cb  10     0.000027  0.000043  0.000004
..bootstrap>:58 _ModuleLock.__init__  10     0.000022  0.000033  0.000003
.._bootstrap>:78 _ModuleLock.acquire  10     0.000024  0.000029  0.000003
..bootstrap>:103 _ModuleLock.release  10     0.000020  0.000025  0.000003
..p>:151 _ModuleLockManager.__exit__  5      0.000008  0.000022  0.000004
..aphy/logger/logger.py:370 <lambda>  5      0.000008  0.000016  0.000003
..tlib._bootstrap>:937 _sanity_check  5      0.000008  0.000011  0.000002
..aphy/logger/logger.py:371 <lambda>  5      0.000006  0.000010  0.000002
../_internal.py:250 _ctypes.__init__  2      0.000005  0.000005  0.000002
..p>:143 _ModuleLockManager.__init__  5      0.000004  0.000004  0.000001
..polygraphy/util/util.py:499 volume  3      0.000004  0.000004  0.000001
..hy/logger/logger.py:276 should_log  5      0.000004  0.000004  0.000001
..0 DeviceArray._check_dtype_matches  2      0.000003  0.000003  0.000002
..olygraphy/cuda/cuda.py:24 void_ptr  4      0.000003  0.000003  0.000001
..ygraphy/cuda/cuda.py:59 Cuda.check  2      0.000002  0.000002  0.000001
..core/_internal.py:304 _ctypes.data  2      0.000002  0.000002  0.000001
..olygraphy/cuda/cuda.py:149 wrapper  2      0.000001  0.000001  0.000001
../cuda.py:203 try_get_stream_handle  2      0.000001  0.000001  0.000001
Function stats for (_MainThread) (0)
Clock type: CPU
Ordered by: totaltime, desc
name                                  ncall  tsub      ttot      tavg
..hon3.8/threading.py:859 Thread.run  1      0.000005  0.008511  0.008511
..tc_multithread.py:44 run_inference  1      0.006554  0.008506  0.008506
..da/cuda.py:244 DeviceArray.copy_to  1      0.000026  0.001400  0.001400
..rter.py:159 LazyModule.__getattr__  5      0.000014  0.001319  0.000264
..aphy/mod/importer.py:90 import_mod  5      0.000035  0.001301  0.000260
..r/logger.py:375 Logger.module_info  5      0.000024  0.000893  0.000179
..y:360 Logger._str_from_module_info  5      0.000018  0.000819  0.000164
..hy/logger/logger.py:363 try_append  15     0.000023  0.000802  0.000053
..aphy/logger/logger.py:372 <lambda>  5      0.000017  0.000753  0.000151
..ython3.8/posixpath.py:387 realpath  5      0.000020  0.000726  0.000145
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000013  0.000616  0.000308
..3.8/posixpath.py:396 _joinrealpath  5      0.000093  0.000564  0.000113
../cuda.py:368 DeviceArray.copy_from  1      0.000022  0.000521  0.000521
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000507  0.000512  0.000256
..tlib/__init__.py:109 import_module  5      0.000016  0.000326  0.000065
..rtlib._bootstrap>:1002 _gcd_import  5      0.000013  0.000307  0.000061
..lib._bootstrap>:986 _find_and_load  5      0.000045  0.000283  0.000057
..uda/cuda.py:352 DeviceArray.resize  1      0.000006  0.000281  0.000281
..lib/python3.8/posixpath.py:71 join  30     0.000115  0.000211  0.000007
../python3.8/posixpath.py:164 islink  30     0.000066  0.000209  0.000007
..python3.8/posixpath.py:372 abspath  5      0.000015  0.000140  0.000028
..n3.8/threading.py:834 Thread.start  1      0.000011  0.000099  0.000099
..ython3.8/posixpath.py:334 normpath  5      0.000057  0.000097  0.000019
..b._bootstrap>:157 _get_module_lock  10     0.000045  0.000096  0.000010
..bootstrap>:194 _lock_unlock_module  5      0.000015  0.000083  0.000017
..>:147 _ModuleLockManager.__enter__  5      0.000013  0.000082  0.000016
..._bootstrap>:1017 _handle_fromlist  15     0.000048  0.000074  0.000005
..python3.8/posixpath.py:41 _get_sep  40     0.000043  0.000064  0.000002
..on3.8/threading.py:979 Thread.join  1      0.000006  0.000054  0.000054
..hon3.8/threading.py:540 Event.wait  1      0.000008  0.000053  0.000053
..ib/python3.8/posixpath.py:60 isabs  10     0.000026  0.000053  0.000005
..hy/logger/logger.py:207 Logger.log  5      0.000018  0.000049  0.000010
..:1017 Thread._wait_for_tstate_lock  1      0.000012  0.000046  0.000046
<frozen importlib._bootstrap>:176 cb  10     0.000027  0.000043  0.000004
...8/threading.py:270 Condition.wait  1      0.000012  0.000038  0.000038
..8/threading.py:761 Thread.__init__  1      0.000015  0.000036  0.000036
..bootstrap>:58 _ModuleLock.__init__  10     0.000022  0.000033  0.000003
.._bootstrap>:78 _ModuleLock.acquire  10     0.000024  0.000029  0.000003
..bootstrap>:103 _ModuleLock.release  10     0.000020  0.000025  0.000003
..p>:151 _ModuleLockManager.__exit__  5      0.000008  0.000022  0.000004
..n3.8/threading.py:944 Thread._stop  1      0.000016  0.000021  0.000021
..aphy/logger/logger.py:370 <lambda>  5      0.000008  0.000016  0.000003
..tlib._bootstrap>:937 _sanity_check  5      0.000008  0.000011  0.000002
..aphy/logger/logger.py:371 <lambda>  5      0.000006  0.000010  0.000002
...8/threading.py:505 Event.__init__  1      0.000004  0.000009  0.000009
..ing.py:255 Condition._release_save  1      0.000007  0.000008  0.000008
../_internal.py:250 _ctypes.__init__  2      0.000005  0.000005  0.000002
..n3.8/_weakrefset.py:81 WeakSet.add  1      0.000004  0.000005  0.000005
..8/threading.py:1306 current_thread  2      0.000004  0.000005  0.000002
..reading.py:246 Condition.__enter__  1      0.000003  0.000004  0.000004
..p>:143 _ModuleLockManager.__init__  5      0.000004  0.000004  0.000001
..polygraphy/util/util.py:499 volume  3      0.000004  0.000004  0.000001
..hreading.py:222 Condition.__init__  1      0.000004  0.000004  0.000004
..hy/logger/logger.py:276 should_log  5      0.000004  0.000004  0.000001
..0 DeviceArray._check_dtype_matches  2      0.000003  0.000003  0.000002
..reading.py:1095 _MainThread.daemon  2      0.000003  0.000003  0.000001
..hreading.py:249 Condition.__exit__  1      0.000002  0.000003  0.000003
..ython3.8/_weakrefset.py:38 _remove  1      0.000002  0.000003  0.000003
..olygraphy/cuda/cuda.py:24 void_ptr  4      0.000003  0.000003  0.000001
..reading.py:261 Condition._is_owned  1      0.000002  0.000003  0.000003
..ython3.8/threading.py:734 _newname  1      0.000003  0.000003  0.000003
...py:258 Condition._acquire_restore  1      0.000001  0.000002  0.000002
..ygraphy/cuda/cuda.py:59 Cuda.check  2      0.000002  0.000002  0.000001
..core/_internal.py:304 _ctypes.data  2      0.000002  0.000002  0.000001
..ng.py:1177 _make_invoke_excepthook  1      0.000001  0.000001  0.000001
..olygraphy/cuda/cuda.py:149 wrapper  2      0.000001  0.000001  0.000001
../cuda.py:203 try_get_stream_handle  2      0.000001  0.000001  0.000001
..n3.8/threading.py:513 Event.is_set  2      0.000001  0.000001  0.000000
Wall multithread profile
Clock type: WALL
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..hon3.8/threading.py:859 Thread.run  1      0.000005  0.009640  0.009640
..tc_multithread.py:44 run_inference  1      0.008012  0.009635  0.009635
..on3.8/threading.py:979 Thread.join  1      0.000004  0.009415  0.009415
..:1017 Thread._wait_for_tstate_lock  1      0.000009  0.009408  0.009408
..rter.py:159 LazyModule.__getattr__  5      0.000011  0.000975  0.000195
..aphy/mod/importer.py:90 import_mod  5      0.000030  0.000963  0.000193
..da/cuda.py:244 DeviceArray.copy_to  1      0.000022  0.000955  0.000955
..r/logger.py:375 Logger.module_info  5      0.000022  0.000681  0.000136
../cuda.py:368 DeviceArray.copy_from  1      0.000032  0.000628  0.000628
..y:360 Logger._str_from_module_info  5      0.000016  0.000623  0.000125
..hy/logger/logger.py:363 try_append  15     0.000010  0.000607  0.000040
..aphy/logger/logger.py:372 <lambda>  5      0.000016  0.000577  0.000115
..ython3.8/posixpath.py:387 realpath  5      0.000013  0.000550  0.000110
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000513  0.000518  0.000259
..3.8/posixpath.py:396 _joinrealpath  5      0.000062  0.000447  0.000089
..uda/cuda.py:352 DeviceArray.resize  1      0.000008  0.000365  0.000365
..n3.8/threading.py:834 Thread.start  1      0.000015  0.000328  0.000328
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000009  0.000319  0.000160
..hon3.8/threading.py:540 Event.wait  1      0.000009  0.000277  0.000277
...8/threading.py:270 Condition.wait  1      0.000016  0.000260  0.000260
../python3.8/posixpath.py:164 islink  30     0.000038  0.000227  0.000008
..tlib/__init__.py:109 import_module  5      0.000012  0.000221  0.000044
..rtlib._bootstrap>:1002 _gcd_import  5      0.000010  0.000207  0.000041
..lib._bootstrap>:986 _find_and_load  5      0.000033  0.000191  0.000038
..lib/python3.8/posixpath.py:71 join  30     0.000074  0.000126  0.000004
..python3.8/posixpath.py:372 abspath  5      0.000008  0.000087  0.000017
..b._bootstrap>:157 _get_module_lock  10     0.000040  0.000067  0.000007
..>:147 _ModuleLockManager.__enter__  5      0.000009  0.000065  0.000013
..ython3.8/posixpath.py:334 normpath  5      0.000042  0.000062  0.000012
..bootstrap>:194 _lock_unlock_module  5      0.000009  0.000049  0.000010
..._bootstrap>:1017 _handle_fromlist  15     0.000034  0.000047  0.000003
..8/threading.py:761 Thread.__init__  1      0.000018  0.000042  0.000042
..python3.8/posixpath.py:41 _get_sep  40     0.000027  0.000037  0.000001
..hy/logger/logger.py:207 Logger.log  5      0.000016  0.000036  0.000007
..ib/python3.8/posixpath.py:60 isabs  10     0.000018  0.000030  0.000003
<frozen importlib._bootstrap>:176 cb  10     0.000018  0.000025  0.000002
.._bootstrap>:78 _ModuleLock.acquire  10     0.000018  0.000023  0.000002
..bootstrap>:58 _ModuleLock.__init__  10     0.000017  0.000022  0.000002
..bootstrap>:103 _ModuleLock.release  10     0.000014  0.000016  0.000002
..p>:151 _ModuleLockManager.__exit__  5      0.000006  0.000016  0.000003
..aphy/logger/logger.py:370 <lambda>  5      0.000006  0.000013  0.000003
...8/threading.py:505 Event.__init__  1      0.000006  0.000011  0.000011
..n3.8/threading.py:944 Thread._stop  1      0.000006  0.000008  0.000008
..aphy/logger/logger.py:371 <lambda>  5      0.000006  0.000007  0.000001
../_internal.py:250 _ctypes.__init__  2      0.000006  0.000006  0.000003
..tlib._bootstrap>:937 _sanity_check  5      0.000005  0.000006  0.000001
..n3.8/_weakrefset.py:81 WeakSet.add  1      0.000005  0.000005  0.000005
..hreading.py:222 Condition.__init__  1      0.000005  0.000005  0.000005
..reading.py:246 Condition.__enter__  1      0.000004  0.000005  0.000005
..8/threading.py:1306 current_thread  2      0.000004  0.000005  0.000002
..hy/logger/logger.py:276 should_log  5      0.000004  0.000004  0.000001
..ython3.8/_weakrefset.py:38 _remove  1      0.000004  0.000004  0.000004
..olygraphy/cuda/cuda.py:24 void_ptr  4      0.000003  0.000003  0.000001
..polygraphy/util/util.py:499 volume  3      0.000003  0.000003  0.000001
..0 DeviceArray._check_dtype_matches  2      0.000003  0.000003  0.000002
..p>:143 _ModuleLockManager.__init__  5      0.000003  0.000003  0.000001
..ing.py:255 Condition._release_save  1      0.000003  0.000003  0.000003
..reading.py:261 Condition._is_owned  1      0.000002  0.000003  0.000003
..hreading.py:249 Condition.__exit__  1      0.000003  0.000003  0.000003
..ython3.8/threading.py:734 _newname  1      0.000003  0.000003  0.000003
..ygraphy/cuda/cuda.py:59 Cuda.check  2      0.000002  0.000002  0.000001
...py:258 Condition._acquire_restore  1      0.000001  0.000002  0.000002
..reading.py:1095 _MainThread.daemon  2      0.000002  0.000002  0.000001
..core/_internal.py:304 _ctypes.data  2      0.000001  0.000001  0.000000
..olygraphy/cuda/cuda.py:149 wrapper  2      0.000001  0.000001  0.000000
..n3.8/threading.py:513 Event.is_set  2      0.000001  0.000001  0.000000
..ng.py:1177 _make_invoke_excepthook  1      0.000001  0.000001  0.000001
../cuda.py:203 try_get_stream_handle  2      0.000000  0.000000  0.000000
Function stats for (Thread) (1)

Clock type: WALL
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..hon3.8/threading.py:859 Thread.run  1      0.000005  0.009640  0.009640
..tc_multithread.py:44 run_inference  1      0.008012  0.009635  0.009635
..rter.py:159 LazyModule.__getattr__  5      0.000011  0.000975  0.000195
..aphy/mod/importer.py:90 import_mod  5      0.000030  0.000963  0.000193
..da/cuda.py:244 DeviceArray.copy_to  1      0.000022  0.000955  0.000955
..r/logger.py:375 Logger.module_info  5      0.000022  0.000681  0.000136
../cuda.py:368 DeviceArray.copy_from  1      0.000032  0.000628  0.000628
..y:360 Logger._str_from_module_info  5      0.000016  0.000623  0.000125
..hy/logger/logger.py:363 try_append  15     0.000010  0.000607  0.000040
..aphy/logger/logger.py:372 <lambda>  5      0.000016  0.000577  0.000115
..ython3.8/posixpath.py:387 realpath  5      0.000013  0.000550  0.000110
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000513  0.000518  0.000259
..3.8/posixpath.py:396 _joinrealpath  5      0.000062  0.000447  0.000089
..uda/cuda.py:352 DeviceArray.resize  1      0.000008  0.000365  0.000365
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000009  0.000319  0.000160
../python3.8/posixpath.py:164 islink  30     0.000038  0.000227  0.000008
..tlib/__init__.py:109 import_module  5      0.000012  0.000221  0.000044
..rtlib._bootstrap>:1002 _gcd_import  5      0.000010  0.000207  0.000041
..lib._bootstrap>:986 _find_and_load  5      0.000033  0.000191  0.000038
..lib/python3.8/posixpath.py:71 join  30     0.000074  0.000126  0.000004
..python3.8/posixpath.py:372 abspath  5      0.000008  0.000087  0.000017
..b._bootstrap>:157 _get_module_lock  10     0.000040  0.000067  0.000007
..>:147 _ModuleLockManager.__enter__  5      0.000009  0.000065  0.000013
..ython3.8/posixpath.py:334 normpath  5      0.000042  0.000062  0.000012
..bootstrap>:194 _lock_unlock_module  5      0.000009  0.000049  0.000010
..._bootstrap>:1017 _handle_fromlist  15     0.000034  0.000047  0.000003
..python3.8/posixpath.py:41 _get_sep  40     0.000027  0.000037  0.000001
..hy/logger/logger.py:207 Logger.log  5      0.000016  0.000036  0.000007
..ib/python3.8/posixpath.py:60 isabs  10     0.000018  0.000030  0.000003
<frozen importlib._bootstrap>:176 cb  10     0.000018  0.000025  0.000002
.._bootstrap>:78 _ModuleLock.acquire  10     0.000018  0.000023  0.000002
..bootstrap>:58 _ModuleLock.__init__  10     0.000017  0.000022  0.000002
..bootstrap>:103 _ModuleLock.release  10     0.000014  0.000016  0.000002
..p>:151 _ModuleLockManager.__exit__  5      0.000006  0.000016  0.000003
..aphy/logger/logger.py:370 <lambda>  5      0.000006  0.000013  0.000003
..aphy/logger/logger.py:371 <lambda>  5      0.000006  0.000007  0.000001
../_internal.py:250 _ctypes.__init__  2      0.000006  0.000006  0.000003
..tlib._bootstrap>:937 _sanity_check  5      0.000005  0.000006  0.000001
..hy/logger/logger.py:276 should_log  5      0.000004  0.000004  0.000001
..olygraphy/cuda/cuda.py:24 void_ptr  4      0.000003  0.000003  0.000001
..polygraphy/util/util.py:499 volume  3      0.000003  0.000003  0.000001
..0 DeviceArray._check_dtype_matches  2      0.000003  0.000003  0.000002
..p>:143 _ModuleLockManager.__init__  5      0.000003  0.000003  0.000001
..ygraphy/cuda/cuda.py:59 Cuda.check  2      0.000002  0.000002  0.000001
..core/_internal.py:304 _ctypes.data  2      0.000001  0.000001  0.000000
..olygraphy/cuda/cuda.py:149 wrapper  2      0.000001  0.000001  0.000000
../cuda.py:203 try_get_stream_handle  2      0.000000  0.000000  0.000000

One thread, CPU time
name                                  ncall  tsub      ttot      tavg      
..tc_multithread.py:44 run_inference  1      0.005786  0.007297  0.007297
..da/cuda.py:244 DeviceArray.copy_to  1      0.000016  0.001010  0.001010
..rter.py:159 LazyModule.__getattr__  5      0.000010  0.000940  0.000188
..aphy/mod/importer.py:90 import_mod  5      0.000025  0.000928  0.000186
..r/logger.py:375 Logger.module_info  5      0.000018  0.000621  0.000124
..y:360 Logger._str_from_module_info  5      0.000014  0.000568  0.000114
..hy/logger/logger.py:363 try_append  15     0.000018  0.000555  0.000037
..aphy/logger/logger.py:372 <lambda>  5      0.000013  0.000517  0.000103
..ython3.8/posixpath.py:387 realpath  5      0.000014  0.000497  0.000099
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000475  0.000479  0.000239
../cuda.py:368 DeviceArray.copy_from  1      0.000021  0.000474  0.000474
..3.8/posixpath.py:396 _joinrealpath  5      0.000068  0.000383  0.000077
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000007  0.000377  0.000188
..tlib/__init__.py:109 import_module  5      0.000012  0.000250  0.000050
..uda/cuda.py:352 DeviceArray.resize  1      0.000006  0.000247  0.000247
..rtlib._bootstrap>:1002 _gcd_import  5      0.000016  0.000236  0.000047
..lib._bootstrap>:986 _find_and_load  5      0.000035  0.000213  0.000043
..lib/python3.8/posixpath.py:71 join  30     0.000081  0.000150  0.000005
../python3.8/posixpath.py:164 islink  30     0.000048  0.000129  0.000004
..python3.8/posixpath.py:372 abspath  5      0.000011  0.000097  0.000019
..ython3.8/posixpath.py:334 normpath  5      0.000040  0.000066  0.000013
..>:147 _ModuleLockManager.__enter__  5      0.000012  0.000066  0.000013
..b._bootstrap>:157 _get_module_lock  10     0.000033  0.000065  0.000006
..bootstrap>:194 _lock_unlock_module  5      0.000011  0.000057  0.000011
..._bootstrap>:1017 _handle_fromlist  15     0.000032  0.000050  0.000003
..python3.8/posixpath.py:41 _get_sep  40     0.000031  0.000046  0.000001
..ib/python3.8/posixpath.py:60 isabs  10     0.000019  0.000038  0.000004
..hy/logger/logger.py:207 Logger.log  5      0.000013  0.000035  0.000007
<frozen importlib._bootstrap>:176 cb  10     0.000021  0.000033  0.000003
.._bootstrap>:78 _ModuleLock.acquire  10     0.000022  0.000026  0.000003
..bootstrap>:58 _ModuleLock.__init__  10     0.000016  0.000024  0.000002
..bootstrap>:103 _ModuleLock.release  10     0.000016  0.000020  0.000002
..p>:151 _ModuleLockManager.__exit__  5      0.000007  0.000018  0.000004
..aphy/logger/logger.py:370 <lambda>  5      0.000007  0.000012  0.000002
..uda/cuda.py:196 Stream.synchronize  1      0.000003  0.000008  0.000008
..aphy/logger/logger.py:371 <lambda>  5      0.000005  0.000008  0.000002
..tlib._bootstrap>:937 _sanity_check  5      0.000005  0.000007  0.000001
..cuda.py:76 Cuda.stream_synchronize  1      0.000004  0.000005  0.000005
../_internal.py:250 _ctypes.__init__  2      0.000004  0.000004  0.000002
..p>:143 _ModuleLockManager.__init__  5      0.000003  0.000003  0.000001
..olygraphy/cuda/cuda.py:24 void_ptr  5      0.000003  0.000003  0.000001
..hy/logger/logger.py:276 should_log  5      0.000003  0.000003  0.000001
..polygraphy/util/util.py:499 volume  3      0.000002  0.000002  0.000001
..0 DeviceArray._check_dtype_matches  2      0.000002  0.000002  0.000001
..ygraphy/cuda/cuda.py:59 Cuda.check  3      0.000002  0.000002  0.000001
..olygraphy/cuda/cuda.py:149 wrapper  3      0.000001  0.000001  0.000000
..core/_internal.py:304 _ctypes.data  2      0.000001  0.000001  0.000001
../cuda.py:203 try_get_stream_handle  2      0.000001  0.000001  0.000000

One thread, wall time
Clock type: WALL
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..tc_multithread.py:44 run_inference  1      0.005838  0.006989  0.006989
..da/cuda.py:244 DeviceArray.copy_to  1      0.000013  0.000788  0.000788
..rter.py:159 LazyModule.__getattr__  5      0.000005  0.000601  0.000120
..aphy/mod/importer.py:90 import_mod  5      0.000020  0.000593  0.000119
..raphy/cuda/cuda.py:118 Cuda.memcpy  2      0.000470  0.000472  0.000236
..r/logger.py:375 Logger.module_info  5      0.000013  0.000389  0.000078
..y:360 Logger._str_from_module_info  5      0.000010  0.000355  0.000071
..hy/logger/logger.py:363 try_append  15     0.000008  0.000345  0.000023
../cuda.py:368 DeviceArray.copy_from  1      0.000022  0.000340  0.000340
..aphy/logger/logger.py:372 <lambda>  5      0.000010  0.000325  0.000065
..ython3.8/posixpath.py:387 realpath  5      0.000011  0.000309  0.000062
..3.8/posixpath.py:396 _joinrealpath  5      0.000043  0.000244  0.000049
..uda/cuda.py:237 DeviceArray.nbytes  2      0.000006  0.000227  0.000113
..uda/cuda.py:352 DeviceArray.resize  1      0.000006  0.000184  0.000184
..tlib/__init__.py:109 import_module  5      0.000006  0.000167  0.000033
..rtlib._bootstrap>:1002 _gcd_import  5      0.000010  0.000160  0.000032
..lib._bootstrap>:986 _find_and_load  5      0.000039  0.000149  0.000030
../python3.8/posixpath.py:164 islink  30     0.000027  0.000097  0.000003
..lib/python3.8/posixpath.py:71 join  30     0.000050  0.000078  0.000003
..python3.8/posixpath.py:372 abspath  5      0.000004  0.000052  0.000010
..>:147 _ModuleLockManager.__enter__  5      0.000008  0.000045  0.000009
..b._bootstrap>:157 _get_module_lock  10     0.000023  0.000042  0.000004
..ython3.8/posixpath.py:334 normpath  5      0.000027  0.000037  0.000007
..bootstrap>:194 _lock_unlock_module  5      0.000006  0.000033  0.000007
..._bootstrap>:1017 _handle_fromlist  15     0.000018  0.000027  0.000002
..ib/python3.8/posixpath.py:60 isabs  10     0.000012  0.000021  0.000002
..python3.8/posixpath.py:41 _get_sep  40     0.000016  0.000021  0.000001
..hy/logger/logger.py:207 Logger.log  5      0.000009  0.000021  0.000004
.._bootstrap>:78 _ModuleLock.acquire  10     0.000015  0.000017  0.000002
..bootstrap>:58 _ModuleLock.__init__  10     0.000012  0.000016  0.000002
<frozen importlib._bootstrap>:176 cb  10     0.000009  0.000015  0.000001
..bootstrap>:103 _ModuleLock.release  10     0.000010  0.000013  0.000001
..p>:151 _ModuleLockManager.__exit__  5      0.000004  0.000012  0.000002
..aphy/logger/logger.py:370 <lambda>  5      0.000002  0.000007  0.000001
..aphy/logger/logger.py:371 <lambda>  5      0.000004  0.000005  0.000001
..uda/cuda.py:196 Stream.synchronize  1      0.000001  0.000005  0.000005
..cuda.py:76 Cuda.stream_synchronize  1      0.000004  0.000004  0.000004
..p>:143 _ModuleLockManager.__init__  5      0.000003  0.000003  0.000001
../_internal.py:250 _ctypes.__init__  2      0.000002  0.000002  0.000001
..olygraphy/cuda/cuda.py:24 void_ptr  5      0.000002  0.000002  0.000000
..hy/logger/logger.py:276 should_log  5      0.000002  0.000002  0.000000
..polygraphy/util/util.py:499 volume  3      0.000001  0.000001  0.000000
..0 DeviceArray._check_dtype_matches  2      0.000001  0.000001  0.000000
..tlib._bootstrap>:937 _sanity_check  5      0.000001  0.000001  0.000000
..ygraphy/cuda/cuda.py:59 Cuda.check  3      0.000000  0.000000  0.000000
..core/_internal.py:304 _ctypes.data  2      0.000000  0.000000  0.000000
../cuda.py:203 try_get_stream_handle  2      0.000000  0.000000  0.000000
..olygraphy/cuda/cuda.py:149 wrapper  3      0.000000  0.000000  0.000000
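
When comparing the two clock types, note that a blocked call accrues wall time but almost no CPU time. That is why Thread.join shows a ttot of 0.009415 in the wall profile but only 0.000054 in the CPU profile. The sketch below illustrates that difference with the standard library only; `clocks_for` and `wait_for_child` are hypothetical helper names, the child thread just sleeps instead of running inference, and `time.thread_time` assumes Python 3.7 or later.

```python
import threading
import time


def clocks_for(fn):
    """Return (wall_seconds, cpu_seconds) spent by the calling thread in fn()."""
    wall0 = time.perf_counter()
    cpu0 = time.thread_time()  # CPU time of the current thread only
    fn()
    return time.perf_counter() - wall0, time.thread_time() - cpu0


def wait_for_child():
    """Start a child thread that sleeps, then block in join() like the experiment."""
    child = threading.Thread(target=time.sleep, args=(0.05,))
    child.start()
    child.join()


wall, cpu = clocks_for(wait_for_child)
print(f"wall: {wall:.4f} s, cpu: {cpu:.4f} s")
```

The main thread's wall time covers the whole wait, while its CPU time stays close to zero because it is blocked inside join().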

You can see that everything is slower in the threaded version. I just want to know why that is.
