pytorch - PyTorch total CUDA time

Question

Autograd Profiler is a handy tool for measuring execution time in PyTorch, as shown below:

import torch
import torchvision.models as models

model = models.densenet121(pretrained=True)
x = torch.randn((1, 3, 224, 224), requires_grad=True)

with torch.autograd.profiler.profile(use_cuda=True) as prof:
    model(x)
print(prof)

The output looks like this:

-----------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
Name                                        CPU time        CUDA time            Calls        CPU total       CUDA total
-----------------------------------  ---------------  ---------------  ---------------  ---------------  ---------------
conv2d                                    9976.544us       9972.736us                1       9976.544us       9972.736us
convolution                               9958.778us       9958.400us                1       9958.778us       9958.400us
_convolution                              9946.712us       9947.136us                1       9946.712us       9947.136us
contiguous                                   6.692us          6.976us                1          6.692us          6.976us
empty                                       11.927us         12.032us                1         11.927us         12.032us

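The full output contains many more rows. My questions are:

1) How can I use the autograd profiler to get the total CUDA time? (i.e., the sum of the CUDA time column)

2) Is there a way to access it programmatically? For example, prof[0].CUDA_Time?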

score 2 · Accepted Answer

[item.cuda_time for item in prof.function_events]

will give you a list of CUDA times. Adapt it to your needs. For example, to get the sum of the CUDA times:

sum([item.cuda_time for item in prof.function_events])

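Note, however, that the times in the list are plain numbers in microseconds, whereas print(prof) shows formatted values with the unit attached.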
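As a minimal sketch of the summation above that runs without a GPU, the snippet below sums a `cuda_time` attribute over a list of stand-in event objects. The helper names `total_time_us` and `us_to_ms` and the `SimpleNamespace` events are hypothetical and only for illustration; the numbers are copied from the table in the question.

```python
from types import SimpleNamespace


def total_time_us(events, attr="cuda_time"):
    """Sum a per-event time attribute (assumed to be in microseconds)."""
    return sum(getattr(ev, attr) for ev in events)


def us_to_ms(t_us):
    """Convert microseconds to milliseconds."""
    return t_us / 1000.0


# Stand-in events (hypothetical objects) using the CUDA times from the table.
events = [
    SimpleNamespace(name="conv2d", cuda_time=9972.736),
    SimpleNamespace(name="convolution", cuda_time=9958.400),
    SimpleNamespace(name="_convolution", cuda_time=9947.136),
    SimpleNamespace(name="contiguous", cuda_time=6.976),
    SimpleNamespace(name="empty", cuda_time=12.032),
]

total_us = total_time_us(events)
print(f"{total_us:.3f} us = {us_to_ms(total_us):.3f} ms")
```

With a real profiler run, the same helper could be called as `total_time_us(prof.function_events)`. Keep in mind that the rows in the table appear to be nested (conv2d, convolution and _convolution report nearly the same time), so a plain sum of the column probably counts the same kernel time more than once.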
