c++ - CUDA - determine number of banks in shared memory

Question

Shared memory is "striped" into banks. This leads to the whole issue of bank conflicts, as we all know.

Question: But how can you determine how many banks ("stripes") exist in shared memory?

(Poking around NVIDIA "devtalk" forums, it seems that per-block shared memory is "striped" into 16 banks. But how do we know this? The threads suggesting this are a few years old. Have things changed? Is it fixed on all NVIDIA CUDA-capable cards? Is there a way to determine this from the runtime API (I don't see it there, e.g. under cudaDeviceProp)? Is there a manual way to determine it at runtime?)

score 10 · Accepted Answer

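As @RobertHarvey says, it is documented. The programming guide states that compute capability 1.x devices have 16 banks, and compute capability 2.x and 3.x devices have 32 banks. So you can make whatever decision you need based on the compute capability (major version) returned in the device properties.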

A general link to the CUDA online documentation is included in the info page of the cuda tag.
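As a minimal sketch of the rule above, the bank count can be derived from the major compute capability with a small lookup. The helper name `sharedMemoryBankCount` is hypothetical and not part of the CUDA runtime API. The values for 1.x, 2.x and 3.x come from the answer; treating majors above 3 as 32 banks is an assumption here and should be checked against the programming guide for the device in question.

```cpp
#include <cassert>

// Hypothetical helper (not a CUDA API call): maps the major compute
// capability of a device to its number of shared memory banks.
inline int sharedMemoryBankCount(int ccMajor) {
    // Compute capability versions start at 1.
    assert(ccMajor >= 1);

    // Compute capability 1.x: 16 banks (per the programming guide).
    if (ccMajor == 1) {
        return 16;
    }

    // Compute capability 2.x and 3.x: 32 banks (per the programming guide).
    // Assumption: majors above 3 are also treated as 32 banks here.
    return 32;
}
```

On a machine with the CUDA toolkit, `ccMajor` would typically be the `major` field of a `cudaDeviceProp` filled in by `cudaGetDeviceProperties`, so the call would look roughly like `sharedMemoryBankCount(prop.major)`.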
