Bank conflicts occur when shared memory is accessed during loads and stores. When I use code like the following:
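__global__ void
bank_conf(const int* dev_a, int size) {
    // Dynamically sized shared memory; the byte count is supplied at kernel launch.
    extern __shared__ int cache[];
    int tidx = blockIdx.x * blockDim.x + threadIdx.x;
    // Stride-2 indexing: consecutive threads touch every other 4-byte word.
    cache[tidx * 2] += dev_a[tidx];
}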
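For context, here is a minimal host-side sketch of why indexing shared memory with `tidx * 2` would be expected to cause a 2-way bank conflict. It assumes 32 banks of 4-byte words and a warp of 32 threads, which matches current NVIDIA GPUs as far as I know but is hard-coded here rather than queried from the device. The helper name `max_conflict_degree` is hypothetical and exists only for this illustration.

```cpp
#include <algorithm>
#include <array>

// Assumption: 32 banks, each 4 bytes wide, so the 4-byte word at index i
// lives in bank (i % 32). These constants are illustrative only.
constexpr int kNumBanks = 32;
constexpr int kWarpSize = 32;

// Hypothetical helper: for one warp in which lane k accesses cache[k * stride],
// return the largest number of lanes that land in the same bank.
// 1 means conflict-free, 2 means a 2-way conflict, and so on.
// Assumes stride >= 1, so every lane touches a different word
// (lanes reading the same word would be a broadcast, not a conflict).
int max_conflict_degree(int stride) {
    std::array<int, kNumBanks> hits{};  // zero-initialized per-bank counters
    for (int lane = 0; lane < kWarpSize; ++lane) {
        int word = lane * stride;       // int index into shared memory
        ++hits[word % kNumBanks];
    }
    return *std::max_element(hits.begin(), hits.end());
}
```

Under these assumptions, `max_conflict_degree(2)` evaluates to 2 (lanes k and k + 16 share a bank), while `max_conflict_degree(1)` evaluates to 1, meaning unit-stride access is conflict-free.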
The results for some of the metrics I collected from my program are shown below:
Section: Command line profiler metrics
---------------------------------------------------------------------- --------------- ------------------------------
l1tex__data_bank_conflicts_pipe_lsu_mem_global.avg                                     0
l1tex__data_bank_conflicts_pipe_lsu_mem_global.max                                     0
l1tex__data_bank_conflicts_pipe_lsu_mem_global.min                                     0
l1tex__data_bank_conflicts_pipe_lsu_mem_global.sum                                     0
l1tex__data_bank_conflicts_pipe_lsu_mem_global_op_st.avg                               0
l1tex__data_bank_conflicts_pipe_lsu_mem_global_op_st.max                               0
l1tex__data_bank_conflicts_pipe_lsu_mem_global_op_st.min                               0
l1tex__data_bank_conflicts_pipe_lsu_mem_global_op_st.sum                               0
l1tex__data_bank_conflicts_pipe_lsu_mem_shared.avg                                     0.05
l1tex__data_bank_conflicts_pipe_lsu_mem_shared.max                                     2
l1tex__data_bank_conflicts_pipe_lsu_mem_shared.min                                     0
l1tex__data_bank_conflicts_pipe_lsu_mem_shared.sum                                     2
---------------------------------------------------------------------- --------------- ------------------------------
Can anyone help explain why metrics such as l1tex__data_bank_conflicts_pipe_lsu_mem_global and l1tex__data_bank_conflicts_pipe_lsu_mem_global_op_st do not seem to work (they all report 0), and how metrics like these should be interpreted?