I want to write the following CUDA function:
__device__ void foo(int* a, size_t n)
{
    if ( /* MAGIC 1 */ ) {
        // a is known to be in shared memory,
        // so use it directly
    }
    else {
        // make a copy of a in shared memory
        // and use the copy
    }
}
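For context, the else branch would be something along these lines. This is only a sketch, and it assumes (on my part) that the whole block reaches foo() together and that n is bounded by the size of a fixed shared buffer:

    // Hypothetical helper that consumes the data; not part of the question.
    __device__ void use(int* a, size_t n);

    __device__ void foo_else_branch(int* a, size_t n)
    {
        __shared__ int copy[1024];                      // assumed upper bound on n
        for (size_t i = threadIdx.x; i < n; i += blockDim.x)
            copy[i] = a[i];                             // block-cooperative copy
        __syncthreads();                                // make the copy visible to all threads
        use(copy, n);
    }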
On the host side, we have a slightly related facility, cudaPointerGetAttributes, which can tell us whether a pointer refers to device memory or to host memory. Perhaps there is some way to make a similar distinction in device code, and perhaps it can even discern shared pointers from global ones. Alternatively, and perhaps even better, maybe there is a compile-time mechanism for this: after all, device functions are only ever compiled into kernels rather than standing on their own, so nvcc can often know whether a given call site uses shared memory or not.
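For reference, this is roughly how I use that host-side facility today. A minimal sketch, assuming CUDA 10 or later (where the attribute field is named type; older toolkits call it memoryType, and the behaviour on plain unregistered host pointers also differs between versions):

    #include <cuda_runtime.h>

    // Returns true if p refers to device memory, according to the runtime.
    bool is_device_pointer(const void* p)
    {
        cudaPointerAttributes attrs;
        cudaError_t err = cudaPointerGetAttributes(&attrs, p);
        if (err != cudaSuccess) {
            // On some toolkit versions this call fails for ordinary host
            // pointers; treat that as "not device memory".
            cudaGetLastError();          // clear the sticky error
            return false;
        }
        return attrs.type == cudaMemoryTypeDevice;
    }

Naturally this runs on the host only and says nothing about shared versus global, which is exactly the distinction I'm after in device code.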