cuda - How do I convert an upper/lower triangular gpuarray into the format required by cublasStbsv?

Question

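I am using pycuda and scikits.cuda to solve the linear system A*x = b, where A is an upper or lower triangular matrix. The cublasStbsv routine, however, expects the matrix in a specific banded storage format. For example, given the lower triangular matrix A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], the input cublasStbsv needs is [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where the rows are the main diagonal, the first subdiagonal, and the second subdiagonal. With numpy this is easy to do using stride_tricks.as_strided, but I don't know how to do something similar with pycuda.gpuarray. I did find pycuda.compyte.array.as_strided, but it cannot be applied to a gpuarray. Any help would be appreciated, thanks.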
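
For reference, the layout described above can be reproduced on the CPU with plain numpy. This is only a sketch: the function names lower_to_band and lower_to_band_strided are made up for illustration, and the strided variant assumes a C-ordered input that is zero-padded so the view never reads outside the buffer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def lower_to_band(a):
    """Row d of the result holds the d-th subdiagonal of a, zero-padded."""
    n = a.shape[0]
    band = np.zeros((n, n), dtype=a.dtype)
    for d in range(n):
        # np.diagonal(a, -d) yields a[d, 0], a[d + 1, 1], ... (n - d values)
        band[d, :n - d] = np.diagonal(a, -d)
    return band


def lower_to_band_strided(a):
    """Same layout via as_strided (hypothetical helper, C-ordered input)."""
    n = a.shape[0]
    # Pad with n - 1 zero rows so every band[d, j] = padded[j + d, j]
    # stays inside the allocated buffer.
    padded = np.zeros((2 * n - 1, n), dtype=a.dtype)
    padded[:n] = a
    row_stride, col_stride = padded.strides
    view = as_strided(padded, shape=(n, n),
                      strides=(row_stride, row_stride + col_stride))
    return view.copy()


A = np.array([[1, 0, 0], [2, 3, 0], [4, 5, 6]], dtype=np.float32)
print(lower_to_band(A))
print(lower_to_band_strided(A))
```

Both functions should return the matrix given in the example, with the diagonal in row 0 and each further subdiagonal one row below.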

score 1 · Accepted Answer

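I first got this working with theano: convert the gpuarray to a CudaNdarray, change the strides, and copy the result back into a gpuarray. Watch out for the switch between Fortran and C order. Update: I finally managed it with gpuarray.multi_take_put alone: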

import numpy as np
import pycuda.gpuarray as gpuarray

def make_triangle(s_matrix, uplo='L'):
    """Convert a triangular matrix to the banded format required by
    cublasStbsv. The matrix should be in Fortran (column-major) order.
    s_matrix: gpuarray
    """
    # make sure the dtype is float32
    if s_matrix.dtype != 'f':
        s_matrix = s_matrix.astype('f')
    dim = s_matrix.shape[0]
    if uplo == 'L':
        idx_tuple = np.tril_indices(dim)
        # source: column-major flat index of element (i, j)
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        # destination: band row i - j, column j
        gdst = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
    else:
        idx_tuple = np.triu_indices(dim)
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        # destination: band row (dim - 1) + i - j, column j
        gdst = gpuarray.to_gpu(idx_tuple[0] + (idx_tuple[1] + 1) * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
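
The index arithmetic can be checked without a GPU. The sketch below emulates the scatter with numpy fancy indexing in place of gpuarray.multi_take_put; band_indices and to_band_cpu are illustrative names, not part of pycuda, and the output is zero-filled here for readability.

```python
import numpy as np


def band_indices(dim, uplo="L"):
    """Return (src, dst) flat column-major indices for the scatter."""
    if uplo == "L":
        rows, cols = np.tril_indices(dim)
        # band row i - j, column j  ->  (i - j) + j * dim
        dst = rows + cols * (dim - 1)
    else:
        rows, cols = np.triu_indices(dim)
        # band row (dim - 1) + i - j, column j
        dst = rows + (cols + 1) * (dim - 1)
    src = rows + cols * dim
    return src, dst


def to_band_cpu(a, uplo="L"):
    """CPU stand-in for the GPU routine: out[dst] = in[src]."""
    dim = a.shape[0]
    src, dst = band_indices(dim, uplo)
    flat_in = a.ravel(order="F")                 # column-major flattening
    flat_out = np.zeros(dim * dim, dtype=a.dtype)
    flat_out[dst] = flat_in[src]
    return flat_out.reshape((dim, dim), order="F")


L = np.array([[1, 0, 0], [2, 3, 0], [4, 5, 6]], dtype=np.float32)
print(to_band_cpu(L, "L"))     # diagonal in the first row
print(to_band_cpu(L.T, "U"))   # diagonal in the last row
```

For the upper case the diagonal ends up in the last row, which is the usual BLAS band convention. One thing to verify on the GPU side: when no output array is passed, multi_take_put may allocate the result without zeroing it, so entries outside the band could hold arbitrary values there.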
