cuda - thrust::min_element does not work on a float4 device_vector, but works on a host_vector

Question

I am trying to find the minimum value in an array using Thrust and CUDA.
The following device example returns 0:

thrust::device_vector<float4>::iterator it =  thrust::min_element(IntsOnDev.begin(),IntsOnDev.end(),equalOperator());       
int pos = it - IntsOnDev.begin();

However, this host version works perfectly:

thrust::host_vector<float4>arr = IntsOnDev;
thrust::host_vector<float4>::iterator it2 =  thrust::min_element(arr.begin(),arr.end(),equalOperator());
int pos2 = it2 - arr.begin();

The comparator type:

struct equalOperator
{
  __host__ __device__
    bool operator()(const float4 x,const float4 y) const
    {
        return ( x.w < y.w );
    }
};

I would just like to add that thrust::sort works with the same predicate.

score 5 · Accepted Answer

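Unfortunately, nvcc and some host compilers (some 64-bit versions of MSVC, if I recall correctly) disagree about the size of certain aligned types. float4 is one of them. This often results in undefined behavior.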

The workaround is to use a type without alignment, such as my_float4:

struct my_float4
{
  float x, y, z, w;
};
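To make the difference concrete, here is a minimal host-only C++ sketch of the workaround. It is an illustration under stated assumptions, not the code from the question: aligned_float4 is a hypothetical stand-in that mimics the 16-byte alignment CUDA declares for float4, less_by_w and min_w_index are made-up names, and std::min_element stands in for thrust::min_element so the snippet compiles without the CUDA toolkit.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Plain struct with no alignment specifier, as suggested in the answer.
struct my_float4
{
  float x, y, z, w;
};

// Hypothetical stand-in for CUDA's float4, which carries 16-byte alignment.
struct alignas(16) aligned_float4
{
  float x, y, z, w;
};

// Without an alignment specifier the struct only needs the alignment of float,
// so there is no alignment attribute for two compilers to treat differently.
static_assert(alignof(my_float4) == alignof(float), "no extra alignment expected");
static_assert(alignof(aligned_float4) == 16, "explicit 16-byte alignment");

// Same ordering as equalOperator in the question: compare on w only.
// (In CUDA code the operator would also be marked __host__ __device__.)
struct less_by_w
{
  bool operator()(const my_float4 &a, const my_float4 &b) const
  {
    return a.w < b.w;
  }
};

// Returns the index of the element with the smallest w (assumes v is non-empty).
std::size_t min_w_index(const std::vector<my_float4> &v)
{
  return static_cast<std::size_t>(
      std::min_element(v.begin(), v.end(), less_by_w()) - v.begin());
}
```

In the original CUDA code the same idea would presumably mean storing my_float4 in the device_vector and having the comparator take my_float4 arguments instead of float4.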
