parallel-processing - Downgrade to float2 when using seq with double2 arrays in ArrayFire

Question

I am using the following test code, which relies on the ArrayFire library.

void test_seq(const array& input, array& output, const int N)
{
    array test      = seq(0,N-1);                                         
    output          = input;
}

(for the moment `array test` has no role)

double2* test_CPU; test_CPU=(double2*)malloc(10*sizeof(double2));       
for (int k=0; k<10; k++) { test_CPU[k].x=2.; test_CPU[k].y=1.; }
array test_GPU(10, test_CPU);
array test_GPU_output = constant(0.,10, c64);
test_seq(test_GPU,test_GPU_output,10);
print(test_GPU_output);
try {
    double2 *CPU_test = test_GPU_output.host<double2>();
    printf("%f %f\n",CPU_test[0].x,CPU_test[0].y);
} catch (af::exception& e) {
    fprintf(stderr, "%s\n", e.what());
}

Everything compiles and runs correctly.

However, I then changed the above function to

void test_seq(const array& input, array& output, const int N)
{
    array test      = seq(0,N-1);                                         
    output          = input * test;
}

and I receive the following runtime error message:

src/gena/gtypes.cpp:112: error: requested cuDoubleComplex from array of type cuComplex

If, on the other hand, I change the line

double2 *CPU_test = test_GPU_output.host<double2>();

to

float2 *CPU_test = test_GPU_output.host<float2>();

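everything runs fine again. The downgrade to float2 seems to be related to the use of seq. The problem does not go away if I use something like seq(0,N-1,f64) (I do not even know whether ArrayFire allows this).

How can I keep processing in double2 and avoid the downgrade to float2?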

score 0 · Accepted Answer

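When a seq is converted to an array, it is stored in single precision (float).

Currently in ArrayFire, the rule for operations involving two arrays of different precision is to choose the lower precision. That is why input * test is converted from double precision to single precision (hence float2).

The solution for now is to add one line right after test is generated: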

test = test.as(f64);

This adds very little overhead, because the array is generated only when it is actually needed.
