gpgpu - ArrayFire evaluation of an equation runs very slowly

Question

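I have been working on a project that uses ArrayFire to simulate biologically inspired neural networks. I got to the point of running some timing tests and was disappointed by the results. I decided to use one of the fastest, simplest models, the Izhikevich model, as a timing test case. When I ran the new test with that model, the results were even worse. The code I am using is below. It does nothing fancy; it is just standard matrix algebra. Yet a single evaluation of the equations for only 10 neurons takes more than 5 seconds! Every step after that takes roughly the same amount of time.

Code: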

unsigned int neuron_count = 10;

array a = af::constant(0.02, neuron_count);
array b = af::constant(0.2, neuron_count);
array c = af::constant(-65.0, neuron_count);
array d = af::constant(6, neuron_count);

array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);

double tau = 0.2;

void StepIzhikevich()
{
    v = v + tau*(0.04*pow(v, 2) + 5 * v + 140 - u + i);
    //af_print(v);
    u = u + tau*a*(b*v - u);
    //Leaving off spike threshold checks for now
}

void TestIzhikevich()
{
    StepIzhikevich();

    timer::start();

    StepIzhikevich();

    printf("elapsed seconds: %g\n", timer::stop());
}

Here are the timing results for different numbers of neurons.

Results:

neurons   seconds
10        5.18275
100       5.27969
1000      5.20637
10000     4.86609

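Increasing the number of neurons does not seem to make much difference; the time even drops slightly. Am I doing something wrong here? Is there a better way to optimize ArrayFire code to get better results?

When I switched the v equation to use v*v instead of pow(v, 2), the time for one step dropped to 3.75762 seconds. That is still extremely slow, so something odd is going on.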
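For comparison, a plain CPU baseline can help show how far off these timings are and gives known values to check against af_print output. The following is only a minimal sketch in standard C++, not ArrayFire code; the names IzhikevichState and step_izhikevich_cpu are hypothetical, and it assumes the same update order as StepIzhikevich() above (v first, then u from the updated v).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference state; not part of ArrayFire.
struct IzhikevichState {
    std::vector<double> v;  // membrane potential, one entry per neuron
    std::vector<double> u;  // recovery variable, one entry per neuron
};

// Advances every neuron by one step of size tau.
// Mirrors StepIzhikevich(): v is updated first, then u uses the new v.
// Spike threshold checks are left out, as in the question.
inline void step_izhikevich_cpu(IzhikevichState& s, double a, double b,
                                double input, double tau) {
    assert(s.v.size() == s.u.size());
    for (std::size_t n = 0; n < s.v.size(); ++n) {
        double v = s.v[n];
        double u = s.u[n];
        v = v + tau * (0.04 * v * v + 5.0 * v + 140.0 - u + input);
        u = u + tau * a * (b * v - u);
        s.v[n] = v;
        s.u[n] = u;
    }
}
```

With the initial values from the question (v = -70, u = -20, i = 14, a = 0.02, b = 0.2, tau = 0.2), one step gives v = -66 and u of about -19.9728 for every neuron.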

[Edit] I tried splitting the computation into parts and found something new. Here is the code I am using now.

Code:

unsigned int neuron_count = 10;

array a = af::constant(0.02, neuron_count);
array b = af::constant(0.2, neuron_count);
array c = af::constant(-65.0, neuron_count);
array d = af::constant(6, neuron_count);

array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);

array g = af::constant(0.0, neuron_count);

double tau = 0.2;

void StepIzhikevich()
{
    array j = tau*(0.04*pow(v, 2));
    //af_print(j);

    array k = 5 * v + 140 - u + i;
    //af_print(k);

    array l = v + j + k;
    //af_print(l);

    v = l;  //If this line is here time is long on second loop
    //g = l; //If this is here then time is short.

    //u = u + tau*a*(b*v - u);
    //Leaving off spike threshold checks for now
}

void TestIzhikevich()
{
    timer::start();

    StepIzhikevich();

    printf("elapsed seconds: %g\n", timer::stop());

    timer::start();

    StepIzhikevich();

    printf("elapsed seconds: %g\n", timer::stop());
}

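When I run it without assigning the result back to v, or with the result assigned to the new variable g, the step time is short on both the first and second runs.

Results:

elapsed seconds: 0.0036143

elapsed seconds: 0.00340621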

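However, when I put v = l; back in, the first step is fast, but every step from then on is slow.

Results:

elapsed seconds: 0.0034497

elapsed seconds: 2.98624

Any ideas about what is causing this?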

[Edit 2]

I still do not know why it does this, but I found a workaround: copying the v array before using it again.

Code:

unsigned int neuron_count = 100000;

array v = af::constant(-70.0, neuron_count);
array u = af::constant(-20.0, neuron_count);
array i = af::constant(14, neuron_count);

double tau = 0.2;

void StepIzhikevich()
{
    //array vp = v;
    array vp = v.copy();
    //af_print(vp);

    array j = tau*(0.04*pow(vp, 2));
    //af_print(j);

    array k = 5 * vp + 140 - u + i;
    //af_print(k);

    array l = vp + j + k;
    //af_print(l);

    v = l;  //If this line is here time is long on second loop
}

void TestIzhikevich()
{
    for (int i = 0; i < 10; i++)
    {
        timer::start();

        StepIzhikevich();

        printf("loop: %d  ", i);
        printf("elapsed seconds: %g\n", timer::stop());

        timer::start();
    }
}

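Here are the results now. The second iteration is still somewhat slow, but every iteration after that is fast. This is a big improvement over before.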

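Results:

loop: 0  elapsed seconds: 0.657355
loop: 1  elapsed seconds: 0.981287
loop: 2  elapsed seconds: 0.000416182
loop: 3  elapsed seconds: 0.000415045
loop: 4  elapsed seconds: 0.000421014
loop: 5  elapsed seconds: 0.000413339
loop: 6  elapsed seconds: 0.00041675
loop: 7  elapsed seconds: 0.000412202
loop: 8  elapsed seconds: 0.000473321
loop: 9  elapsed seconds: 0.000677432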
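Because the first iterations are so much slower than the rest, it may be worth reporting warm-up and steady-state timings separately. Below is a minimal sketch of such a harness in standard C++; the name median_step_seconds is hypothetical and nothing in it comes from ArrayFire. One assumption to flag: ArrayFire evaluates lazily and asynchronously, so for a meaningful measurement the callable passed in would probably need to force the work to finish before returning, for example by calling eval() on the updated array and then af::sync().

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `step` `warmup` times without timing it,
// then times `runs` further calls and returns the median in seconds
// (the upper of the two middle samples when `runs` is even).
template <typename Step>
double median_step_seconds(Step&& step, std::size_t warmup, std::size_t runs) {
    assert(runs > 0 && "need at least one timed run");

    // Warm-up calls absorb one-time costs such as initialization.
    for (std::size_t n = 0; n < warmup; ++n) {
        step();
    }

    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t n = 0; n < runs; ++n) {
        const auto t0 = std::chrono::steady_clock::now();
        step();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(t1 - t0).count());
    }

    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

A call such as median_step_seconds(StepIzhikevich, 2, 8) would skip the two slow iterations seen above and summarize the remaining eight, assuming StepIzhikevich is adjusted to wait for the device as described.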
