c++11 - Speedup comparison between BLAS and OpenBLAS in Armadillo

Translated from: https://stackoverflow.com/questions/39814557 2016-10-02T06:46:01.287

Viewed 47 times

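Does anyone know how much speedup is achieved by using OpenBLAS instead of the BLAS library in Armadillo?

Here are my results. I am trying to multiply a vector by a matrix in Armadillo: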

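    #include <iostream>
    #include <armadillo>

    using namespace std;
    using namespace arma;

    int main(int argc, char** argv) {
        // uword avoids a signed/unsigned comparison in the loop below.
        const uword num_examples = 100000;

        // Build the indices 0..num_examples-1 and shuffle them.
        vec randperm(num_examples, fill::zeros);
        for (uword i = 0; i < num_examples; i++)
            randperm(i) = i;

        vec randnum = shuffle(randperm);
        randperm.reset();

        // Sparse data matrix (density 0.8) and dense weight matrix.
        sp_mat X = sprandu<sp_mat>(num_examples, 127, 0.8);
        mat W(3, 127, fill::randn);
        wall_clock timer;
        double t;

        // Time a single product of one sparse row with the transposed weights.
        timer.tic();
        uword pred_class = (X.row(static_cast<uword>(randnum(10))) * W.t()).index_max();
        t = timer.toc();  // elapsed time in seconds
        cout << "Elapsed time is:" << t << endl;
    }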

Here is the result using -lopenblas: Elapsed time is:0.0374926

Using only -larmadillo, the result is: Elapsed time is:0.084193

My machine: Ubuntu 14.04 running on 4 physical cores (8 logical cores with hyper-threading enabled)
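
Both timings above come from a single run of one operation, so they may include one-off costs and ordinary run-to-run noise. A minimal sketch of a more robust measurement, assuming it is acceptable to repeat the operation, is to time it several times and report the median. The helper `median_seconds` below is hypothetical (it is not part of Armadillo or OpenBLAS) and uses only the C++11 standard library:

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper, not an Armadillo API: calls `work` `repeats` times
// and returns the median wall-clock time of one call, in seconds.
template <typename F>
double median_seconds(F&& work, std::size_t repeats) {
    std::vector<double> samples;
    samples.reserve(repeats);

    for (std::size_t r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        // duration<double> converts the tick count to seconds.
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }

    if (samples.empty())
        return 0.0;  // nothing was measured

    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];  // upper median for even counts
}
```

The timed multiplication from the program above could be wrapped in a lambda and passed as `work`, with the same binary then built once per library so that the two medians can be compared.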
