
I am trying to define an array of functions and pass each one to a map, which I defined as a class. If the device is the CPU, each function is applied to the vector on the CPU with TBB; otherwise it goes through the SYCL path and runs on the GPU. I also want to measure the performance of the CPU path with the TBB library. The problem is that when I compile the program I get the error shown after the code, and I do not know how to fix it. I should point out that before I introduced the array of functions, with a local copy of the function in the SYCL part, my code worked and produced the expected result; after defining the array of functions, it throws an error. My code:

#include <CL/sycl.hpp>
#include <iostream>
#include <tbb/tbb.h>
#include <tbb/parallel_for.h>
#include <vector>
#include <string>
#include <queue>
#include<tbb/blocked_range.h>
#include <tbb/global_control.h>
#include <chrono>


using namespace tbb;

template<class Tin, class Tout, class Function>
class Map {
private:
    Function fun;
public:
    Map() {}
    Map(Function f):fun(f) {}


    std::vector<Tout> operator()(bool use_tbb, std::vector<Tin>& v) {
        std::vector<Tout> r(v.size());
        if(use_tbb){
            // Start measuring time
            auto begin = std::chrono::high_resolution_clock::now();
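            // The range iterates over indices, so it uses std::size_t rather than the element type Tin
            tbb::parallel_for(tbb::blocked_range<std::size_t>(0, v.size()),
                        [&](const tbb::blocked_range<std::size_t>& t) {
                    for (std::size_t index = t.begin(); index < t.end(); ++index){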
                        r[index] = fun(v[index]);
                    }
            });
            // Stop measuring time and calculate the elapsed time
            auto end = std::chrono::high_resolution_clock::now();
            auto elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin);
            printf("Time measured: %.3f seconds.\n", elapsed.count() * 1e-9);
            return r;
         } else {
                sycl::queue gpuQueue{sycl::gpu_selector()};
                sycl::range<1> n_item{v.size()};
                sycl::buffer<Tin, 1> in_buffer(&v[0], n_item);
                sycl::buffer<Tout, 1> out_buffer(&r[0], n_item);
                gpuQueue.submit([&](sycl::handler& h){
                    //local copy of fun
                    auto f = fun;
                    sycl::accessor in_accessor(in_buffer, h, sycl::read_only);
                    sycl::accessor out_accessor(out_buffer, h, sycl::write_only);
                    h.parallel_for(n_item, [=](sycl::id<1> index) {
                        out_accessor[index] = f(in_accessor[index]);
                    });
                }).wait();
         }
                return r;
    }
};

template<class Tin, class Tout, class Function>
Map<Tin, Tout, Function> make_map(Function f) { return Map<Tin, Tout, Function>(f);}


typedef int(*func)(int x);
//define different functions
auto function = [](int x){ return x; };
auto functionTimesTwo = [](int x){ return (x*2); };
auto functionDivideByTwo = [](int x){ return (x/2); };
auto lambdaFunction = [](int x){return (++x);};

int main(int argc, char *argv[]) {

    std::vector<int> v = {1,2,3,4,5,6,7,8,9};

    //Array of functions
    func functions[] =
        {
            function,
            functionTimesTwo,
            functionDivideByTwo,
            lambdaFunction
        };
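    // sizeof(functions) is the size in bytes; divide by the element size to get the element count
    for(std::size_t i = 0; i < sizeof(functions) / sizeof(functions[0]); i++){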
        auto m1 = make_map<int, int>(functions[i]);
        std::vector<int> r = m1(true, v);
        //print the result
        for(auto &e:r) {
            std::cout << e << " ";
         }
    }
    

  return 0;
}

The error:

SYCL kernel cannot call through a function pointer

2 Answers


You shouldn't forget that you are targeting various devices, each executing its own version of your function.

First, using a function pointer makes no sense as it has only one value and it cannot represent both the CPU and GPU version of the same function.

Second, if the function you use to obtain your function pointers is not consteval, and the programming model does not require you to declare your function as a kernel (as CUDA does), then I don't see how the compiler could possibly determine which functions to compile for each of your devices.
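To make the point about function pointers concrete, here is a minimal host-only sketch (no SYCL involved). The names times_two, half and call_through are hypothetical and not from the question. It shows that each lambda starts out with its own distinct type, but once converted to int(*)(int) all of them share one type, so the callee is only known at run time.

```cpp
#include <type_traits>

using func = int (*)(int);

// Two captureless lambdas (hypothetical examples).
inline auto times_two = [](int x) { return x * 2; };
inline auto half = [](int x) { return x / 2; };

// Each lambda has its own unique closure type...
static_assert(!std::is_same_v<decltype(times_two), decltype(half)>);
// ...but after conversion to a function pointer they all share one type.
static_assert(std::is_same_v<decltype(+times_two), func>);
static_assert(std::is_same_v<decltype(+half), func>);

// From inside this function the compiler only sees "some int(int) function";
// which one is called is decided by the pointer value at run time.
inline int call_through(func f, int x) { return f(x); }
```

With the closure type kept as a template parameter instead, the call target is part of the type and is visible at compile time.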

Answered 2021-06-06T10:05:19.343

You can't. Quoting the SYCL 2020 specification:

SYCL device code, as defined by this specification, does not support virtual function calls, function pointers in general, exceptions, runtime type information or the full set of C++ libraries that may depend on these features or on features of a particular host compiler.
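One possible way around the restriction, sketched here under assumptions rather than taken from the question: keep the lambdas in a std::tuple instead of an array of function pointers, so every callable keeps its own type and each call is resolved at compile time. The sketch is host-only so that it compiles without a SYCL toolchain; apply_map, map_each and run_all are hypothetical names, and apply_map merely stands in for the question's Map. It needs C++17.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Host-only stand-in for the question's Map. The callable's concrete type F is a
// template parameter, so f(x) is a direct call, not a call through a pointer.
template <class Tin, class Tout, class F>
std::vector<Tout> apply_map(const std::vector<Tin>& v, F f) {
    std::vector<Tout> r(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        r[i] = f(v[i]);
    }
    return r;
}

// Applies every callable in the tuple to v and returns one result vector per callable.
template <class... Fs>
std::vector<std::vector<int>> map_each(const std::vector<int>& v,
                                       const std::tuple<Fs...>& fs) {
    std::vector<std::vector<int>> results;
    std::apply(
        [&](const auto&... f) {
            // Fold expression: one apply_map instantiation per callable type.
            (results.push_back(apply_map<int, int>(v, f)), ...);
        },
        fs);
    return results;
}

// The four functions from the question, stored as distinct lambda types.
inline std::vector<std::vector<int>> run_all(const std::vector<int>& v) {
    auto fs = std::make_tuple(
        [](int x) { return x; },
        [](int x) { return x * 2; },
        [](int x) { return x / 2; },
        [](int x) { return x + 1; });
    return map_each(v, fs);
}
```

In the SYCL branch of the question's Map, the same idea would mean instantiating Map with the lambda's own type (for example via make_map<int, int>(functionTimesTwo)) rather than with func, so the kernel calls a known function instead of going through a pointer.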

Answered 2021-05-10T19:56:38.167