I am on Windows 7.
Below I describe, step by step, everything I do: building the .dll file (PASS), loading it in R with dyn.load (PASS), and invoking it with .Call (FAIL).
When I invoke .Call I get:
> out<- .Call("rowAND", as.integer(t(m)), nrow(m), ncol(m))
**Error in .Call("rowAND", as.integer(t(m)), nrow(m), ncol(m)) :
C symbol name "rowAND" not in load table**
1) The source code is below:
#include <stdio.h>
#include <math.h>
#include <cuda_runtime.h>
#include <cuda.h>
#include <device_launch_parameters.h>
#include <R.h>
#include <Rdefines.h>
#include "cuPrintf.cuh"
#include "cuPrintf.cu"
#include "cuRow.h"
#include "cuError.h"
extern "C" {

SEXP rowAND(SEXP x, SEXP r_nrow, SEXP r_ncol) {
    // input:
    //   x      = as.integer(t(m)), vector of integer values from R (t(m) because R stores matrices by column)
    //   r_nrow = nrow(m), scalar
    //   r_ncol = ncol(m), scalar
    //x = coerceVector(x, INTSXP); // force coercion to an integer vector

    // define dimensions
    int nrow = asInteger(r_nrow);
    int ncol = asInteger(r_ncol);
    size_t m_size;
    size_t calc_size;
    m_size = nrow * ncol * sizeof(int);  // m (input)
    calc_size = nrow * sizeof(int);      // change to nrow/ncol depending on calculation (output)

    // R result object
    SEXP r;
    PROTECT(r = allocMatrix(INTSXP, nrow, 1));

    // cuda error variable
    cudaError_t err;

    // HOST pointers (the memory is owned by R)
    int *h_m = INTEGER(x);
    int *h_calc = INTEGER(r);

    // allocate DEVICE
    int *d_m = NULL, *d_calc = NULL;
    err = cudaMalloc((void **)&d_m, m_size); checkError(err);
    err = cudaMalloc((void **)&d_calc, calc_size); checkError(err);

    // copy host matrix to device
    err = cudaMemcpy(d_m, h_m, m_size, cudaMemcpyHostToDevice); checkError(err);

    // Initialize cuPrintf -- DEBUGGING
    cudaPrintfInit();

    dim3 numBlocks(nrow, 1, 1);       // one block per row
    dim3 threadsPerBlock(1, 1, 1);    // 1 thread per block
    rowOR<<<numBlocks, threadsPerBlock, 0, 0>>>(d_m, d_calc, ncol); // main call

    // Terminate cuPrintf -- DEBUGGING
    cudaPrintfDisplay(stdout, true);
    cudaPrintfEnd();
    err = cudaGetLastError(); checkError(err);

    // Copy the result vector from device memory to the host result vector
    err = cudaMemcpy(h_calc, d_calc, calc_size, cudaMemcpyDeviceToHost); checkError(err);

    // Free device global memory
    err = cudaFree(d_m); checkError(err);
    err = cudaFree(d_calc); checkError(err);

    // Reset the device
    err = cudaDeviceReset();

    UNPROTECT(1);
    return r;
}

} // extern "C"
2) I compile the .cu file with nvcc, which generates the object file (.obj). Then I link the libraries (PASS, no problem here), which generates the .dll file.
3) When I load the .dll with the R command dyn.load, it passes (PASS).
The loaded .dll appears in getLoadedDLLs():
> getLoadedDLLs()
          Filename                                                                          Dynamic.Lookup
base      base                                                                              FALSE
methods   C:/Revolution/R-Community-6.2/R-2.15.3/library/methods/libs/i386/methods.dll      FALSE
Revobase  C:/Revolution/R-Community-6.2/R-2.15.3/library/Revobase/libs/i386/Revobase.dll    TRUE
tools     C:/Revolution/R-Community-6.2/R-2.15.3/library/tools/libs/i386/tools.dll          FALSE
grDevices C:/Revolution/R-Community-6.2/R-2.15.3/library/grDevices/libs/i386/grDevices.dll  FALSE
stats     C:/Revolution/R-Community-6.2/R-2.15.3/library/stats/libs/i386/stats.dll          FALSE
cuRow     C:/Users/msn/Documents/Visual Studio 2010/Projects/R_C/R_C/Debug/cuRow.dll        TRUE
4) HERE COMES THE PROBLEM: When I check if the function rowAND is loaded I get FALSE:
> is.loaded("rowAND")
[1] FALSE
>
Thus, obviously, it fails when I run .Call (because it is not loaded):
> path.dll<-'C:/Users/msn/Documents/Visual Studio 2010/Projects/R_C/R_C/Debug'
> dyn.load(file.path(path.dll,paste0("cuRow", .Platform$dynlib.ext)))
> nrow<-10
> ncol<-3
> m<-matrix(sample(c(0,1),nrow*ncol,replace=TRUE),nrow,ncol)
> out<- .Call("rowAND", as.integer(t(m)), nrow(m), ncol(m))
Error in .Call("rowAND", as.integer(t(m)), nrow(m), ncol(m)) :
C symbol name "rowAND" not in load table
I see that the function appears to be correctly defined in the source code, but it can't be "seen" in the loaded library.
What am I missing here? Thanks in advance!
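One thing that may be worth checking, stated as an assumption rather than a confirmed diagnosis: when a dll is linked with the Visual Studio toolchain, a function is only found by name lookups (which is what is.loaded and .Call rely on) if it is exported, for example with __declspec(dllexport) or a .def file. extern "C" on its own only prevents C++ name mangling. The sketch below shows the pattern on a deliberately trivial function; CUROW_API and cuRowBytes are hypothetical names used only for illustration and are not part of R, CUDA, or the code above.

```cpp
#include <cassert>
#include <cstddef>

// CUROW_API is a hypothetical macro name (an assumption, not part of R or CUDA).
#if defined(_WIN32)
#  define CUROW_API __declspec(dllexport)  // place the symbol in the dll export table
#else
#  define CUROW_API                        // no-op on other platforms
#endif

// extern "C" keeps the symbol name unmangled; CUROW_API exports it on Windows.
// cuRowBytes is a hypothetical, deliberately trivial function: it returns the
// number of bytes needed for an nrow x ncol matrix of int.
extern "C" CUROW_API std::size_t cuRowBytes(int nrow, int ncol) {
    assert(nrow >= 0 && ncol >= 0);
    return static_cast<std::size_t>(nrow) * static_cast<std::size_t>(ncol) * sizeof(int);
}
```

If a missing export is the issue, running dumpbin /exports cuRow.dll from a Visual Studio command prompt should show whether rowAND is listed in the export table at all.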
EDIT:
Based on @Dirk's partial answer, I will try to write a CUDA dll project that is called from C. That way I can compile the target C source with the standard R CMD SHLIB.
That is: a C dll, loaded into R, which calls the CUDA dll internally.
I will update when done!
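The layering described in this edit could look roughly like the sketch below. It is only an illustration under assumptions: cuRowAndLaunch is a hypothetical name for the plain-C entry point the CUDA dll would export, and here it is a host-only stand-in (a CPU loop instead of a kernel launch) so the sketch runs without a GPU. forwardToCuda is likewise hypothetical and stands for what the R-facing C wrapper would do after unpacking its SEXP arguments with INTEGER() and asInteger(). The idea is that only raw pointers and sizes cross the dll boundary, so the CUDA side never needs the R headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain-C entry point of the CUDA dll (the name is an assumption).
// Host-only stand-in: a real version would allocate device memory, copy the
// input, launch a kernel and copy the result back.
// Returns 0 on success and 1 on invalid arguments.
extern "C" int cuRowAndLaunch(const int *m, int nrow, int ncol, int *out) {
    if (m == nullptr || out == nullptr || nrow < 0 || ncol < 0) {
        return 1;
    }
    for (int i = 0; i < nrow; ++i) {
        int all_set = 1;
        for (int j = 0; j < ncol; ++j) {
            // m is stored row by row, matching as.integer(t(m)) on the R side
            if (m[static_cast<std::size_t>(i) * ncol + j] == 0) {
                all_set = 0;
                break;
            }
        }
        out[i] = all_set;
    }
    return 0;
}

// Hypothetical helper standing in for the body of the R-facing C wrapper:
// it forwards raw pointers and sizes and passes the status code back.
int forwardToCuda(const int *m, int nrow, int ncol, int *out) {
    int status = cuRowAndLaunch(m, nrow, ncol, out);
    assert(status == 0 || status == 1);  // the only two codes used in this sketch
    return status;
}
```

With this split, the wrapper can be built with R CMD SHLIB and linked against the CUDA dll's import library, while nvcc and Visual Studio only ever see the plain-C interface.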
EDIT 2:
I answered my own question below. I finally got a CUDA implementation working in R (Windows platform).