2D arrays in CUDA

I've read a lot about working with 2D arrays in CUDA, and I think it is necessary to flatten the array before sending it to the GPU. How can I allocate a 1D array on the GPU and access it as a 2D array in the kernel? My code looks like this:

__global__ void kernel( int **d_a )
{ 

   cuPrintf("%p",local_array[0][0]);
}

int main(){
    int **A;
    int i;

    cudaPrintfInit();
    cudaMalloc((void**)&A,16*sizeof(int));
    kernel<<<1,1>>>(A);
    cudaPrintfDisplay(stdout,true);
    cudaPrintfEnd();
}

      



2 answers


This is how I fixed the problem: I call cudaMalloc in the usual way, but when passing the pointer to the kernel, I cast it to int (*)[col], and it works for me.
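A minimal sketch of the cast described above, assuming the column count is a compile-time constant (a pointer-to-array type needs a constant bound in C++). The names ROWS, COLS, fill and cast_demo are hypothetical and chosen only for illustration; they do not come from the question.

```cuda
#include <cuda_runtime.h>

constexpr int ROWS = 4;  // assumed dimensions, for illustration only
constexpr int COLS = 4;

// The parameter is a pointer to rows of COLS ints, so a[r][c] resolves
// to the same address as flat[r * COLS + c].
__global__ void fill(int (*a)[COLS])
{
    int r = threadIdx.y;
    int c = threadIdx.x;
    a[r][c] = r * COLS + c;
}

// Hypothetical helper: runs the kernel and returns element (r, c),
// or -1 if a CUDA call fails.
int cast_demo(int r, int c)
{
    int *d_flat = nullptr;
    if (cudaMalloc((void **)&d_flat, ROWS * COLS * sizeof(int)) != cudaSuccess)
        return -1;

    // Reinterpret the flat buffer as ROWS rows of COLS ints.
    fill<<<1, dim3(COLS, ROWS)>>>((int (*)[COLS])d_flat);

    int host[ROWS][COLS];
    cudaError_t err = cudaMemcpy(host, d_flat, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(d_flat);
    return err == cudaSuccess ? host[r][c] : -1;
}
```

Calling cast_demo(2, 3) from main should hand back the value the kernel stored at row 2, column 3. The allocation is still a single flat block; only the pointer type changes, which is why no extra copy or pointer table is needed.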





There is really no need to "flatten" your 2D array before using it on the GPU (although flattening can speed up memory access). If you need a 2D array, you can use something like cudaMallocPitch, described in the CUDA C Programming Guide. I believe your code is not working because you are only allocating a 1D array, so A[0][0] does not exist. If you look at your code, you made a 1D array of ints, not int*s. If you want to allocate a flattened 2D array, you can do something like:

int *A;
cudaMalloc((void**)&A, rows*cols*sizeof(int)); // rows and cols are the dimensions you want

And then in your kernel use (to print a pointer to any element):

__global__ void kernel( int *d_a, int row, int col, int stride )
{
  // stride is the number of ints per row (equal to cols for a tightly packed array)
  printf("%p\n", (void*)&d_a[ col + row*stride ]);
}
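For the cudaMallocPitch route mentioned in this answer, a rough sketch might look like the following. The names fill_pitched and pitched_demo are hypothetical, and the 16x16 block size is an arbitrary assumption. The key point is that the pitch returned by cudaMallocPitch is in bytes, so each row is reached through a char pointer before indexing by column.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Each thread stores row * width + col into its element of a pitched allocation.
__global__ void fill_pitched(int *base, size_t pitch, int width, int height)
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    if (col < width && row < height) {
        // pitch is a byte count, so step to the row with char arithmetic.
        int *row_ptr = (int *)((char *)base + row * pitch);
        row_ptr[col] = row * width + col;
    }
}

// Hypothetical helper: fills a width x height array on the GPU and returns
// the value read back from (row, col), or -1 if a CUDA call fails.
int pitched_demo(int width, int height, int row, int col)
{
    int *d_a = nullptr;
    size_t pitch = 0;
    if (cudaMallocPitch((void **)&d_a, &pitch, width * sizeof(int), height) != cudaSuccess)
        return -1;

    dim3 block(16, 16);  // assumed block shape
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fill_pitched<<<grid, block>>>(d_a, pitch, width, height);

    // Copy back into a tightly packed host buffer; cudaMemcpy2D skips the row padding.
    std::vector<int> host(width * height);
    cudaError_t err = cudaMemcpy2D(host.data(), width * sizeof(int),
                                   d_a, pitch,
                                   width * sizeof(int), height,
                                   cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    return err == cudaSuccess ? host[row * width + col] : -1;
}
```

Compared with the flat cudaMalloc version, the runtime may pad each row so that rows start at aligned addresses, at the cost of carrying the pitch around and using the 2D copy functions.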

      
