2D arrays in CUDA
I've read a lot about working with 2D arrays in CUDA, and I think it is necessary to flatten the array before sending it to the GPU. How can I allocate a 1D array on the GPU and access it as a 2D array in the kernel? The code looks like this:
__global__ void kernel( int **d_a )
{
cuPrintf("%p",local_array[0][0]);
}
int main(){
int **A;
int i;
cudaPrintfInit();
cudaMalloc((void**)&A,16*sizeof(int));
kernel<<<1,1>>>(A);
cudaPrintfDisplay(stdout,true);
cudaPrintfEnd();
}
There is really no need to "flatten" your 2D array before using it on the GPU (although this can speed up memory access). If you need a 2D array, you can use something like cudaMallocPitch, as described in the CUDA C Programming Guide. I believe your code is not working because you are only malloc'ing a 1D array, so A[0][0] does not exist. If you look at your code, you allocated a 1D array of ints, not int*s. If you want to malloc a flattened 2D array, you can do something like:
int *A;
cudaMalloc((void**)&A, length*length*sizeof(int)); // length is the number of rows/cols you want
And then in your kernel use (to print a pointer to any element):
__global__ void kernel( int *d_a, int row, int col, int stride )
{
    // address of element (row, col); stride is the number of elements per row
    printf("%p\n", (void*)&d_a[ col + row*stride ]);
}
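For completeness, here is a minimal sketch of the flattened approach end to end, assuming the 4 x 4 layout (16 ints) from the question. The helper flat_index and the kernel name fill are hypothetical names made up for this illustration, not part of the CUDA API; the row-major formula is the same col + row*stride indexing used above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: row-major offset of element (row, col) in a flat
// array whose rows are `stride` elements apart.
__host__ __device__ inline int flat_index(int row, int col, int stride)
{
    return row * stride + col;
}

// Each thread writes its own (row, col) cell of the flat array.
__global__ void fill(int *d_a, int rows, int cols)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows && col < cols)
        d_a[flat_index(row, col, cols)] = row * 10 + col;
}

int main()
{
    const int rows = 4, cols = 4;   // assumed size: the 16 ints from the question
    int h_a[rows * cols];
    int *d_a = NULL;

    // One contiguous block of rows*cols ints; no int** involved.
    if (cudaMalloc((void **)&d_a, rows * cols * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    dim3 block(cols, rows);         // one thread per element
    fill<<<1, block>>>(d_a, rows, cols);

    // cudaMemcpy waits for the kernel to finish before copying.
    cudaError_t err = cudaMemcpy(h_a, d_a, sizeof(h_a), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_a);
        return 1;
    }

    // Read the flat host copy back with the same 2D indexing.
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c)
            printf("%3d", h_a[flat_index(r, c, cols)]);
        printf("\n");
    }

    cudaFree(d_a);
    return 0;
}
```

The point of the sketch is that the kernel takes a plain int* and the "2D-ness" lives entirely in the index arithmetic, so the device never has to dereference a table of row pointers.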
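If you would rather go the cudaMallocPitch route mentioned in the answer, the following is a rough sketch under the same assumed 4 x 4 size. The helper row_ptr and the kernel name fill_pitched are hypothetical names invented here; cudaMallocPitch and cudaMemcpy2D are the runtime calls. The one thing to watch is that the pitch is returned in bytes, so the row offset has to be computed on a char* before casting back to int*.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: start of row `row` in a pitched allocation.
// `pitch` is the row width in BYTES, as returned by cudaMallocPitch.
__host__ __device__ inline int *row_ptr(int *base, size_t pitch, int row)
{
    return (int *)((char *)base + row * pitch);
}

// Each thread writes its own (row, col) cell of the pitched array.
__global__ void fill_pitched(int *d_a, size_t pitch, int rows, int cols)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows && col < cols)
        row_ptr(d_a, pitch, row)[col] = row * 10 + col;
}

int main()
{
    const int rows = 4, cols = 4;   // assumed size, matching the question
    int h_a[rows][cols];
    int *d_a = NULL;
    size_t pitch = 0;

    // Width is given in bytes, height in rows; the runtime may pad each row.
    if (cudaMallocPitch((void **)&d_a, &pitch, cols * sizeof(int), rows) != cudaSuccess) {
        fprintf(stderr, "cudaMallocPitch failed\n");
        return 1;
    }

    fill_pitched<<<1, dim3(cols, rows)>>>(d_a, pitch, rows, cols);

    // Copy row by row, dropping the device-side padding: the host rows are
    // tightly packed (cols * sizeof(int) bytes), the device rows use `pitch`.
    cudaError_t err = cudaMemcpy2D(h_a, cols * sizeof(int), d_a, pitch,
                                   cols * sizeof(int), rows,
                                   cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_a);
        return 1;
    }

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c)
            printf("%3d", h_a[r][c]);
        printf("\n");
    }

    cudaFree(d_a);
    return 0;
}
```

Whether the padding actually helps depends on the row width and the access pattern, so treat this as an option to measure rather than a guaranteed win over the flat layout.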